... In all probability it would have been enough to ... do a field to field copy (perhaps between two tables), a pure database operation. I am pretty sure that there was a table with post ids and a set of user ids.
Good summary!
I prefer to dump the old data and QA it for a while before reloading it, but yes, that is how it is done.
Performing operations like these should be the basic skill of any professional database administrator. Especially in a commercial setting. I am sure there are a bunch of DBAs here on the forum...
Hah! It should be a basic skill. You should meet some of the programmers and DBAs I have interviewed.
If there is a full database dump from a time shortly before the plugin was decommissioned, it might take a few minutes to copy the dumpfile, select the post_id of every post with one or more "likes" together with the user_id of each "liking user", ordered by user_id, and output those columns as CSV or more-generic SQL.
This dumpfile would give you a graph of all of the like counts, likers and (indirectly) the likees.
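A rough sketch of that extract step, in Python with the standard sqlite3 and csv modules. The table name old_likes and its two columns are assumptions made for illustration; the real plugin's schema would have to be read from the dump itself.

```python
import csv
import io
import sqlite3


def export_likes_csv(conn, out):
    """Write one (user_id, post_id) row per like, ordered by liking user."""
    # old_likes is a hypothetical table name, not the real plugin's schema.
    rows = conn.execute(
        "SELECT user_id, post_id FROM old_likes ORDER BY user_id, post_id"
    )
    writer = csv.writer(out)
    writer.writerow(["user_id", "post_id"])
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Tiny in-memory stand-in for a restored copy of the dump.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_likes (post_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO old_likes VALUES (?, ?)", [(101, 7), (102, 7), (101, 3)]
)
buf = io.StringIO()
exported = export_likes_csv(conn, buf)
```

Working against a copy of the dump, never the live database, keeps this step read-only and repeatable.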
Presumably, you could rescue lost likes from any number of abandoned like-systems, limited only by the availability of old backups and the patience of the data spelunker.
Big dumps like these can be spot-checked in Excel, which has some excellent tools, but I prefer to write my own data-specific tests each time. In this case, for example, if any liked posts no longer exist (deleted, corrupted, moved, renamed, missing, whatever), it is better not to import references to nonexistent data. Similarly, some users may have been banned and had their user_id reassigned later, etc. A larger userbase demands greater care in this QA step, even if the entire import can be run inside a transaction and backed out if it doesn't work.
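One such data-specific test might look like the sketch below. The function name and the idea of passing in sets of live ids are my own assumptions, and a plain existence check like this will not catch a reassigned user_id; that would need a further comparison (for example, account creation date against like date) that is not shown here.

```python
def partition_likes(likes, live_post_ids, live_user_ids):
    """Split (user_id, post_id) pairs into importable and rejected lists.

    A like is importable only if both its post and its user still exist.
    """
    importable, rejected = [], []
    for user_id, post_id in likes:
        if post_id in live_post_ids and user_id in live_user_ids:
            importable.append((user_id, post_id))
        else:
            rejected.append((user_id, post_id))
    return importable, rejected


# Made-up sample data: post 999 was deleted, user 42 no longer exists.
old_likes = [(7, 101), (3, 999), (42, 102)]
importable, rejected = partition_likes(
    old_likes, live_post_ids={101, 102}, live_user_ids={3, 7}
)
```

Keeping the rejected rows around, rather than silently dropping them, makes it easy to review what was left out and why.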
After the data QAs OK, then someone has to study the new "like" system to see if there is a clean, one-to-one mapping of the old relationships onto the new relationships. "Liking" is such a simple concept that any code implementing it is usually readable by ordinary humans. Unless there is some disgusting hairball in the code that maps user_ids to liked_posts, finding precise relationship logic takes anywhere from ten seconds to an hour.
A test account using the new like system can be prodded with command line tools (one bogus like at a time) to map and unmap experimental likes until everyone on the team feels that the logic is sound. With judicious use of safe SQL, this can be done during working hours while the site is live, but many site operators prefer to wait until a scheduled downtime.
Once a valid import path exists, the old likes can be loaded from their most recent backup file during a maintenance window, while the application is offline, and inserted in a minute or two on a fast machine.
If the result is undesirable, the changes can be tossed or postponed. If something really odd has happened, the entire database can be discarded and reloaded from a recent backup in ten or twenty minutes (usually).
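The "toss the changes" part can be sketched with an all-or-nothing import. This uses Python's sqlite3 purely as a stand-in; the likes table and its primary key are assumed for illustration, and the forum's real database and schema will differ.

```python
import sqlite3


def import_likes(conn, likes):
    """Insert all likes atomically; True if committed, False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO likes (user_id, post_id) VALUES (?, ?)", likes
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (user, post) pair, no duplicate likes.
conn.execute(
    "CREATE TABLE likes (user_id INTEGER, post_id INTEGER, "
    "PRIMARY KEY (user_id, post_id))"
)
conn.commit()

ok = import_likes(conn, [(7, 101), (3, 101)])
# The second row of this batch duplicates an existing like, so the whole
# batch is rolled back, including the otherwise valid first row.
bad = import_likes(conn, [(9, 102), (7, 101)])
total = conn.execute("SELECT COUNT(*) FROM likes").fetchone()[0]
```

A failed batch leaves the table exactly as it was, which is what makes a retry after fixing the data cheap.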
Some users lost their likes, but more relevantly all the posts lost their likes. The only quality indicator of old posts is gone, and an awful lot of really valuable information is lost. ...
Yes, losing the old likes is really not an ideal outcome.
But I am willing to bet that the data is just hiding. I doubt that it's gone forever.
It would cost a million dollars to have someone do the assessment job to give you a competitive advantage like [a "most liked posts" index].
An index like that is a really nice idea. On big gaming web sites, they are called leaderboards. They are updated continuously. For a forum site, recalculating a top ten list to show "the posts with the most likes" could happen every five minutes or so. Simple code, fun to write.
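To show how little code the counting itself needs, here is a minimal sketch. The function name and the list-of-pairs input are assumptions; a real site would more likely run an equivalent GROUP BY query on a timer.

```python
from collections import Counter


def top_liked_posts(likes, n=10):
    """Return up to n (post_id, like_count) pairs, most liked first."""
    counts = Counter(post_id for _user_id, post_id in likes)
    return counts.most_common(n)


# Made-up (user_id, post_id) pairs standing in for the likes table.
likes = [(7, 101), (3, 101), (7, 102), (5, 101), (3, 103), (5, 103)]
leaders = top_liked_posts(likes, n=2)
```

Recomputing from scratch every few minutes is wasteful on paper but trivially correct, which is usually the right trade for a forum-sized table.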
Large corporation, fat checkbook? Keep adding zeroes. The bureaucracy, the sluggish thinking and the inevitable aggravation demand it. For large enough sites, the ROI is there. The data really is gold and they should pay you dearly to unlock it.
For a small, family-owned site? Some air fills or a boat ride should be more than enough.