My site has been plagued by odd characters in some of my old posts for a rather long time. The most common is the seemingly random appearance of Â characters in the midst of some of my posts.
I know the basic cause: one of my WordPress / MySQL updates in ancient times (circa 2008) ended up producing a character set mismatch. I long ago fixed the cause, but all the existing bad characters persisted. Until today… I hope
I have been periodically going through small batches of old posts to find and correct (remove) the bad characters. I started doing that again today, and at some point got bored. I thought: there is certainly an easier, computer-literate way to solve this problem. So I found this site, and this little snippet of SQL:
WARNING: THIS KIND OF GLOBAL UPDATE CAN CAUSE BAD THINGS
UPDATE 'wp_posts' SET 'post_content' = replace('post_content', 'Â','')
It looked right and proper to me, but when I attempted to use it my SQL tool “MySQL Workbench” didn’t like the single comments around ‘wp_posts’ and ‘post_content’. These were supposed to be backticks (`, not ‘ ), but they weren’t necessary anyway, so I removed them… but not in the replace statement. I briefly paused before I clicked on “execute”, thinking “I should probably take a backup before I do this”. Yes, Kelly, that would have been a good idea.
I viewed my site after performing this update, and every single post, over 1300 of them, had its content replaced… with ‘post_content’. Oops.
Fortunately, my site is configured to automatically take regular backups, so I was able to recover everything up until Thursday of this week. And thankfully, my site isn’t terribly active: as far as I can tell, I only lost one legitimate comment since Thursday. And since I get all comments emailed to me automatically for approval, I had the contents of that comment in my email inbox. I was able to manually edit the SQL database and re-create it.
I then re-ran the SQL update, this time without the quotes e.g.:
UPDATE wordpress.wp_posts SET post_content = replace(post_content, 'Â','')
This appears to have worked fine. I’m slightly concerned that, in the initial query that messed things up, 1366 records were selected. But after restoring from backup, only 1322 records were found. Somewhere out there, 44 posts have “disappeared”, or at least 44 post records- maybe they were broken in some way, and so were never actually showing up in my blog.
I think I may have eliminated all of those stupid oddball characters in my posts after only seven years and one “ohmygodijustdeletedallthethings” moment. Hurray, I guess?