I dodged a very large (metaphorical) bullet yesterday. I’m not in the habit of trying to find bullets to dodge, but this one found me because it turns out I didn’t plan as well as I thought I had.
It all started yesterday morning when a client asked me to make some batch updates to the data in his MySQL database. So, I wrote a script to make the required updates. Before running it, I made a backup of the database in case something went wrong. Everything was good so far. I ran the script, checked the database to see if anything looked amiss, and called it a night (oh yeah, this was around 11:00pm last night).

This morning, I got an urgent text message from the client saying that the data didn't look quite right. No problem, I thought; I made a backup of the database last night, I can fix this. But nooooo, that backup wasn't around any more. Why? Because I had saved it to the same location the regular nightly backup writes to, so by the time I needed it this morning, it had already been overwritten. Luckily, the eventual fix was very simple once I figured out where my script had gone wrong. It could have been really ugly, though.
So, here’s to lessons learned:
- Don’t do production database work late at night. The root of the problem was that I misread the client’s original email and wrote my script incorrectly because of it. Part of that failing was the late hour and long day that preceded it.
- Plan, plan, plan and re-plan. Thinking my one backup was sufficient was my second mistake. When working with production data, you can absolutely not be "too backed up". Keep pre-change backups somewhere the regular backup rotation can't touch them (see the sketch after this list).
- Test your scripts on non-production systems first. This client doesn't have a dedicated development system, but I do have a few-weeks-old copy of the database on my development machine. Another mistake I made was being lazy and not testing these changes on that local copy before applying them to production.
- Have a plan in place to verify the database changes as soon as they’re made. My cursory inspection of the data didn’t reveal the actual problems that occurred. If I had done this when the client was available to test the changes, we’d have been able to immediately restore the database and assess what went wrong.
- Nothing is ever as simple as it seems. These updates seemed like a walk in the park and lulled me into not taking the precautions I knew I should have.
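For what it's worth, here's roughly what that pre-change backup should have looked like. This is a minimal sketch, assuming mysqldump is installed and credentials live in ~/.my.cnf; the directory and database name are made up for illustration. The point is simply that the dump gets a timestamped name in its own directory, so the nightly job can never clobber it.

```python
#!/usr/bin/env python3
"""Sketch: write a timestamped pre-change dump to its own directory,
separate from the nightly backup location. Paths and database name
below are hypothetical."""

import subprocess
from datetime import datetime
from pathlib import Path

BACKUP_DIR = Path("/var/backups/pre_change")  # assumption: not touched by the nightly job
DB_NAME = "client_db"                          # hypothetical database name

def backup_before_change() -> Path:
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dump_file = BACKUP_DIR / f"{DB_NAME}_pre_change_{stamp}.sql"
    # mysqldump picks up credentials from ~/.my.cnf here; adjust as needed
    with dump_file.open("w") as fh:
        subprocess.run(["mysqldump", DB_NAME], stdout=fh, check=True)
    return dump_file

if __name__ == "__main__":
    print(f"Backup written to {backup_before_change()}")
```

Run something like this immediately before any change script, and the "oops, the nightly backup ate my safety net" problem goes away.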
So, in the end, my client has his data back where it belongs and I have a few things I've learned never to do again. My biggest embarrassment in this whole thing is that I consider myself a senior developer. I've been writing web apps for almost 10 years across Oracle, MS SQL, MySQL and PostgreSQL databases, and I should have known better than to take shortcuts. My only excuse is that I'm human and I can only try to learn from my mistakes.