Website Issues

What happened to the website this past week?

I’m sure most of you aren’t interested in the technical details of what happened, so I’ll give a short overview here and a fuller technical description below. On Monday morning, I started doing some maintenance on my website server, and the website went down as a result. There wasn’t a quick fix, and I ended up having to rebuild the infrastructure for serving my website in a completely different way. It took me a few days to work through the steps needed to get everything up and running, but things are back to normal and should actually be more reliable going forward.

Unfortunately, this also affects the process of issuing serial numbers. Any purchases of File Buddy this week will have their serial numbers delayed until I finish moving that process over as well, which I should get done this weekend.

Quick Update on File Buddy

I’ve had some emails about this, so I’ll add it here for anyone who is interested. I am still working on File Buddy, but life has kind of gotten away from me and I haven’t made much progress recently. My COVID-related side-project has wound down so I should have more time to focus on File Buddy going forward.

I apologize for the lack of progress and communication this year and last. I’ll try to do better on that front.

Technical Details on the Website Outage

I was running a server that automatically refreshed my TLS cert from Let’s Encrypt using ACMEv1. I wasn’t keeping up on things, and a couple of things had changed since I last looked at it. One, the server I was using released a new version that wasn’t compatible with my setup. Two, the ACMEv1 refresh process was deprecated and disabled by Let’s Encrypt in favor of alternative methods. When I noticed the certificate wasn’t getting refreshed, I tried to restart the server in case it was “just one of those things”. However, when the server tried to come back up, it couldn’t obtain the TLS certificate because the ACMEv1 process was disabled, and the server would just immediately shut down again.
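For anyone curious, a stalled renewal like this can be caught before it becomes an outage by checking how long the served certificate has left. The following is a minimal Python sketch of that idea, not what was running on my server. The function names and the 21-day threshold are assumptions made up for illustration.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a certificate's notAfter time."""
    # ssl.cert_time_to_seconds parses strings such as "Jun 11 12:00:00 2021 GMT"
    # into seconds since the epoch.
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, threshold_days: float = 21) -> bool:
    """True when the certificate expires within `threshold_days` (an assumed value)."""
    return days_until_expiry(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Fetch the notAfter string of the certificate served by `host`.

    Note: with the default context the handshake raises an SSL error if the
    certificate has already expired, which is itself a signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Run from a daily scheduled job, something like `needs_renewal(fetch_not_after("example.com"), datetime.now(timezone.utc))` would flag a certificate that has stopped refreshing while there is still time to fix it. Here `example.com` is a placeholder host.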

First, I tried to just upgrade to the new version of the server in case that might fix it. It wasn’t a straightforward migration, and at the end of it I found that I couldn’t use that server anyway, because the DNS provider I was using didn’t have an API that the server could use for automatic configuration.

Once I realized that, I decided it was time to build a more robust process for serving the website anyway. For the last few years, it was just a single server with the files on the same VM, and that was it. No redundancy, no CDN, etc. I wanted to get things into a more modern, stable, and scalable deployment. As such, I spent the next couple of days moving things over to the AWS CloudFront CDN. This means the site should be faster and more reliable going forward.