I got a call from a friend I had not been in contact with for quite some time. He asked if I could help him with his website. After a good conversation on the phone, it turned out that what he really needed help with was the server that ran the website. He had noticed some performance issues and grew concerned about the overall health of his website, The Christian Gamers Alliance. I agreed to take a look under the hood and let him know what I found. Little did I know the challenge I had stepped into. I told him my rates and he asked if there was a way to lower the bill. Sure: I would work on this only in my spare time, which he agreed to.
The first thing I did was log into the control panel for the server. I am familiar with the control panel (Virtualmin), so I began looking around. I immediately noticed that, according to the control panel, there was no database installed on the server. That could not be correct: the site is 100% database driven, so there had to be a database somewhere. I then requested full root access so I could begin a thorough analysis of the server.
Once inside the deep innards of the machine I saw some concerning things:
- The system had several third-party software repositories activated. These caused large changes to the default configuration that were not supported by the Linux version that was installed on the server.
- The database program was still nowhere to be found. Even after checking the third-party areas I still could not locate the database program.
- Security updates had not been run on the machine in a very long time.
- Backups were only done onto the local machine. There was no external storage for backups for their website data.
- I could not be sure the databases were being backed up at all, due to my inability to find the database program that ran them.
- This machine was running an old Linux version that would reach end of life soon. Would I even be able to migrate the site to a reloaded version of this server or to another one? I could not say with confidence that I would.
First, I took a look at their site activity and figured that they really did not need a dedicated server. I also knew from the website metrics that they would easily fit into my shared hosting, which would save them a ton of money each year. I informed the client of this and told them they had plenty of options. Whichever they chose, I would first get this server into a condition where we could make a good, tested, working backup. After that they could have their existing server reloaded with a newer operating system with many more years of service, ask their provider for a lower level of service (like shared hosting or a virtual private server), or move to one of my products. The client took some time, discussed it with others, and decided to see about using my shared hosting service.
I dug into the scheduled actions and found a database backup that ran every night. Using that information, I found where the database was. It had been installed outside of the control panel’s purview, which is why I could not find it at first. Now that I knew where the database was, I looked further into the scheduler. This backup script placed the database backups inside a folder that the control panel had access to. I then found where the control panel was sending its own backups…a folder on the server’s local hard disk. There was nothing set up for secured, offsite backups. I then made a manual backup of the database and the rest of the website files and pulled them down to my network here at the office.
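
If you want to try the same kind of hunt yourself, here is a minimal Python sketch of the idea: read through the scheduler entries and pull out anything that looks like a backup job. It is not the exact procedure I used. It assumes the schedule is in the ordinary per-user crontab format (five time fields, then the command, or an `@daily`-style shortcut), and the keyword list and the function name `find_backup_jobs` are my own illustrative choices.

```python
# Keywords that suggest a scheduled job is a backup (illustrative, not exhaustive).
BACKUP_KEYWORDS = ("mysqldump", "backup", "dump")


def find_backup_jobs(crontab_text):
    """Return (schedule, command) pairs for cron lines that look like backups.

    Assumes per-user crontab syntax: five time fields followed by the command,
    or an '@daily'-style shortcut followed by the command.
    """
    jobs = []
    for raw_line in crontab_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment

        if line.startswith("@"):
            fields = line.split(None, 1)
            if len(fields) < 2:
                continue
            schedule, command = fields
        else:
            fields = line.split(None, 5)
            if len(fields) < 6:
                continue  # environment setting (e.g. MAILTO=...) or malformed line
            schedule = " ".join(fields[:5])
            command = fields[5]

        if any(keyword in command.lower() for keyword in BACKUP_KEYWORDS):
            jobs.append((schedule, command))
    return jobs
```

Feed it the text of a crontab and each hit tells you when the job runs and which script or command to open next. A script name is only a lead, though. You still have to read the script to see where it connects and where it writes its files.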
Next, it was time to address the lack of updates. I ran a command to clear the system of cached updates and reran the update command. In Linux, the update command scans the system for updates, lists them all, AND tells you which software repository each one is coming from. The list was more than 4 pages long and it took me a while to read through all of it. Once I had gone through each line, I knew some of those repositories would cause conflicts with the base system architecture and possibly take the machine offline. I already had some dependency conflicts as it was, and this would only have made the problem worse. I then had to remove some of the conflicting repositories, remove the software those repositories had installed, and fix the existing conflicts. Once I had repaired the conflicts, I was able to reinstall the supported versions of the software packages that the system needed to support the website and its community.
With the software conflicts fixed, I could try another update cycle. I requested a maintenance window because I knew from the first test run that this would require a reboot. I got the maintenance window and ran the updates. No conflicts, and the system was now up to date. I rebooted the server, the client brought the site back online, and everything worked as before…whew.
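
Reading four pages of pending updates by eye is slow, so here is a rough Python sketch of how that review could be sped up by grouping the list by repository. I am assuming a three-column listing (package, version, repository) like the one yum-style tools print. The function names and the idea of a "trusted" repository set are hypothetical, and this is not the command I actually ran.

```python
from collections import defaultdict


def group_updates_by_repo(listing):
    """Group pending updates by the repository that provides them.

    Assumes each update line has three whitespace-separated columns:
    package.arch, version, repository. Other lines are skipped.
    """
    by_repo = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # blank line, header, or wrapped text
        package, version, repo = parts
        # A real entry has a name.arch package and a version starting with a digit.
        if "." not in package or not version[0].isdigit():
            continue
        by_repo[repo].append(package)
    return dict(by_repo)


def flag_unsupported(by_repo, trusted_repos):
    """Return only the repositories that are not in the trusted set."""
    return {
        repo: packages
        for repo, packages in by_repo.items()
        if repo not in trusted_repos
    }
```

Pass the saved update listing to `group_updates_by_repo`, then call `flag_unsupported` with the set of repositories your distribution actually supports. Whatever remains is the short list of packages worth reading line by line before you let the update run.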
All of this previous work only took a few hours to complete. The next section is what took much longer to get sorted out.
I first tried a test restore of the backup onto my web server. The website files restored OK, but the restore crashed my web server (it is called Apache) in the process. I quickly deleted the files and set about figuring out what went wrong. It turned out the Apache on the old system had some interesting configuration calls in it that my server did not like. I then redid the control panel backup without the Apache configuration backups. This led to the website files restoring to my server without incident. The database was another matter entirely.
I tried to restore the database for the website and got a syntax error that caused the restore to fail. I then tried a different backup program and tried to restore the database to my database program, but it gave a different error that also caused the restore to fail. I then tried a test restore on the originating server (into a different database), and when it also failed I knew there were big problems. I pulled down a copy of the database to my network and ran some diagnostics on it. The error rate was 25%. That meant at least 25% of the database had some kind of corruption and the integrity of the database was in question.
There was one weird thing though…the site ran fine. How could there be that much corruption while the website continued to function OK? I had to go to the website’s software manufacturer for an answer. I was informed this is a common error when the database tables use InnoDB instead of MyISAM: the standard tools run in MyISAM mode by default, not InnoDB mode. Armed with this information, I re-ran the diagnostics in the correct mode and found the reason for the failures. Somehow the header of the database file had been encoded the wrong way, and this made the database program on my server reject the file.
I fixed the encoding issue and tried another database backup and restore. This time I got a series of syntax and duplication errors and the restore failed again…grrrrr. I cleared the database on my server but left the empty database shell in place and did another restore. An hour later, success! I then had the client move DNS to point to my server to see how things were working. Five minutes later, “an unexpected database error has occurred” was all I got when I went to the website.
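
For the curious, here is a small Python sketch of one way a badly encoded dump header can be spotted and repaired. I am assuming the problem shows up as a byte-order mark at the very start of the file (UTF-8 with a BOM, or UTF-16), which is only one of several ways a header can go wrong and may not be exactly what happened on this server. The function names are my own.

```python
import codecs

# Byte-order marks to look for, paired with the codec that decodes the file
# and strips the mark. UTF-32 is not handled in this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom(data):
    """Return the codec name for a leading byte-order mark, or None if absent."""
    for bom, codec_name in BOMS:
        if data.startswith(bom):
            return codec_name
    return None


def normalize_dump(data):
    """Return the dump re-encoded as plain UTF-8 with no byte-order mark.

    Data without a recognized byte-order mark is returned unchanged.
    """
    codec_name = detect_bom(data)
    if codec_name is None:
        return data
    return data.decode(codec_name).encode("utf-8")
```

Read the dump in binary mode, pass the bytes through `normalize_dump`, and write the result to a new file so the original stays untouched. If `detect_bom` returns `None`, the header is not the problem (at least not this kind of problem) and the search has to continue elsewhere.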
Here is where the disjointed MySQL install took its toll. I figured out that the database permissions were totally wrong because the database program had been installed outside of the control panel. I had to find the root password of the database program and then change it to match the main administration username and password inside the control panel. Once I got those to match, I was finally able to see the database program inside the control panel. I was then able to “grab” the database into the control panel under the client’s account, and FINALLY I was able to get a good backup from the old server. I took this backup, uploaded it to my server, and began the import…which failed again. It turned out the control panel of the old server was causing the encoding problems when it backed up the database. I then did another backup of ONLY the website files and other critical configuration details, and manually exported a database backup from the old server. I restored the website files from this last backup and then, using my control panel, imported the final database backup into my server. An hour later the database was finally in its new home. Ten minutes later DNS had caught up and the site was on my server, running perfectly.
The client went from paying north of $400/year for a dedicated server that they were hardly using and had to manage themselves to being hosted on my server. The Christian Gamers Alliance is currently on my Standard Hosting Package. They are relieved of the burden and expense of their own dedicated server. If you are having to manage the underlying software AND your website is performing poorly, contact me. You can reach me at 301-524-5271, or Twitter @etc_md, or my business Facebook page.