ETC Maryland Rescues Gaming Site


There is a site I frequent called the Christian Gamer’s Alliance (CGA).  I used to be many things on the site: server administrator, forums administrator, chat administrator…just to name a few.  Over time I faded away from the site as the clan I was in (God’s Frozen Chosen) disbanded.  The other duties went to other folks who were still active in the Alliance.

One day the president of the CGA contacted me because the server they were renting had been experiencing various problems.  Tek asked me to take a look and see what I could find.  The costs were kept to near zero with the agreement that I would do this in my spare time.  What did the CGA gain from moving from a dedicated server to shared hosting?

  1. No more having to administer their own dedicated server.
  2. They have a direct line to the owner of the server and company.
  3. They have faster performance for less than 1/3 of the price.
  4. They get all of the benefits of the Standard hosting package.

This is the story of the journey from their nearly $300/year dedicated server to my faster and less expensive ($85/year) Standard Web Hosting:

Tek got hold of me and asked if I could take a look at the now-old CGA server.  He thought that there were some issues but wasn’t sure, and figured it needed to be looked at closely.  I let him know that I would work on it in my spare time to keep the bill from getting stupid.

The first thing I did was log into the control panel for the server.  I am familiar with the control panel (Virtualmin), so I began looking around.  I immediately noticed that, according to the control panel, there was no database installed on the server.  That wasn’t correct: the site is 100% database driven, so there had to be a database somewhere.  I then requested full root access to the server so I could begin a thorough analysis.

Once inside the deep innards of the machine I saw some concerning things:

  1. The system had several third-party software repositories activated.  These caused large changes to the default configuration that were not supported by the Linux version that was installed on the server.
  2. The database program was still nowhere to be found.  Even after checking the third-party areas I still could not locate the database program.
  3. Security updates had not been run on the machine in a very long time.
  4. Backups were only done onto the local machine.  There was no external storage for backups for their website data.
  5. I could not be sure if the databases were being backed up at all due to my inability to find the database program that ran them.
  6. This machine was running an old Linux version that would reach end of life soon.  Would I even be able to migrate this to either a reloaded version of this server or another one?  I could not say with confidence that I would.

I dug into the scheduled actions and found a database backup that ran every night.  Using that information, I found where the database was.  It had been installed outside of the control panel’s purview, which is why I could not find it at first.  Now that I knew where the database was, I looked into the scheduler some more.  This backup script placed the database backups inside a folder that the control panel system had access to.  I then found where the control panel was doing its backups to…a folder locally on the server hard disk.  There was nothing set up for secured, offsite backups.  I then made a manual backup of the database and the rest of the website files and pulled them down to my network here at the office.
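For anyone who wants to repeat this kind of hunt, here is a minimal sketch of scanning scheduler entries for backup jobs.  It assumes the scheduled actions are ordinary user crontab lines (five time fields followed by a command); the keyword list and the function name are my own illustration, not what was on the CGA server.

```python
import re

# Hypothetical hints: words that commonly show up in database backup jobs.
BACKUP_HINTS = re.compile(r"mysqldump|backup", re.IGNORECASE)

def find_backup_jobs(crontab_text):
    """Return (schedule, command) pairs for cron lines that look like backups."""
    jobs = []
    for line in crontab_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and environment assignments such as MAILTO=root.
        if not line or line.startswith("#") or re.match(r"^\w+\s*=", line):
            continue
        # Assumes user-crontab layout: minute hour day month weekday command.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        schedule, command = " ".join(fields[:5]), fields[5]
        if BACKUP_HINTS.search(command):
            jobs.append((schedule, command))
    return jobs
```

Feeding it the output of `crontab -l` returns the schedule and full command of each likely backup job, and the command usually reveals both the database binary path and the backup destination.  System-wide crontab files carry an extra user column, so this sketch would need adjusting for those.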

Next, it was time to address the lack of updates.  I ran a command to clear the system of cached updates and reran the update command.  In Linux, the update command scans the system for updates, lists them all, AND tells you which software repository each one is coming from.  The list was more than 4 pages long and it took me a while to read through all of it.  Once I had gone through each line, I knew some of those repositories would cause conflicts with the base system architecture and possibly take the machine offline.  I already had some dependency conflicts as it was, and this would have only made the problem worse.  I then had to remove some of the conflicting repositories, remove the software those conflicting repositories had installed, and then fix the existing conflicts.  Once I had repaired the conflicts, I was able to reinstall the supported versions of the software packages that the system needed to support the website and its community.
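Reading four pages of updates line by line is tedious, so here is a rough sketch of how the list could be grouped by repository instead.  It assumes the update listing has three whitespace-separated columns (package, version, repository); that layout, the repository names, and both function names are assumptions for illustration, since real output differs between package managers.

```python
from collections import defaultdict

def group_updates_by_repo(listing):
    """Map repository name -> sorted list of package names awaiting update."""
    by_repo = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Keep only rows with exactly three columns: name, version, repository.
        if len(fields) != 3:
            continue
        name, _version, repo = fields
        by_repo[repo].append(name)
    return {repo: sorted(names) for repo, names in by_repo.items()}

def from_unexpected_repos(grouped, trusted):
    """Return only the groups whose repository is not in the trusted set."""
    return {repo: names for repo, names in grouped.items() if repo not in trusted}
```

Passing the grouped result and a set of the distribution’s own repositories to `from_unexpected_repos` leaves just the packages that third-party repositories want to replace, which are the ones worth reading closely.  Header or status lines that happen to have three words would need to be filtered out first.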

With the software conflicts fixed, I could now try another update cycle.  I requested a maintenance window because I knew from the first test run that this would require a reboot.  I got the maintenance window and ran the updates.  There were no conflicts, and the system was now up to date.  I rebooted the server, the client brought the site back online, and everything worked as before…whew.

All of this previous work only took a few hours to complete.  The next section is what took much longer to get sorted out.

I first tried to do a test restore of the backup onto my web server.  The website files restored OK, but the restore crashed my web server (it is called Apache) in the process.  I quickly deleted the files and set about figuring out what went wrong.  It turns out the Apache on the old system had some interesting configuration calls in it that my server did not like.  I then redid the control panel backup without the Apache configuration backups.  This led to the website files restoring to my server without incident.  The database was another matter entirely.

I tried to restore the database for the website and got a syntax error, which caused the restore to fail.  I then tried using a different backup program to restore the database to my server, but it gave a different error that also caused the restore to fail.  I then tried a test restore on the originating server (into a different database), and when that also failed I knew there were big problems.  I pulled down a copy of the database to my network and ran some diagnostics on it.  The error rate was 25%.  That suggested about a quarter of the database had some kind of corruption and that the integrity of the database was in question.

There was one weird thing though…the site ran fine.  How can there be 25% corruption while the website continues to function OK?  I had to go to the website’s software manufacturer for an answer.  I was informed that this is a common error when the database tables use InnoDB instead of MyISAM, because the standard tools run in MyISAM mode by default rather than InnoDB mode.  Armed with this information, I re-ran the diagnostics in the correct mode and found the reason for the failures.  Somehow the header of the database file got encoded the wrong way, and this made the database program on my server reject the file.

I fixed the encoding issue and tried another database backup and restore.  This time I got a series of syntax and duplication errors and the restore failed again…. grrrrr.  I cleared the database on my server but left the empty database shell in place and did another restore.  An hour later, success!  I then had the client move DNS to point to my server to see how things were working.  Five minutes later, “an unexpected database error has occurred” was all I got when I went to the website.  Seriously?
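To give an idea of what checking a dump file’s header can look like, here is a small sketch that looks for a byte-order mark (BOM) at the start of the file and rewrites the contents as plain UTF-8.  A stray BOM is only one way a file header can end up mis-encoded; I am using it as an assumed example, and the function names are hypothetical.

```python
import codecs

# Byte-order marks to look for, longest first so UTF-32 is not mistaken for UTF-16.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(data):
    """Return (encoding, bom_length) if data starts with a known BOM, else (None, 0)."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0

def to_plain_utf8(data):
    """Re-encode dump bytes as UTF-8 without a BOM; data without a BOM passes through."""
    encoding, skip = detect_bom(data)
    if encoding is None:
        return data
    return data[skip:].decode(encoding).encode("utf-8")
```

Reading the dump in binary mode and passing the bytes through `to_plain_utf8` before the import is one way to rule this class of problem out.  It does nothing for a file that is mis-encoded without a BOM.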

Here is where the disjointed MySQL install took its toll.  I figured out that the database permissions were totally wrong because the database program had been installed outside of the control panel.  I had to find the root password of the database program and then change it to match the main administration username and password inside the control panel.  Once I got those to match, I was finally able to see the database program inside the control panel.  I was then able to “grab” the database into the control panel under the client’s account, and FINALLY I was able to get a good backup from the old server.  I then took this backup, uploaded it to my server, and began the import…which failed again.  It turns out the control panel of the old server was causing the encoding problems when it did a backup of the database.  I then did another backup of ONLY the website files and other critical configuration details, and then I manually exported a database backup from the old server.  I restored the website files from this last backup and then, using my control panel, imported the final database backup into my server.  An hour later the database was finally in its new home.  Ten minutes later DNS had caught up and the site was on my server, running perfectly.
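A manual export like the one described above is typically done with `mysqldump`.  The sketch below only builds the command so it can be reviewed before running; the database name, user, output file, and character set are placeholders I made up, and it assumes the password comes from a MySQL option file rather than the command line.

```python
import shlex

def build_dump_command(database, out_file, user="backup_user"):
    """Build a mysqldump argument list for a manual export of one database."""
    return [
        "mysqldump",
        f"--user={user}",                   # hypothetical account name
        "--single-transaction",             # consistent snapshot for InnoDB tables
        "--default-character-set=utf8mb4",  # assumed charset; match the real one
        f"--result-file={out_file}",        # mysqldump writes the file itself
        database,
    ]

# Placeholder names for illustration only.
cmd = build_dump_command("cga_forum", "cga_forum.sql")
print(shlex.join(cmd))
```

Letting `mysqldump` write the file via `--result-file` keeps the dump from passing through a shell redirect or a control panel that might re-encode it.  The list can be handed to `subprocess.run(cmd, check=True)` once the names are correct for the server in question.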