Wikimedia Database

This past week, a guy signed up to edit the Flashlight Wiki, and he even sent me an email asking to be confirmed, like the instructions say to do. I went ahead and confirmed him for editing and he went to work on the article about Maglites. Apparently on Wikipedia, there was some information being added that was kind of critical of Maglites, and some people said the information didn’t have legitimate sources since it was from flashlight discussion forums. Because it was still good information, this guy thought it would be good to move it to Flashlight Wiki, where sourcing isn’t quite as rigorous. I didn’t have a problem with that.

One of the things they have on Wikipedia that is not included in a stock MediaWiki installation is footnotes. Even if you install the Cite extension (which I have done), you can’t use the same format they use on Wikipedia without installing additional extensions and templates. The Wikipedia version is a fill-in-the-blanks template with fields for the different parts of a source, like the author, date, title, etc. The template then takes that data and puts the footnote in the correct format. To me that has always been a pain, but I’m starting to come around on it.

To get that template to work, I had to turn on the ParserFunctions extension, which ships with MediaWiki but is not enabled by default. So all I had to do was enable it in the LocalSettings.php file. I also enabled its string functions. The guy added the cite web template and everything worked great.

But then today I was doing my weekly backup of the MediaWiki database. Usually when I export the database from the phpMyAdmin control panel, the gzipped archive runs about 10 MB. I also sometimes create a zip file of the whole MediaWiki installation, which runs about 25 MB and includes all of the pictures, extensions, skins, etc. Though I’ve never needed to do a recovery, these two files should let me recover the entire website. The database contains all of the text content, users, edit history, etc., while the zip archive has the settings and pictures.
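For anyone who would rather script the weekly export than click through phpMyAdmin, here is a minimal sketch in Python. It assumes shell access to the host and a working mysqldump client, which many shared hosting plans do not offer. The database name, user name, and file naming scheme are made up for illustration.

```python
import gzip
import subprocess
from datetime import date


def build_dump_command(database, user):
    """Return the mysqldump argument list for a single database.

    --single-transaction takes a consistent snapshot of InnoDB tables
    without locking them; --password with no value makes mysqldump
    prompt for the password instead of storing it in the script.
    """
    return [
        "mysqldump",
        "--single-transaction",
        f"--user={user}",
        "--password",
        database,
    ]


def backup_name(database, day):
    """Build a dated file name such as wikidb-2024-01-07.sql.gz."""
    return f"{database}-{day.isoformat()}.sql.gz"


def run_backup(database, user, day=None):
    """Dump the database and write it to a gzipped file; return its name."""
    day = day or date.today()
    target = backup_name(database, day)
    dump = subprocess.run(
        build_dump_command(database, user),
        stdout=subprocess.PIPE,
        check=True,  # raise if mysqldump exits with an error
    )
    with gzip.open(target, "wb") as archive:
        archive.write(dump.stdout)
    return target


if __name__ == "__main__":
    # "wikidb" and "wikiuser" are hypothetical names, not the real ones.
    print(run_backup("wikidb", "wikiuser"))
```

Keeping the date in the file name makes it easy to compare archive sizes from week to week, which is how a sudden jump in size gets noticed in the first place.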

This time, when the database finished downloading, it was 125 MB, more than 10 times the size of the last archive. This is not a good thing. The only change I could think of was me enabling ParserFunctions and the rest of the stuff to get better footnotes. I looked up ways to compact the database, but there didn’t seem to be much that could be done. So I decided to revert the database to my backup. I went through the list of recent edits, copied the source of all the articles that had changed since the last backup, and saved it in text files.

In phpMyAdmin, I tried importing the backup of the database, but it gave me an error because the database already existed. So I figured I would rename the current database and then import the backup under the correct name. I was able to do the rename, but for some reason the database didn’t show up in my list of databases in the MySQL Databases control panel. I’m not sure what is up with that, but the renamed database was still there under the new name. The bad thing is that I could delete the database if it showed up in the list, but not if it didn’t. So instead I deleted the renamed database, which for some reason it let me do, and replaced it with the contents of the backup database. When I tried to import the backup under the correct database name, it told me the database already existed. But since I had the backup installed under the new name, I just went ahead and changed the wiki’s LocalSettings.php file to use the renamed database. And that worked out perfectly. Then I went in and re-edited the articles that had changed since the last backup, re-uploaded the two pictures I had added since then, and reformatted the footnotes using regular old cite instead of cite web.

Anyway, it’s all working, and I’ve also reduced the size of my web host account now that the database is smaller. Since the old database was still in phpMyAdmin, I did go in and delete all of its tables at least.

4 thoughts on “Wikimedia Database”

  1. This morning the empty database showed up in the MySQL Databases control panel, so I can delete it now and hopefully copy the backup into a new version of it.

  2. Found out what was going on today. The size had jumped up again from the usual 10-12 MB to 72 MB this week. In phpMyAdmin I looked at the list of tables and the objectcache table was huge. I looked into this and found out that this table helps reduce load on the server by caching objects (pages, I guess, which are put together from many different parts of the database and installed files), and isn’t really necessary. You can turn off caching if you want. Or you could delete that table, but I didn’t want to do something that could corrupt the database. I found an extension called PurgeCache that gives the admin (if he has developer privileges, which I had to grant to myself) a page with a button that purges the cache. I pressed the button. Then I did another backup of the database. 3.8 MB. A 95% reduction. Nice! And the site still seems to run just fine.

  3. Since I seem to have the database size problem solved (or at least fixable), I went ahead and reinstalled the cite web template that I had removed earlier. Then I fixed all the references in that article to use the cite web format. Not worth the trouble, but I didn’t want a contributor’s efforts to go to waste. It could be that the template had nothing to do with the database size exploding. I’d still like to know what is going on. It could be that with two different skins involved, there are a lot more objects to cache. That was what messed up the page caching feature previously (if a page was cached with the mobile skin, that is what a non-mobile user would see next time, and vice versa).
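The size check from the second comment (finding the oversized table, then working out how much the purge saved) can be sketched in Python. The query uses the standard MySQL information_schema.tables view. The %s placeholder assumes a driver such as PyMySQL, and the sample rows are made-up numbers for illustration, not measurements from this wiki. The query also reports uncompressed table sizes, so its numbers will not match the gzipped export sizes quoted above.

```python
# Lists every table in one schema with its data + index size in MB,
# largest first. Pass the schema (database) name as the parameter.
TABLE_SIZE_SQL = """
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
WHERE table_schema = %s
ORDER BY size_mb DESC
"""


def largest_tables(rows, top=3):
    """rows: iterable of (table_name, size_mb); return the biggest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]


def percent_reduction(before_mb, after_mb):
    """Percentage saved when a size shrinks from before_mb to after_mb."""
    return round(100 * (before_mb - after_mb) / before_mb, 1)


if __name__ == "__main__":
    # Hypothetical result rows, as a driver might return them.
    sample_rows = [
        ("text", 8.0),
        ("objectcache", 60.0),
        ("revision", 2.0),
        ("page", 1.0),
    ]
    print(largest_tables(sample_rows, top=1))
    # Export sizes from the comment: 72 MB before the purge, 3.8 MB after.
    print(percent_reduction(72, 3.8))
```

With the export sizes from the comment, the reduction works out to 94.7%, which matches the roughly 95% quoted.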
