- Kelly's World- A View into the mind of Uber Geek, Kelly Adams - https://www.kgadams.net -

Recovering ‘lost’ pre-WordPress blog content from PHPNuke

I upgraded this site from PHPNuke to WordPress in 2005 [1].  The cut over was rather abrupt and poorly thought out by yours truly.  At the time, I believed I had successfully migrated all of my old content to the new environment.  But several years later (!!), I realized that for any old article with a ‘read more’ tag, only the text above the tag had made it across: in WordPress there was nothing more to read.  This bothered me, and today I decided to see what I could do to ‘fix’ it.

PHPNuke: still alive, actually

The first thing I discovered in my quest was that I still possess the data (MySQL tables) and site source for PHPNuke.  The process of migrating the data from the old site could be accomplished purely with the MySQL database tables for PHPNuke and WordPress.  But I wanted to see my old site ‘alive’ in some form, to get a sense of whether there was any point in trying to recover the old content.  This desire arguably complicated things a lot: but complicated is my middle name 😉

Getting the site itself running would mean getting PHPNuke running again.  My initial thought was that this would be my ultimate roadblock, but I was surprised to find that PHPNuke itself has been updated several times since 2005.  The most recent version I could find is 8.2 [2], which is several versions newer than what I was using back then.  And I was able to find a place with source for the version my old site was running (PHPNuke 6.8 [3]), so I have a place to go to pick up the site code if needed.

Running multiple versions of PHP: Docker phpfarm

The next problem: PHPNuke 6.8 won’t run with the PHP 7.x version I’m running on my web server.  I’ve encountered this PHP version issue before without resolving it; but now, if I wanted to see my old site again, I’d have to run multiple versions of PHP on one server.

I found that there are a couple of approaches to getting multiple versions of PHP running.  One solution is to switch from Apache to Nginx [4].  The main difference here is that Nginx doesn’t run PHP ‘in thread’: it passes PHP requests to a separate PHP process rather than running the interpreter inside the web server.  The impression I got from skimming several articles on the topic was that, although Nginx is arguably superior to Apache for high volume sites, the switch is not without pain.

I decided on an alternative path for getting multiple versions of PHP running: using Docker.  Seeing as how, at this point at least, I’m not really wanting to run my old site in a ‘production’ state, this would be acceptable.  In fact, it is (arguably) superior: the Docker container approach is easy to start, stop, or completely blow away if necessary.  And a fellow by the name of Andreas Gohr, aka ‘splitbrain’ [5] on GitHub, had kindly created a complete Docker configuration that does all of the phpfarm setup for you: docker-phpfarm [6].  This reduced my effort to get multiple PHP versions running to the following:

1. install docker on my Fedora Linux web server

dnf install docker

2. start docker

service docker start

3. pull the docker-phpfarm image

docker pull splitbrain/phpfarm:jessie

4. start container, pointing it at my old blog document root

docker run --rm -t -i -e APACHE_UID=<apache UID> -v /var/www_old:/var/www:rw -p 8053:8053 splitbrain/phpfarm:jessie

Consult the docker-phpfarm GitHub page [6] for more detailed documentation on using the phpfarm.  But the basic idea of step #4 is that I’m launching the container with port 8053 mapped, and inside the container that port is served by PHP version 5.3.29.
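
The <apache UID> placeholder in step #4 is the numeric ID of the web server account on the host.  As a rough sketch (not part of the original steps), a small shell function can look that ID up and print the full command for review before it is run.  The function name phpfarm_cmd is made up for illustration, and ‘apache’ as the account name is an assumption based on a stock Fedora Apache install.

```shell
# Print (do not run) the docker command from step #4 for a given host account,
# document root and port. phpfarm_cmd is a hypothetical helper name.
phpfarm_cmd() {
    host_user="$1"   # host account that owns the web files (assumed: apache)
    docroot="$2"     # host directory to mount as /var/www in the container
    port="$3"        # phpfarm port, e.g. 8053 for the PHP 5.3 instance

    # id -u prints the numeric UID; bail out if the account does not exist
    uid="$(id -u "$host_user")" || return 1

    printf 'docker run --rm -t -i -e APACHE_UID=%s -v %s:/var/www:rw -p %s:%s splitbrain/phpfarm:jessie\n' \
        "$uid" "$docroot" "$port" "$port"
}

# Example: phpfarm_cmd apache /var/www_old 8053
```

Printing the command rather than executing it keeps this a dry run: the output can be checked and then pasted into the shell as-is.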

After these simple steps… nothing at all worked.  Well, that’s not completely true: a very large number of PHP errors were generated.

Making the old site code work again

The sequence of steps looked something like this:

  1. Get the MySQL database connection from Docker working
    • the running Docker container is basically a ‘separate’ computer, and thus can’t access MySQL locally (i.e.: via localhost)
    • the following actions were required to correct this:
        • changing the PHPNuke ./config.php file to use an actual resolvable network address instead of ‘localhost’
        • creating ‘remote’ access IDs in MySQL for the PHPNuke user
  2. Install the ‘clean’ PHPNuke code for my site
    • although my old site was using PHPNuke 6.8, it had been heavily modified with various ‘security’ extensions
    • Instead of trying to get these all working again, I decided to get the ‘pure’ PHPNuke 6.8 code and just copy the critical components
    • I got the PHPNuke 6.8 code here: PHPNuke 6.8 [3]
    • I installed it in a new directory, and copied over my config.php, images, and themes directories
    • this was enough to get the site to come up
  3. Turn off noisy messages from PHP
    • by default PHP inserts warning/non-critical errors in-line in the rendered page.  I added the following line to the php.ini used by the docker/phpfarm configuration:
    • error_reporting = E_ALL & ~E_NOTICE & ~E_WARNING & ~E_DEPRECATED
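
For the ‘remote’ access ID in step 1, the statements involved look roughly like the output of the sketch below.  It only prints SQL; nothing is executed.  Everything passed in is a placeholder: the function name remote_access_sql, the account and schema names, the password, and the 172.17.% host pattern (Docker’s default bridge network commonly uses 172.17.0.0/16, but that should be checked on the actual host).

```shell
# Emit (do not execute) SQL that lets an application account connect from the
# Docker bridge network instead of only from localhost.
# remote_access_sql is a hypothetical helper; all arguments are placeholders.
remote_access_sql() {
    db_user="$1"     # MySQL account used by PHPNuke
    db_name="$2"     # schema holding the nuke_* tables
    from_host="$3"   # host pattern the container connects from
    password="$4"    # password for the new remote ID

    cat <<EOF
CREATE USER '${db_user}'@'${from_host}' IDENTIFIED BY '${password}';
GRANT SELECT, INSERT, UPDATE, DELETE ON ${db_name}.* TO '${db_user}'@'${from_host}';
EOF
}

# Example (review the output, then pipe it to the mysql client as an admin user):
#   remote_access_sql nuke nuke_db '172.17.%' 'change-me'
```

Limiting the host pattern to the Docker network, rather than using ‘%’, keeps the new ID from being usable from anywhere else.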

With these actions, I could see my (somewhat janky/imperfect) site again.  There was still some oddness in the header, but it was working in the sense that I could navigate through the site and see my old posts in their complete form.

But really, none of this was necessary

Seeing my old site was kind of neat, but as I mentioned earlier it wasn’t really required in order to recover the ‘below the fold’ content that had been missed in my WordPress migration.  All of that exists in the MySQL database.  The post content for PHPNuke is in a table called nuke_stories: the ‘above the fold’ content is in the field hometext in that table, and the ‘below the fold’ content is in bodytext.  A simple query showed me all the posts that had been incompletely migrated to WordPress:

SELECT title, time, hometext, bodytext FROM <schemaName>.nuke_stories WHERE (bodytext IS NOT NULL AND bodytext != '');

In my case, there were a grand total of 41 posts dated between 2003 and 2005.  I briefly thought about writing some code or possibly an SQL query to recover the missing data, but with so few entries I’m going to manually migrate the content instead.
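
For a site with more than a handful of affected posts, the SQL route might look something like the sketch below.  It is untested against a real migration and rests on several assumptions: the WordPress tables use the default wp_ prefix, each migrated post kept exactly the same title as its PHPNuke story, and the existing post_content holds nothing beyond the old hometext (it gets overwritten).  The function name merge_sql and both schema names are placeholders.  The function only prints the SQL so it can be reviewed first.

```shell
# Print SQL that previews, then applies, the merge of the two PHPNuke story
# halves into the matching WordPress post. merge_sql is a hypothetical helper.
merge_sql() {
    nuke_schema="$1"   # schema holding nuke_stories
    wp_schema="$2"     # schema holding wp_posts (default wp_ prefix assumed)

    cat <<EOF
-- Preview: which WordPress posts would be touched?
SELECT p.ID, p.post_title
FROM ${wp_schema}.wp_posts AS p
JOIN ${nuke_schema}.nuke_stories AS s ON s.title = p.post_title
WHERE p.post_type = 'post' AND s.bodytext IS NOT NULL AND s.bodytext != '';

-- Apply: intro text, then the WordPress 'more' marker, then the remainder.
UPDATE ${wp_schema}.wp_posts AS p
JOIN ${nuke_schema}.nuke_stories AS s ON s.title = p.post_title
SET p.post_content = CONCAT(s.hometext, '<!--more-->', s.bodytext)
WHERE p.post_type = 'post' AND s.bodytext IS NOT NULL AND s.bodytext != '';
EOF
}

# Example: merge_sql nuke_db wp_db
```

Running the preview SELECT first, and taking a database backup before the UPDATE, would show whether the title match picks up the expected set of posts.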