Breaking Up: Moving Blog Engines

Like relationship break-ups, leaving a blog engine (for another, no less!) is not easy. There’s pleading (“Stick with Blogger, it’ll change to meet your needs, I swear!”), denial (“Database driven blogs are slower than statically published blogs.”), and crying (“Why, why, oh why is this so hard? *sob*”). Not to mention the “grass is greener” syndrome: “only once I switch to Wordpress will I truly be happy – it will have every single feature I need!”.

That’s been my spare-time-life on and off over the past few weeks, and now you all get to benefit from my pain. Here are the requirements I had to move to a new blogging platform:

  1. Self hosted – while not a hard-and-fast requirement, I really want something I can tweak if necessary. And it can’t be ASP.NET-based, as much as I wanted it to be, because I’m not going to pay for ASP.NET hosting. As a developer I also get warm fuzzies about being in 100% absolute control. (Okay, really, as a human being I get warm fuzzies about being in 100% absolute control…)
  2. Old permalinks need to 301 redirect to new permalinks – I don’t have a lot of subscribers (but I love you all, so very much) and get a very modest number of hits every day. I really want/need those old links to point to the new ones. It needed to be seamless.
  3. Full, or nearly-full, import – I wanted to bring as much information as possible from my Blogger blog to my new blog platform. That means no “import from RSS” features for me, I wanted comments to come over, etc.
  4. Well used/tested, and feature rich – I wanted a blog engine that’s been proven. I don’t want to be overwhelmed with spam, or anything else. Along those lines, if a platform is going to be mature, it should have plenty of features. I want to be able to do things like moderate comments (if I ever feel like it), write an occasional article instead of a blog post, etc.

Given those requirements, I decided to switch to Wordpress, and here’s how I did it. (Spoiler: it took longer than I thought it would, required some custom development, and worked out okay in the end.)

I would divide my tasks into three components: exporting from Blogger, importing to Wordpress, and “general configuration” (permalink redirects, etc.)

Export from Blogger

I’ve actually already posted on this here. I decided on BlogML as my “offline” format for my blog, and wrote a PowerShell script to export from Blogger and output the BlogML format. So step one, export to BlogML using my script.

Figure 1. Exporting from Blogger to a local BlogML file.

Import to Wordpress

Update: Rob Walling from softwarebyrob.com has updated the import module to work with Wordpress 2.3 – get the updated version from my tools page. Thanks Rob!

This took the most work. The first thing I had to do was write a Wordpress import module for BlogML, since none existed. You can download the import module here. IMPORTANT NOTE: This requires the Php.XPath library available from SourceForge. Go there and download v3.5, making sure to upload it to the same directory as “blogml.php” (typically something like “/wp-admin/import/”). It is a single .php file. You’ll need to do this at least until I figure out how licensing works between BlogML and Php.XPath–once I get that figured out I’ll be adding this import module to the BlogML CodePlex project.

My steps were as follows:

  1. Back up my Blogger files (on my FTP server) to my local hard drive.
  2. Log into my Wordpress install and update the URL to be “http://www.aaronlerch.com/blog” (instead of the intermediate location of “http://www.aaronlerch.com/wp”). Note that as soon as I saved this setting, it tried to redirect me to a location that doesn’t exist yet. It’ll fail, that’s okay.
  3. Rename the “/blog” directory with my Blogger files to something else, and rename my intermediate Wordpress directory “/wp” to “/blog”.
  4. Change the encoding in my BlogML output file. This sucks, but the XML class I used (Php.XPath) requires the XML to be in UTF-8 format. My Blogger->BlogML export script generates it using the .NET APIs, which use UTF-16 (Unicode) encoding by default. Instead of spending a lot of time tweaking the BlogML output (which wouldn’t benefit anybody but myself), I just fired up Notepad2 and changed the file encoding to UTF-8 (File -> Encoding -> UTF-8), making sure to also update the “encoding” attribute of the “<?xml” header tag from “utf-16” to “utf-8”.
    Figure 1. Be sure to change the “encoding” attribute of your XML file when changing the file encoding.
    Figure 2. Before conversion – “Unicode BOM” (BOM = Byte Order Mark)
    Figure 3. After conversion – “UTF-8”
  5. Import my BlogML export from Blogger into Wordpress (very straightforward–just try it). On the “completed” page of the import, it lists the posts that were imported, and at the bottom it gives a link to a permalinkmap.csv file which contains a mapping of old permalinks to new permalinks. I downloaded that file and saved it for use later.
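
If you don’t have Notepad2 handy, the encoding change in step 4 can also be scripted. What follows is a minimal Python sketch, not something I ran during the migration: the function name and file names are hypothetical, and it assumes the export starts with a standard “<?xml … encoding="utf-16"?>” declaration.

```python
import re
from pathlib import Path


def convert_utf16_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a UTF-16 XML file as UTF-8 and update its XML declaration."""
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    text = src.read_text(encoding="utf-16")
    # Touch only the encoding attribute of the leading <?xml ... ?> declaration.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=["\'])utf-16(["\'])',
        r"\g<1>utf-8\g<2>",
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    # Written without a BOM, which is what a plain "UTF-8" save produces.
    dst.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical file names; substitute your own export.
    source = Path("blogml-utf16.xml")
    if source.exists():
        convert_utf16_to_utf8(source, Path("blogml-utf8.xml"))
```

Writing to a separate output file keeps the original export around in case the import goes sideways.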

General Configuration

Finally, I needed to set up my redirects. Using the permalinkmap.csv file I downloaded, I ran the following PowerShell one-liner to create a text file containing entries that I copied into my “.htaccess” file (my hosting company uses LAMP):

import-csv permalinkmap.csv |% { $oldurl = ($_.OldPermalink -replace '^http://www\.aaronlerch\.com', ''); "redirect 301 $oldurl " + $_.NewPermalink } | out-file htaccessredirectdata.txt

It simply creates a list of redirect commands that end up looking something like this:
redirect 301 /blog/2007/08/uri.html http://www.aaronlerch.com/blog/2007/08/21/uri-purity/
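
If PowerShell isn’t available, the same transformation is easy to sketch in Python. This is an illustrative equivalent rather than what I actually ran. The column names OldPermalink and NewPermalink come from the one-liner above; the function name and the assumption that the CSV has a header row are mine.

```python
import csv
import io
from typing import List

SITE_ROOT = "http://www.aaronlerch.com"


def redirect_lines(csv_text: str, site_root: str = SITE_ROOT) -> List[str]:
    """Turn permalink-map CSV text into Apache 'redirect 301' lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        old_path = row["OldPermalink"]
        # The redirect directive wants a URL path, so drop the scheme and host.
        if old_path.startswith(site_root):
            old_path = old_path[len(site_root):]
        lines.append(f"redirect 301 {old_path} {row['NewPermalink']}")
    return lines


if __name__ == "__main__":
    sample = (
        "OldPermalink,NewPermalink\n"
        "http://www.aaronlerch.com/blog/2007/08/uri.html,"
        "http://www.aaronlerch.com/blog/2007/08/21/uri-purity/\n"
    )
    print("\n".join(redirect_lines(sample)))
```

The function takes text instead of a file name so it can be tried on a couple of rows before pointing it at the real permalinkmap.csv.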

Finally, I needed to set up redirects for my archive files. Those were all stored in a single “archive” subfolder by Blogger, which I had locally as part of my original backup. This PowerShell one-liner output more redirect commands for me to copy into my .htaccess file:

dir archive | select Name |% { if ($_.Name -match '^([0-9]{4})_([0-9]{2})_.*') { "redirect 301 /blog/archive/" + $_.Name + " http://www.aaronlerch.com/blog/" + $matches[1] + "/" + $matches[2] + "/" } } | out-file archivehtaccessdata.txt

This short command takes a file named “2007_08_01_archive.html” and produces the following redirect line:
redirect 301 /blog/archive/2007_08_01_archive.html http://www.aaronlerch.com/blog/2007/08/
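
The archive mapping can be sketched the same way in Python. Again, this is an illustration under assumptions, not the command I used: the function names are hypothetical, and file names that don’t follow the “YYYY_MM_…” pattern are simply skipped.

```python
import re
from pathlib import Path
from typing import List, Optional

# Blogger archive files look like "2007_08_01_archive.html".
ARCHIVE_NAME = re.compile(r"^([0-9]{4})_([0-9]{2})_.*")
BLOG_ROOT = "http://www.aaronlerch.com/blog"


def archive_redirect(filename: str, blog_root: str = BLOG_ROOT) -> Optional[str]:
    """Build one redirect line, or return None if the name doesn't match."""
    match = ARCHIVE_NAME.match(filename)
    if match is None:
        return None
    year, month = match.groups()
    return f"redirect 301 /blog/archive/{filename} {blog_root}/{year}/{month}/"


def archive_redirects(archive_dir: str) -> List[str]:
    """Build redirect lines for every matching file in a local archive folder."""
    candidates = (archive_redirect(p.name) for p in sorted(Path(archive_dir).iterdir()))
    return [line for line in candidates if line is not None]


if __name__ == "__main__":
    print(archive_redirect("2007_08_01_archive.html"))
```

Checking the match before building the line means a stray file in the folder can’t produce a bogus redirect.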

Cleanup

What good blog migration would be complete without some cleanup? The export/import process mangled some of the special characters inserted by the various blog writing tools I’ve used, and it stripped the new-lines out of code examples. (Beware!) A quick visual skim and it was all fixed up.

Conclusion

So what do we take away from all this? A few things:

  • Cleanly migrating between blogging platforms sucks. It’s not straightforward, and “clean” is in the eye of the beholder.
  • If you ever want to migrate from Blogger to Wordpress, you’ve hopefully found some help here. At a higher (even more helpful) level, if you ever want to export from Blogger to BlogML, or import from BlogML to Wordpress, you’ve got some tools and guidance.


This entry was posted on Thursday, August 23rd, 2007 at 2:49 pm and is filed under blogging, powershell.

30 Responses to “Breaking Up: Moving Blog Engines”

  1. BlogML to WordPress Importer - Keyvan Nayyeri Says:

    [...] has done it again by writing a BlogML to WordPress importer tool with PHP.  He gives more details about this tool on his blog which is now powered by [...]

  2. mike Says:

    Nice job, Lerch. And nice theme too. I love the fluid sizing option.

  3. mike Says:

    By the way, instead of 301 redirection if you have access to the head tag in your blog you could use <meta http-equiv="refresh" content="0; url=your site">. That would just send you to the new page once the old page loaded.

  4. aaron Says:

    True, but then:
    a) the old page still has to load in entirety before the new page will load,
    b) it’s not permanent — I want to permanently redirect since the old pages don’t even exist anymore, and
    c) search engines won’t recognize that as a valid redirect

    Here’s a decent summary of redirect techniques:
    http://www.theinternetdigest.net/archive/301-redirects-seo.html

    And here’s another:
    http://www.bruceclay.com/blog/archives/2007/03/how_to_properly.html

  5. mike Says:

    True, but not everyone can get to their htaccess file too. I’m just suggesting another option.

  6. Roman Says:

    Hi
    Unfortunately I can’t download the import module for wordpress. Could you send it to me in an email or fix the download link?

  7. aaron Says:

    Whoops–that’s what I get for making some assumptions (and not testing them).

    It’s available as a .zip download now, sorry about that! The link has been fixed.

  8. DaveBost.com 3.0 Releases to Web (RTW) | Dave Bost Says:

    [...] daunting task and arguably the most important. Thanks to the BlogML project and Aaron Lerch and his WordPress BlogML plug-in, I saved a number of hours and anguish in having to write a data transformation script from [...]

  9. A Tale of Moving Blog Engines: Community Server to WordPress — Software by Rob Says:

    [...] step was the gold-mine find: I installed and ran Aaron Lerch’s WordPress BlogML importer, which as far as I can tell is the only one of its kind. I had to make some modifications to the [...]

  10. Jay R. Wren Says:

    How about the other way? Exporting BlogML from Wordpress?

    Does anyone know if this has been done? I’ll get started on it if it hasn’t been done yet.

  11. aaron Says:

    There is a Wordpress export available for download as part of the BlogML release on CodePlex, but I don’t know if it’s been updated to be compatible with the latest release of Wordpress.

    http://www.codeplex.com/BlogML

  12. parallelthinking » Blog Archive » Migrating from SingeUserBlog (SUB) to WordPress Says:

    [...] does not have native support for the BlogML format. After a bit of googling around I found this implementation of an import module for WordPress and [...]

  13. Subtext to WordPress: Converting blog engines | Jason Kemp .ca Says:

    [...] else is moving the other way. There was only one, Aaron Lerch, who went in my direction and he wrote an import plug in, which I didn’t use because it required all these extra downloads. [Um, re-reading [...]

  14. Made some changes: moved blogs | Ivan Porto Carrero Says:

    [...] I wasn’t looking forward to moving blogging engines because I don’t want my permalinks to change etc. The whole move was less painful than expected because somebody had already done the work and made a nice write-up. http://www.aaronlerch.com/blog/2007/08/23/breaking-up-moving-blog-engines/ [...]

  15. The Little-Known Secrets of Changing Blog Engines | John A Simpson Says:

    [...] Breaking Up: Moving Blog Engines [...]

  16. BlogML importer for Wordpress 2.5 « This part of my life is called: Being Myself Says:

    [...] format. So I set out to find an BlogML importer for WordPress 2.5. I stumbled upon Aaron’s post on importing to wordpress. So here is what needs to be [...]

  17. Moving to a new blog platform | SharePoint Use Cases Says:

    [...] all the contents from my previous blog platform (Community Server) to this one. Many thanks to Aaron for the BlogML > WordPress plugin. Tags: [...]

  18. Moving to a new blog platform - Sharepoint Use Cases Says:

    [...] all the contents from my previous blog platform (Community Server) to this one. Many thanks to Aaron for the BlogML > WordPress plugin. Published Aug 21 2008, 10:22 AM by toni Filed under: [...]

  19. www.copyandwaste.com » Hello, Goodbye! Subtext to Wordpress Says:

    [...] Downloaded and installed BlogML import plugin, instructions here [...]

  20. Migrating from DasBlog to WordPress Says:

    [...] data to WordPress: Aaron lerch had done all the hard work by creating the import module for WordPress for BlogML. I downloaded [...]

  21. nateirwin.net » Blog Archive » Migrating from BlogEngine.Net to WordPress Says:

    [...] then followed this post from Aaron Lerch’s blog to walk me through importing the content into WordPress. A couple of [...]

  22. Exporting BlogML from Subtext 2.1 and importing BlogML into Wordpress 2.7 « Tinderblog Says:

    [...] the import part – Thankfully Aaron Lerch had handled this back in WordPress 2.3 in his post about moving to Wordpress from Blogger.   The important thing for me was his BlogML [...]

  23. Vasanth Dharmaraj Says:

    Do you know if anyone has updated your BlogML import plugin for Wordpress 2.7

  24. Vasanth Dharmaraj Says:

    The changes made by Kavinda in comment # 20 worked for Wordpress 2.7 for importing a dasBlog BlogML.

  25. Steps to Migrate from dasBlog to Wordpress | Vasanth Dharmaraj's Blog Says:

    [...] importing the posts I used a plugin originally created by Aaron lerch and updated by Kavinda. This was a bit tricky. My Wordpress install allowed uploads of only 2MB [...]

  26. Wayne Says:

    Congratulations on your soon to be bundle of joy! You have fun raising those three now ya hear? haha

    Wanted to let you know that I have modified your BlogML import class for WordPress. I found that commentators URL’s were not being imported, thus all commentators would lose their links after I migrated from BlogEngine to WordPress.

    I also found that tags were not being imported also, so I added that functionality into the class as well.

    Was there any compelling reason those were left out? I found no problem including them and performing an import.

    The only problem left outstanding that I can see, other than the categories (which I’m ignoring since I’m creating a whole new category structure) is that a 4mb file fails due to a memory being exceeded error. The error occurs in the XPath class, but like the categories, I’m ignoring that issue too.

    To work around it, I simply broke the BlogML.xml file into 3 parts and uploaded 3 times, thus avoiding any memory issues.

    I wrote a post for the class, you can see it here http://www.waynejohn.com/post/2009/06/19/Updated-BlogML-Import-Class-for-WordPress.aspx

    Thanks for writing this, seems there isn’t any others out there.

    Cheers!

  27. hubertsvk Says:

    hi,
    i try export community server to blogml format and then import it to wordpress but i got errors …can anyone help me?

    PHP Warning: Invalid argument supplied for foreach() in H:\Home\WU_000278_896b2f24f90712902262a7662d5c22e2\Webs\hubka.net\wp\wp-admin\import\blogml.php on line 126
    PHP Warning: Invalid argument supplied for foreach() in H:\Home\WU_000278_896b2f24f90712902262a7662d5c22e2\Webs\hubka.net\wp\wp-admin\import\blogml.php on line 133
    PHP Warning: file_get_contents(H:\Home\WU_000278_896b2f24f90712902262a7662d5c22e2\Webs\hubka.net\wp/wp-content/uploads/) [function.file-get-contents]: failed to open stream: No such file or directory in H:\Home\WU_000278_896b2f24f90712902262a7662d5c22e2\Webs\hubka.net\wp\wp-admin\import\blogml.php on line 96

  28. Migrating from BlogEngine.NET to WordPress Says:

    [...] the wonderful BlogML import script for WordPress by Aaron Lerch (get it here) and FTP it up to your /wp-admin/import/ [...]

  29. Abandoning BlogEngine.NET :-( – Read Bits Says:

    [...] about everything but wash the dishes (and too bad, because I could use that plug-in). Fortunately, Aaron Lerch wrote a PHP plug-in for importing BlogML. Rob Walling posted his notes on Aaron’s [...]

  30. How to Convert from Community Server 2007 to Wordpress at Space Ninja Says:

    [...] Aaron Lerch wrote a great post in August 2007 about switching from Blogger to Wordpress using BlogML. I already knew from when we originally set it up that Community Server supports BlogML, and Aaron wrote a module for Wordpress to import from BlogML format. [...]
