Transition

If you can see this, I have successfully moved my blog from Blogger to a self-hosted WordPress instance.

This was not a trivial endeavor. I have had this blog since 2006, and there are links to it all over the net. I did not want to break those links, nor did I want them to point to an abandoned Blogger instance. In particular, I wanted readers to still be able to comment on old posts.

I had already set up Blogger to redirect maycontaintracesofbolts.blogspot.com to blog.des.no, so if I changed blog.des.no to point to my WordPress instance and made sure the slugs matched, most of the links should still work.

The first step was to configure WordPress to use the same style of permalinks as Blogger, i.e. year/month/slug.html. There was a catch, though: Blogger views posts as individual pages, whereas WordPress views them as directories, and is more comfortable with year/month/slug/ (note the final slash). Therefore, instead of trying to get WordPress to mimic Blogger more closely, I used the following mod_rewrite hack to convert Blogger-style URLs in incoming requests to WordPress-style URLs:

RewriteEngine On
RewriteRule ^/([0-9]{4}/[0-9]{2}/.*)\.html$ /$1/ [R,L]
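To see which requests the rule catches, here is a minimal Python sketch that applies the same regular expression outside Apache. The sample paths are made up for illustration, and the function only mimics the substitution; it does not reproduce the rest of mod_rewrite's behaviour.

```python
import re

# Same pattern as the RewriteRule above, translated to Python's re syntax.
BLOGGER_POST = re.compile(r"^/([0-9]{4}/[0-9]{2}/.*)\.html$")

def rewrite_post_url(path):
    """Return the WordPress-style redirect target, or None if the rule does not apply."""
    match = BLOGGER_POST.match(path)
    if match is None:
        return None
    return "/" + match.group(1) + "/"

# Hypothetical request paths, for illustration only.
for path in ("/2013/02/transition.html", "/2013/02/transition/", "/about.html"):
    print(path, "->", rewrite_post_url(path))
```

Only the first path is redirected; the other two are left alone because they lack either the .html suffix or the year/month prefix.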

Next, I used the Blogger Importer plugin to import posts and comments from Blogger. Several problems immediately arose.

First, Blogger Importer imported all my drafts as published posts. Luckily, this is a known bug with a simple fix.

Second, WordPress slugs and Blogger slugs are often but not always identical. They are both based on the title, but Blogger removes short words and cuts off at a certain length. Unfortunately, while Blogger Importer records the original Blogger slug as metadata for each post, it does not actually set the WordPress slug (post_name for those familiar with WordPress internals) when it imports posts. I therefore had to write a Perl script that scans the posts and metadata looking for Blogger slugs, then sets the WordPress slug to the correct value if it does not already match the Blogger slug.
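The Perl script itself is not shown here, so the following is only a sketch of the same idea in Python, working on in-memory data rather than the WordPress database. It assumes the importer's metadata holds the Blogger permalink path for each post; the function names and sample data are hypothetical.

```python
import posixpath

def slug_from_permalink(permalink):
    """Extract the slug from a Blogger permalink path such as /2013/02/some-post.html."""
    name = posixpath.basename(permalink)
    if name.endswith(".html"):
        name = name[: -len(".html")]
    return name

def plan_slug_updates(posts, blogger_permalinks):
    """Return {post_id: new_slug} for posts whose slug differs from the Blogger slug.

    posts              -- {post_id: current WordPress slug (post_name)}
    blogger_permalinks -- {post_id: Blogger permalink recorded by the importer}
    """
    updates = {}
    for post_id, permalink in blogger_permalinks.items():
        wanted = slug_from_permalink(permalink)
        if wanted and post_id in posts and posts[post_id] != wanted:
            updates[post_id] = wanted
    return updates

# Hypothetical sample data: post 1 was shortened by Blogger, post 2 already matches.
posts = {1: "a-very-long-title-about-the-transition", 2: "transition"}
permalinks = {1: "/2013/02/very-long-title-about.html", 2: "/2013/02/transition.html"}
print(plan_slug_updates(posts, permalinks))
```

Computing the list of changes separately from applying them makes it easy to review the result before touching post_name in the database.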

Finally, after I had imported posts and comments but before I was ready to throw the switch, new comments were posted on Blogger. In theory, Blogger Importer can import new posts and comments without touching existing ones. However, I had already gone through existing posts and added breaks, fixed image placement, and fixed a few conversion errors (mostly related to the use of angle brackets in posts), and Blogger Importer was extremely confused. I eventually realized that it did not recognize posts and comments it had already imported, because its duplicate checks were too strict. I worked around this by modifying import_posts() and import_comments() in blogger-importer.php so that the post_exists() and comment_exists() checks relied solely on the timestamp when comparing posts and comments. I also had to increase the MAX_EXECUTION_TIME parameter from 20 to 60 seconds.
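The plugin change is PHP and is not reproduced here; the Python sketch below only illustrates the principle of a timestamp-only duplicate check. The record layout (dictionaries with a "date" key) and the function name are assumptions made for the example.

```python
from datetime import datetime, timezone

def new_items(existing, incoming):
    """Return the incoming items whose timestamp matches no existing item.

    Only the timestamp is compared, so items whose text was edited after
    the first import are still recognised as already imported.
    """
    seen = {item["date"] for item in existing}
    return [item for item in incoming if item["date"] not in seen]

# Hypothetical data: the first comment was edited locally after import.
t1 = datetime(2013, 2, 10, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2013, 2, 12, 9, 30, tzinfo=timezone.utc)
existing = [{"date": t1, "text": "Nice post (edited)"}]
incoming = [{"date": t1, "text": "Nice post"}, {"date": t2, "text": "A new comment"}]
print([item["text"] for item in new_items(existing, incoming)])
```

The trade-off is that two distinct items with exactly the same timestamp would be treated as one, which is unlikely on a personal blog but worth keeping in mind.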

I had a lot of trouble finding a good WordPress theme for the site, and I’m still not really satisfied. I may end up switching to Delicate, which is just about as basic as it gets (short of Toolbox) yet manages not to stray on the wrong side of drab.

I hope you like the new site, or at least that you don’t hate it. Feel free to comment, and I’ll be happy to answer any technical questions regarding the transition and my setup. I’ll discuss my server configuration in more detail in a later post—if I get around to it…


Edit 2013-02-14: I added a couple of extra RewriteRules:

RewriteRule ^/search/label/(en|fr|no)/?$ /category/$1 [R,L]
RewriteRule ^/search/label/([^/]+)/?$ /tag/$1 [R,L]

These rewrite Blogger tag URLs to WordPress tag URLs, except for en, fr and no, which are categories (Blogger Importer imports Blogger tags as WordPress categories; I used the Categories to Tags Converter to convert most of them back to tags).
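The order of the two rules matters: the category rule has to come first, because the generic tag rule would otherwise match en, fr and no as well. A small Python sketch of that first-match behaviour, using made-up label names:

```python
import re

# The two label rules, in the same order; the first matching rule wins.
LABEL_RULES = [
    (re.compile(r"^/search/label/(en|fr|no)/?$"), r"/category/\1"),
    (re.compile(r"^/search/label/([^/]+)/?$"), r"/tag/\1"),
]

def rewrite_label_url(path):
    """Return the redirect target for a Blogger label URL, or None if no rule applies."""
    for pattern, target in LABEL_RULES:
        match = pattern.match(path)
        if match:
            return match.expand(target)
    return None

# Hypothetical label URLs, for illustration only.
for path in ("/search/label/en", "/search/label/freebsd/", "/search/label/english"):
    print(path, "->", rewrite_label_url(path))
```

Because the first pattern is anchored at both ends, a label that merely starts with en (such as the hypothetical "english") still falls through to the tag rule.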

I use Piwik analytics and have enabled 404 tracking, which should help me identify additional URLs that need to be rewritten.


Edit 2013-02-16: Additional RewriteRules, courtesy of SEO Ultimate’s 404 Monitor:

RewriteCond %{QUERY_STRING} ^m=1$
RewriteRule ^(.*)$ $1? [R,L]
RewriteRule ^/([0-9]{4})_([0-9]{2})_([0-9]+)_archive\.html$ /$1/$2/page/$3 [R,L]
RewriteRule ^/feeds/posts/default/-/(en|fr|no)/?$ /category/$1/feed/ [R,L]
RewriteRule ^/feeds/posts/default/-/([^/]+)/?$ /tag/$1/feed/ [R,L]
RewriteRule ^/feeds/posts/default/?$ /feed/ [R,L]

The first two lines strip a query string that shows up a lot—I’m not sure what it means to Blogger. The third line rewrites archive page URLs. The final three rewrite RSS feed URLs, including feeds for categories and tags.
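As a rough way to sanity-check these rules without a running server, the Python sketch below mimics them for a handful of made-up URLs. It is an approximation: it handles one redirect per call, only knows the rules from this edit, and escapes the dot before html in the archive pattern.

```python
import re
from urllib.parse import urlsplit

# Path-based rules from this edit, in the same order.
PATH_RULES = [
    (re.compile(r"^/([0-9]{4})_([0-9]{2})_([0-9]+)_archive\.html$"), r"/\1/\2/page/\3"),
    (re.compile(r"^/feeds/posts/default/-/(en|fr|no)/?$"), r"/category/\1/feed/"),
    (re.compile(r"^/feeds/posts/default/-/([^/]+)/?$"), r"/tag/\1/feed/"),
    (re.compile(r"^/feeds/posts/default/?$"), r"/feed/"),
]

def redirect_target(url):
    """Return the target of the first applicable redirect, or None if none applies."""
    parts = urlsplit(url)
    # RewriteCond + first RewriteRule: drop a query string that is exactly m=1.
    if parts.query == "m=1":
        return parts.path
    for pattern, target in PATH_RULES:
        match = pattern.match(parts.path)
        if match:
            return match.expand(target)
    return None

# Hypothetical request URLs, for illustration only.
for url in ("/2013/02/transition.html?m=1", "/2013_02_01_archive.html",
            "/feeds/posts/default/-/no", "/feeds/posts/default"):
    print(url, "->", redirect_target(url))
```

A request that carries the m=1 query string is first redirected to the bare path; the follow-up request is then handled by whichever other rule matches that path.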

3 thoughts on “Transition”

  1. Hi DES, hope you’re well.

    One question/request for you…

    Before your transition, I would read your posts just from the RSS feed and only click through if I wanted to comment (which I think I did a total of once).

    Since the transition to wordpress, the RSS feed only has the start of the post and then a “Read more” link.

    Would it be possible to turn on full articles in the RSS feed?

    Thanks in advance!

    1. I started adding breaks (including to existing posts) to keep the length of the front page down. As for the RSS feed, it’s actually set to “full text”. I don’t know why you only get the summary—I doubt Feedburner is rewriting the content, so I suspect it’s a WordPress bug.

    2. Actually, this post has a break after the second paragraph, but shows up unabbreviated in my Google Reader feed and on Planet NUUG, where I’m syndicated. Which RSS reader are you using?
