Migrate Custom Blog to Blogger

For the last ten years I have run this website from various systems. First it was on Wordpress, then Mambo, then Joomla, and since early 2006 it has been running on custom code written using Django. I used this site as a learning tool for Django, re-wrote it after gaining more knowledge of Django, and then re-wrote it again when Google released App Engine. However, I recently realised that for the last few years I have spent more time writing little features than actually writing. I have entire trips that I never wrote because I was too busy writing code.

This week it all changed. I did the unthinkable. I moved this website to Blogger.

After evaluating some of the features of blogger, i.e. custom domains, location storing, ability to filter on labels, custom HTML/CSS, great integration with Picasa, and their mobile app, I realised I could virtually replace everything I had previously custom made.

This post gives a technical description how to migrate a site running Django, but readily applies to any blog running on custom code. I initially spent a fair bit of time trying to figure out how to convert my existing RSS feed into something Blogger could accept, but every solution required troubleshooting. I soon remembered why I love Django so much, and that it would be trivial to generate the correct XML for import.

  1. Create Blogger Template
    I wanted to keep my design, so I hacked it to support Blogger. Take one of the existing templates, edit the HTML, and adjust it for your design. If you’ve worked with templates before this shouldn’t be too difficult.
  2. Generate Sample XML
    The first step was to generate a sample XML file from Blogger to see what would be needed for import. Create a sample post with a unique name and a few labels, and location. In Blogger, go to Settings->Other and click Export Blog. The first 90% of the file will be for your template and other settings, but eventually you will find a section with entry elements in it. Copy this sample element out - this will become your template.
  3. Format Template
    Using the sample section from the blog export, format it so the view you will create populates it correctly. A note of caution: the template needs time in ISO 8601 format, you need the id element, and the location element needs coordinates if there is a name. It won’t import later if there is a name with no coordinates. My template looks like this:

feeds/rss.html

{%  load blog_extras %}
{% for entry in entries %}
    tag:blogger.com,1999:blog-1700991654357243752.post-{% generate_id %}
        {{ entry.publish_date|date:"Y-m-d" }}T10:30:00.000123
        {{ entry.publish_date|date:"Y-m-d" }}T10:30:00.000123
        {% for tag in entry.tags %}
            {% endfor %}

        {{ entry.title }}
        {{ entry.content }}

        

        Joe Bloggs
            https://plus.google.com/12345689843655881853
            [email protected] 
{% endfor %}

This isn’t really RSS, so if you are pedantic you can name it something else. You will notice I loaded some template tags in there (“blog_extras”). This is for generating the random number, as this is needed for the ID element.. Here’s the template tag.

blog_extras.py

# 'import random' at beginning of file
def generate_id():
    id = ""
    for x in xrange(1, 7):
        id = id + str(int(random.uniform(400, 900)))
    id = id + "8"
    return {'obj': id}
register.inclusion_tag('blog/generate_id.html')(generate_id)

/blog/generate_id.html

{{ obj }}
  1. Create Code To Populate View

This section should be easy if you have written your blog in Django. Simply populate the template, what I have shown as “rss.html” above

blog/views.py

def show_rss(self):
    q = Entry.all()
    q = q.filter("genre !=", "blog")
    entries = q.fetch(500)
    return render_to_response("feeds/rss.html", {
        'entries': entries,
        }, mimetype='text/plain')

I did a filter on the model to not include “blog” entries - these are my travel stories, and I exported them separately. Remember that this is all happening on App Engine, so you will need to adjust if using Django’s normal ORM.

  1. Download Entries

Visit the URL you mapped to the “show_rss” function in urls.py, it should generate your list of entries. Copy and paste those entries into the exported XML from Blogger where you took out the original entry element.

  1. Import Entries

Now go to Blogger and import your blog. With any luck you will have imported all your entries. You will probably need to do this a few times as you tweak the text. I had to remove some newlines from my original posts.

Optional Steps

  1. Create Redirect URLS
    Links in Blogger appear to only end in .html, which is a problem for links coming from Django. Luckily, Blogger includes the ability to add redirects. Go to Settings->Other-Search Preferences. You can then edit redirects there. I generated a list of my old URLs and combined that with a list of the new URLs. Hint: you can use Yahoo Pipes to extract a list of URLS from a RSS feed. If you open any of the links in Excel and split on forward slashes, remember that it will cut off leading zeros. Set that field to TEXT during import.

I decided not to create redirects for every entry, as I didn’t really have time, and it only probably matters if somebody links directly to that page. I opened Google Analytics and looked at the Search Engine Optimisation page and sorted it by the most used inbound links. After getting down to entries that only had 1 inbound request per month I stopped creating redirects.

  1. Host Stylesheets and Images Externally

Blogger won’t host host files, so you need to work around this problem. All my images are generally from Picasa, except very specific website related ones. I moved those to Amazon’s S3 and updated the links. I did the same with my CSS. You could probably store them in Google Storage, too.

  1. Create Filters on Labels

If you had any previous groupings you can still link to them using label searches (in my case I actually added the “genre” as a label). The syntax is “/search/label/labelname/”, as you can see in my howtos section link.

  1. Update Webmaster Tools

If your site is part of Google’s Webmaster Tools, you will want to login and take a look that things are OK. You will also probably want to update your sitemap (send Google your atom.xml feed).

Removing Unused ContentTypes

I’ve been cleaning up my personal blog a bit, and I noticed that my tagging system recently broke. I’ve investigated the cause, and it appears to be because I removed some apps but the contenttypes remained. This meant that whenever I tried calling a tag with a TaggedItem that had been deleted, I was getting this error:

'NoneType' object has no attribute '_meta'

The solution is to first list all app_labels for contenttypes, and then delete any not in use.

In [61]: from django.contrib.contenttypes.models import ContentType
 
In [62]: for ct in ContentType.objects.all(): print ct.app_label
   ....:
picasaweb
lifestream
readernaut
delicious
mapfeed
comments
...

I could then delete the unused contenttypes.

ct_list = ["delicious", "flickr", "photologue", "twitter"]
 
for ct_label in ct_list:
    for ct in ContentType.objects.filter(app_label=ct_label):
        ct.delete()
    

And no more errors! For more details take a look at David’s article.

Integrate imified into Django

I recently had the desire to send small updates to my so called lifestream page via XMPP/GTalk. I played around with Twisted Words and several other Python XMPP clients, but I didn’t really want to keep a daemon running if unnecessary. It turns out imified took a lot of the pain out of it. The steps for me were as follows:
Create an account with imified, and create a URL, e.g. /app/api/
We then configure the urls.conf

urlpatterns = patterns('',  
    (r'^app/api/$', bot_stream),
)

We then create the necessary views. So, in views.py:

from django.shortcuts import render_to_response
from django.http import HttpResponse
from lifestream.forms import *
from datetime import datetime
from time import time
 
def bot_stream(request):
    if request.method == 'POST':
        botkey = request.POST.get('botkey')
        username = request.POST.get('user')
        msg = request.POST.get('msg')
        network = request.POST.get('network')
    
    if username == "[email protected]" or network == "debugger":
        blob_obj = Blob(id=time(), body=msg, service_name="Mobile",
        link="http://www.kelvinism.com/about-me/", published=datetime.now())
        blob_obj.save()
        resp = "OK"
    else:
        resp = "Wrong username %s" % username
    else:
        resp = "No POST data"
    return HttpResponse(resp)

To complete this little example, you can see what I used for my models.py

class Blob(models.Model):
    id = models.CharField(max_length=255, primary_key=True)
    body = models.TextField(max_length = 1024, null = True, blank = True)
    service_name = models.CharField(max_length=50, null=True, blank=True)
    link = models.URLField(max_length=255, verify_exists=False, null=True, blank=True)
    published = models.DateTimeField(null=True, blank=True)
 
def __unicode__(self):
    return self.id
 
class Meta:
    ordering = ['-published']
    verbose_name = 'Blob'
    verbose_name_plural = 'Blobs'
 
def get_absolute_url(self):
    return "/about-me/"

It maybe isn’t super elegant, but it works just fine, and maybe can provide a hint if somebody else is contemplating using a homebuilt xmpp solution, or just pawning it off on IMified.

Using HTML in a Django form label

I recently had the need to add some HTML to the label for a form field using Django. The solution is pretty easy, except I didn’t see it written explicitly anywhere, and I missed the memo of the function I should be using.
My form first just had the HTML in the form label as so:

from django import forms
 
class AccountForm(forms.Form):
    name = forms.CharField(widget=forms.TextInput(), max_length=15, label='Your Name (<a href="//www.blogger.com/questions/whyname/" target="_blank">why</a>?')

However, when I displayed it, the form was autoescaped.

This is generally a good thing, except my form obviously didn’t display correctly. I tried autoescaping it in the template, but that didn’t work. To resolve this you’ll need to mark that individual label as safe. Thus:


from django.utils.safestring import mark_safe
from django import forms
 
class AccountForm(forms.Form):
    name = forms.CharField(widget=forms.TextInput(), max_length=15, label=mark_safe('Your Name (<a href="//www.blogger.com/questions/whyname/" target="_blank">why</a>?)'))
    

It will now display correctly:

In [1]: from myproject.forms import *
 
In [2]: form = AccountForm()
 
In [3]: form.as_ul()
Out[3]: u'
<li><label for="id_name">Your Name (<a href="//www.blogger.com/questions/whyname/" target="_blank">why</a>?):</label> <input id="id_name" maxlength="15" name="name" type="text"></li>
'

There’s maybe another easier way to do this, but this worked for me.

Ubuntu 10.04, Django and GAE - Part 1

I’ve started to get into Google’s App Engine again, and have started developing a simple product that I had a use for. The initial first draft was a quick 200 lines in webapp, and it worked great. However, I’m starting to find certain things quite cumbersome. I’m a huge fan of Django, and but also about keeping things as simple as possible, which is why I picked webapp to begin with.
I’m now considering making a swap to Django, but there are some development issues; namely, I’m using Ubuntu 10.04, Python 2.6, and Django 1.2. This setup presents several setbacks, as GAE has the requirement of Django 1.1 and Python 2.5. There are two solutions that I found: a) use virtualenv, which I’ve detailed, or b) chroot. This document will hopefully show how to configure a chroot environment of Ubuntu 9.10 and prepare it for Django on GAE. Using a jailed environment should allow you to edit your code with your normal IDE and VCS, but use Django 1.1 and Python 2.5.
First, I installed schroot and debootstrap.

$ sudo apt-get install schroot debootstrap

Second, I edited /etc/schroot/schroot.conf and added the following section to the end.

[karmic]
description=karmic
type=directory
location=/var/chroot/karmic
priority=3
users=kelvinn #your username goes here
groups=admin
root-groups=root
run-setup-scripts=true
run-exec-scripts=true

Third, I created the directories needed for the jailed environment and installed karmic.

$ sudo mkdir -p /var/chroot/karmic
$ sudo debootstrap --arch i386 karmic /var/chroot/karmic

Forth, I logged into the jailed environment and updated packages, installed Python 2.5 / Django 1.1. Make sure to note that I don’t call ‘python’, I call ‘python2.5’.

$ sudo schroot -c karmic
(karmic)root@kelvinn-laptop:~# apt-get update
(karmic)root@kelvinn-laptop:~# apt-get install python2.5
(karmic)root@kelvinn-laptop:~# cd /usr/src
(karmic)root@kelvinn-laptop:~# apt-get install wget
(karmic)root@kelvinn-laptop:/usr/src# wget http://www.djangoproject.com/download/1.1.2/tarball/
(karmic)root@kelvinn-laptop:/usr/src# tar -xpzf Django-1.1.2.tar.gz
(karmic)root@kelvinn-laptop:/usr/src/Django-1.1.2# python2.5 setup install
(karmic)root@kelvinn-laptop:/usr/src/Django-1.1.2# exit

Lastly, I log in as my normal user, and start the app. Let’s say I have a folder called ‘~/gaeapps’ for my GAE stuff, and that’s where I put the SDK.

$ scroot -c karmic
(karmic)kelvinn@kelvinn-laptop:~/gaeapps$ ls
google_appengine  myproject
(karmic)kelvinn@kelvinn-laptop:~/gaeapps$ google_appengine/dev_appserver.py myproject

Ubuntu 10.04, Django and GAE - Part 2

All my Django sites are running 1.2, which poses a conflict with writing apps for Google’s App Engine, as use_library currently only supports < Django 1.1. There are two solutions that I found: a) use virtualenv, or b) chroot, which I’ve already detailed. This document will hopefully show you how to create a virtual environment to use a secondary django version, especially for GAE. Of the two options, I think this one is a bit quicker, but there will likely be tradeoffs that a chroot environment can deal with better, e.g. python imaging (I don’t use it for GAE).
First, install PIP and virtualenv:

kelvinn@kelvinn-laptop:~/workspace$ sudo easy_install -U pip
kelvinn@kelvinn-laptop:~/workspace$ sudo pip install -U virtualenv

Second, configure an environment for any app that will use Django 1.1:

kelvinn@kelvinn-laptop:~/workspace$ virtualenv --python=python2.5 --no-site-packages django-1.1
New python executable in django-1.1/bin/python
Installing setuptools............done.
kelvinn@kelvinn-laptop:~/workspace$ pip install -E django-1.1 yolk
kelvinn@kelvinn-laptop:~/workspace$ pip install -E django-1.1 Django==1.1

Now, download the python GAE sdk and put it in the django-1.1 folder. I also just dump any project directory requiring Django 1.1 into this django-1.1 folder, although I guess you could create a virtualenv for each project. The last thing to do is start the virtual environment, and run the GAE app.

kelvinn@kelvinn-laptop:~/workspace$ source django-1.1/bin/activate
(django-1.1)kelvinn@kelvinn-laptop:~/workspace$ yolk -l
(django-1.1)kelvinn@kelvinn-laptop:~/workspace$ cd django-1.1
(django-1.1)kelvinn@kelvinn-laptop:~/workspace/django-1.1$ ls
bin  google_appengine  include  lib  myproject1 myproject2
(django-1.1)kelvinn@kelvinn-laptop:~/workspace/django-1.1$ google_appengine/dev_appserver.py myproject1

When you’re all finished, you can jump out of virtualenv:

(django-1.1)kelvinn@kelvinn-laptop:~/workspace/django-1.1$ deactivate

Update: You’ll find this article especially interesting if you get an error such as the following:

UnacceptableVersionError: django 1.1 was requested, but 1.2.0.beta.1 is already in use

Remove Dead Tags

I’ve noticed my django-tagging install has been giving a lot of empty entries when doing a lookup on a tag. Tonight I finally got around to looking at what was causing this. This is surely not the best way to do this, but at 12:00am on a weekday, well, I shouldn’t be doing it in the first place… I first wanted to see what type of content was generating the error:

for item in TaggedItem.objects.all():
    try:
        print item.object
    except:
        print item.content_type_id

Now that I could see what was causing it (I had removed an app that used django-tagging, but it left the tags with empty pointers). Removing the empty tags was easy enough:

for item in TaggedItem.objects.all():
    try:
        print item.object
    except:
        item.delete()

No more hanging TaggedItems.

Alexa Thumbnail Service

Amazon offers some pretty cool services: S3, EC2, Alexa Site Thumbnail, and others. A while back I wanted to use AST with Django, so ended up writing the Python bindings to the REST API (they didn’t previously exist. I even wrote up a quick tutorial.

Update: Amazon no longer maintains AST. I’ve decided to archive a few of the old sites, so no longer need to take thumbnails. However, a few other thumbnail services seem to have crept up, including SnapCasa", and WebSnapr.

Using Django with SQL Server and IIS

As you can tell from reading some of the other pages, I like Linux and open source. But I also like to answer the question “what if…” This post is my [brief] run down of answering “what if I could run Django on Server 2003 with SQL Server and IIS.” Why, you may ask? To be honest with you, at this point, I don’t really know. One of the deciding factors was seeing that the django-mssql project maintains support for inspectdb, which means I could take a stock 2003 server running SQL Server, inspect the DB, and build a web app on top of it. The Django docs offer a lengthy howto for using Django with IIS and SQL Server, but the website for PyISAPIe seems to have been down for the last month or so. Without further delay, below are my notes on installing Django with SQL Server and IIS.

1a) Install python-2.x.x.msi from python.org

1b) Consider adding C:\Python25\ to your Path (right click My Computer, Advanced, Environment Variables. Enter in blahblahblah;C:\Python25\)

  1. Download a 1.0+ branch of Django (and 7-zip if you need it)

3a) Extract the contents of the Django. From inside Django-1.0, execute:

C:\Python25\python.exe setup.py install

3b) Consider adding C:\Python25\Script to your path.
4) Look in C:\Python25\Lib\site-packages – confirm there is a Django package.
5) Checkout django-mssql (http://code.google.com/p/django-mssql/), copy sqlserver_ado from inside source to the site-packages directory
6) Download and install PyWin32 from sf.net
7) Start a test project in C:\Inetpub\ called ’test'

c:\Python25\scripts\django-admin.py startproject test

8a) Create a database using SQL Management Studio, create a user. (First, go to the Security dropdown. Right click Logins, add a new user. Next, right click Databases, New Database. Enter in the name, and change the owner to the user you just created).

8b) Edit the settings.py and add ‘sqlserver_ado’ and add database credentials. Use the below example if your database comes up in the Studio as COMPUTERNAME\SQLEXPRESS (you are using SQLExpress).

import os
DATABASE_ENGINE = 'sqlserver_ado'           # 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
DATABASE_NAME = 'crmtest'             # Or path to database file if using sqlite3.
DATABASE_USER = 'crmtest'             # Not used with sqlite3.
DATABASE_PASSWORD = 'password'         # Not used with sqlite3.
DATABASE_MSSQL_REGEX = True
DATABASE_HOST =  os.environ['COMPUTERNAME'] + r'\SQLEXPRESS' # I use SQLEXPRESS
DATABASE_PORT = ''             # Set to empty string for default. Not used with sqlite3.
  1. Install/download FLUP: http://www.saddi.com/software/flup/dist/flup-1.0.1.tar.gz
python setup.py install

10a) Download pyisapi-scgi from http://code.google.com/p/pyisapi-scgi/

10b) Extract the files to somewhere you can remember on your computer, like, c:\scgi
11) Double click pyisapi_scgi.py
12a) Follow the directions here: http://code.google.com/p/pyisapi-scgi/wiki/howtoen – I set a temporary different port since I’m just testing this out.
12b) The last few parts might be better served with an image or two:

Using an app pool to get the right permissions

(No resource/photo)

The SCGI configuration file

(No resource/photo)

Properties of the web site

(No resource/photo)
13) Start the scgi process from the Django folder directory

python manage.py runfcgi method=threaded protocol=scgi port=3033 host=127.0.0.1
  1. Test your django page, http://192.168.12.34:8080

(No resource/photo)

And Yet Another Remodel

I have finally decided to do another remodel of this site. I had a few goals before starting:

  • Use one image
  • Use the YUI-CSS framework
  • Use Django
  • Make it easily extendable

So far, I think I’ve accomplished these goals. The site is easier to read, easy to modify, and has a few new features. More entries to come!