Tag Archives: django

Serving custom Django admin media in development

I’ve just discovered Django’s development server always serves admin media. This is tremendously useful because it means you don’t need to configure a static serve view in your project urls.py during development.

However, what bit me was that I wanted to use a customised set of admin media. I had configured a view for the ADMIN_MEDIA_URL path and was going batty trying to work out why Django was ignoring it. It used to be that as long as you had DEBUG = False in settings.py the development server did not try to help by serving the admin media automatically.

Changeset 6075 added a switch to the runserver command for over-riding the admin media directory.

python manage.py runserver --adminmedia /path/to/custom/media

That change was made more than two years ago. It is right there in the documentation. A little bit of magic that wasted fifteen minutes of my frantic schedule (except for the fact I do not have a frantic schedule).

Migrating a Filemaker database to Django

At work we have several Filemaker Pro databases. I have been slowly working through these, converting them to Web-based applications using the Django framework. My primary motive is to replace an overly-complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running Apache on FreeBSD.

At some point in the process of re-writing each database for use with Django I have needed to convert all the records from Filemaker to Django. There exist good Python libraries for talking to Filemaker but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.

In my experience Filemaker’s built-in XML publishing interface is too slow when you want to migrate tens of thousands of records. During development of a Django-based application I find I frequently need to re-import the records as the new database schema evolves – doing this by communicating with Filemaker is tedious when you want to re-import the data several times a day.

So my approach has been to export the data from Filemaker as XML using Filemaker’s FMPXMLRESULT format. The Filemaker databases at work are old (Filemaker 5.5) and perhaps things have improved in more recent versions but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better but even then it will emit invalid XML if the original data happens to contain funky characters.

So here is filemaker.py, a Python module for parsing an XML file produced by exporting to FMPXMLRESULT format from Filemaker.

To use it you create a sub-class of the FMPImporter class and over-ride the FMPImporter.import_node method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.

A trivial example:

import filemaker

class MyImporter(filemaker.FMPImporter):
    def import_node(self, node):
        node_dict = self.format_node(node)
        print node['RECORDID'], node_dict

importer = MyImporter(datefmt='%d/%m/%Y')
filemaker.importfile('/path/to/data.xml', importer=importer)

The FMPImporter.format_node method converts values to an appropriate Python type according to the Filemaker column type. Filemaker’s DATE and TIME types are converted to Python datetime.date and datetime.time instances respectively. NUMBER types are converted to Python float instances. Everything else is left as strings, but you can customize the conversion by over-riding the appropriate methods in your sub-class (see the source for the appropriate method names).

In the case of Filemaker DATE values you can pass the datefmt argument to your sub-class to specify the date format string. See Python’s time.strptime documentation for the complete list of the format specifiers.
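As a rough illustration of what a datefmt string does, here is a minimal sketch in plain standard-library Python. It is not taken from filemaker.py, and parse_fm_date is a made-up name used only for this example.

```python
import datetime
import time


def parse_fm_date(text, datefmt='%d/%m/%Y'):
    """Hypothetical helper: parse a date string using a strptime format."""
    parsed = time.strptime(text, datefmt)
    return datetime.date(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)


# The same calendar day written day-first and month-first.
print(parse_fm_date('04/06/2009'))                # 2009-06-04
print(parse_fm_date('06/04/2009', '%m/%d/%Y'))    # 2009-06-04
```

A string that does not fit the format raises ValueError, so a wrong datefmt tends to show up on the first row rather than silently producing bad dates.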

The code uses Python’s built-in SAX parser so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).
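To show why a SAX parser keeps memory flat, here is a minimal sketch of the idea. It is not the code in filemaker.py: the RowHandler class and the sample document are invented for illustration, and the layout (FIELD elements under METADATA, then ROW/COL/DATA elements) is my assumption about what an FMPXMLRESULT export looks like. It also assumes one DATA element per COL, so repeating fields are not handled.

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Calls on_row(dict) once per ROW element, holding one row at a time."""

    def __init__(self, on_row):
        xml.sax.ContentHandler.__init__(self)
        self.on_row = on_row
        self.fields = []    # column names, collected from FIELD elements
        self.row = None     # values of the ROW currently being read
        self.text = None    # text chunks of the DATA currently being read

    def startElement(self, name, attrs):
        if name == 'FIELD':
            self.fields.append(attrs['NAME'])
        elif name == 'ROW':
            self.row = []
        elif name == 'DATA':
            self.text = []

    def characters(self, content):
        # SAX may deliver one value in several chunks, so collect them.
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == 'DATA':
            self.row.append(''.join(self.text))
            self.text = None
        elif name == 'ROW':
            self.on_row(dict(zip(self.fields, self.row)))
            self.row = None


# Invented sample data in the assumed layout.
SAMPLE = b"""<FMPXMLRESULT>
<METADATA><FIELD NAME="Name" TYPE="TEXT"/><FIELD NAME="Born" TYPE="DATE"/></METADATA>
<RESULTSET><ROW RECORDID="1"><COL><DATA>Ada</DATA></COL><COL><DATA>10/12/1815</DATA></COL></ROW></RESULTSET>
</FMPXMLRESULT>"""

rows = []
xml.sax.parseString(SAMPLE, RowHandler(rows.append))
print(rows)
```

Because each row is handed to the callback and then dropped, the handler never holds more than the column names and one row, whatever the size of the file.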

Fortunately I haven’t had to deal with Filemaker’s repeating fields so I have no idea how the code works on repeating fields. Please let me know if it works for you. Or not.

Download filemaker.py. This code is released under a 2-clause BSD license.

Outputting Excel with Django

xlwt is an excellent Python module for generating Microsoft Excel documents (xlrd is its counterpart for consuming Excel documents). I use it in a Django Web application so a visitor can export her data as a spreadsheet.

Django’s documentation includes an example of how to export data in comma-separated values (CSV) format. CSV has the significant advantage of being a standard Python module as well as being a relatively simple and non-vendor specific format. However there are some disadvantages to using CSV:

  1. Values can only be stored as strings or numbers.
  2. Unicode text must be explicitly encoded as UTF-8.
  3. Users are often unfamiliar with the .csv file name extension – “What the hell do I do with this damn you?”

It would be unfriendly of me to expect a user to open a CSV file and then format a column of date strings as proper date values (especially when the user is almost certainly using Excel already). So I choose Excel format over CSV format.

Dates in Excel documents (97/2004 format) are actually stored as numbers. In order to have them appear as dates one must apply a date formatting. You do this by using xlwt.easyxf to create a suitable style instance and then pass that when writing the cell data.
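To make "stored as numbers" concrete, here is a small sketch of the arithmetic, assuming Excel's 1900 date system (the 1904 system used by some older Mac workbooks differs). The excel_serial function is a hypothetical name for illustration, and the code is only meant to show what ends up in the cell.

```python
from datetime import date, datetime

# Day zero of Excel's 1900 date system (assumed; the 1904 system differs).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial(value):
    """Return the serial number for a date or datetime on or after 1900-03-01."""
    if not isinstance(value, datetime):
        value = datetime(value.year, value.month, value.day)
    delta = value - EXCEL_EPOCH
    # Whole days are the date; the fraction of a day is the time.
    return delta.days + delta.seconds / 86400.0


print(excel_serial(date(2000, 1, 1)))              # 36526.0
print(excel_serial(datetime(2000, 1, 1, 12, 0)))   # 36526.5
```

Without a date format applied to the cell, Excel would simply display a number like 36526, which is why the style matters.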

A word of advice: do not instantiate style objects more than once! My initial approach created a new style whenever writing a date/time value. Only once I was testing with more than a few dozen rows did I discover that Excel will grow grumpy and complain about too many fonts being open when trying to display the spreadsheet. The correct approach is to have one instance for each different style and then re-use that instance for the appropriate type of value.

Here is an example that writes all objects of one class to a spreadsheet and sends that file to the client’s browser. You could stuff this in a Django view method.

from datetime import datetime, date
from django.http import HttpResponse
from myproject.myapp.models import MyModel
import xlwt


# Create each style once at module level and re-use it for every cell.
default_style = xlwt.Style.default_style
datetime_style = xlwt.easyxf(num_format_str='dd/mm/yyyy hh:mm')
date_style = xlwt.easyxf(num_format_str='dd/mm/yyyy')


# export_xls is just an illustrative name for the view.
def export_xls(request):
    """Send every MyModel row to the browser as an Excel spreadsheet."""
    book = xlwt.Workbook(encoding='utf8')
    sheet = book.add_sheet('untitled')

    values_list = MyModel.objects.all().values_list()

    for row, rowdata in enumerate(values_list):
        for col, val in enumerate(rowdata):
            # Test datetime first: datetime is a sub-class of date.
            if isinstance(val, datetime):
                style = datetime_style
            elif isinstance(val, date):
                style = date_style
            else:
                style = default_style

            sheet.write(row, col, val, style=style)

    # HttpResponse is file-like, so the workbook can be saved straight into it.
    response = HttpResponse(mimetype='application/vnd.ms-excel')
    response['Content-Disposition'] = 'attachment; filename=example.xls'
    book.save(response)
    return response

That code works a peach with a 30,000 row / 25 column database, taking about a minute to generate a 13 megabyte file on my lowly iMac G5.

You want to buy me a new Intel iMac, don’t you? Yes, you do.

Django and time zone-aware date fields (redux)

Previously on 24…

I posted a module for handling time zone-aware datetime objects, but I left out all the hassle of dealing with form input. Here is a more complete Python package for Django that includes a form field sub-class that can handle a small set of datetime string formats that include a time zone offset.

Timezones 0.1

This code is released under Django’s BSD license.

Grep template tag for Django

A nice and easy pair of Django template filters that select lines of text using a regular expression. I had a block of text where I wanted to remove some of the lines but not others.

You can use grep to remove any lines that do not match your pattern:

>>> s = 'The quick brown fox'
>>> grep(s, 'quick')
u'The quick brown fox'

And its converse grepv to remove any lines that do match your pattern:

>>> s = 'The quick brown fox'
>>> grepv(s, 'quick')
''
>>> s2 = s + '\nJumps over the lazy dog'
>>> grepv(s2, 'quick')
u'Jumps over the lazy dog'

Stick it in a module in your Django application (documentation), then load it up at the top of a template.

from django import template
from django.template.defaultfilters import stringfilter
import re


register = template.Library()


@register.filter
@stringfilter
def grep(value, arg):
    """Lines that do not match the regular expression are removed."""
    pattern = re.compile(arg)
    lines = [line for line in re.split(r'[\r\n]', value) if pattern.search(line)]
    return '\n'.join(lines)


@register.filter
@stringfilter
def grepv(value, arg):
    """Lines that match the regular expression are removed."""
    pattern = re.compile(arg)
    lines = [line for line in re.split(r'[\r\n]', value) if not pattern.search(line)]
    return '\n'.join(lines)
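One caveat about splitting on [\r\n]: text with Windows line endings yields an empty string between each \r and \n, and grepv keeps those empty strings as blank lines. Below is a minimal plain-Python sketch of a variant that uses str.splitlines() instead. The name grepv_lines is made up for this example, and the Django decorators are left out so that it runs on its own.

```python
import re


def grepv_lines(value, arg):
    """Drop lines matching arg; a plain-Python variant using splitlines()."""
    pattern = re.compile(arg)
    kept = [line for line in value.splitlines() if not pattern.search(line)]
    return '\n'.join(kept)


text = 'The quick brown fox\r\nJumps over the lazy dog'
print(repr(grepv_lines(text, 'quick')))    # 'Jumps over the lazy dog'

# Splitting on [\r\n] leaves an empty string between \r and \n:
print(re.split(r'[\r\n]', text))
# ['The quick brown fox', '', 'Jumps over the lazy dog']
```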

Django and time zone-aware date fields

Django makes it inordinately complicated to support time zone-aware dates and times because it has so far simply ignored the problem (so far being Django 1.0.2).

This is understandable given the database-agnostic nature of the Django ORM: although PostgreSQL 8.3 supports a datetime type which is time zone-aware, MySQL 5.1 does not (I have no idea what SQLite does about time zones). By ignoring time zones, Django works with the lowest common denominator.

Given time zone support in Postgres, there is a chunk of work to write a variation of models.DateTimeField which can handle time zone-wise datetimes. Python 2.5 does not help things – Python’s native datetime module is similarly agnostic about time zones, and the standard library does not include a module for handling wise datetimes.

(If regular datetime instances are naive then datetime instances that honour time zones are wise.)

Django does make it pretty easy to write a custom field class, which means it shouldn’t be too difficult to write a custom datetime field class that is time zone-wise. As ever it is the Django project’s regard for documentation that transforms that which is possible into that which is practical.

Given your backend database has a time zone-wise datetime type (i.e. PostgreSQL), what input values does one need to handle in a time zone-wise custom field class?

  • value set to None
  • value set to a naive datetime instance
  • value set to a wise datetime instance
  • value set to a naive datetime string
  • value set to a wise datetime string

Now the essence of a custom field in Django is two methods: to_python and get_db_prep_value. If the custom field defines

__metaclass__ = models.SubfieldBase

then the to_python method will be called any time a value is assigned to the field, and we can make sure that a suitable type is returned before the model object is saved. Because Postgres supports time zone-wise datetimes, we can ignore get_db_prep_value as long as we take care to return a wise datetime instance.

When Django reads a record from the database it strips the time zone information, effectively giving your custom field a naive datetime string that belongs to the same time zone as the database connection object. (At least this seems to be true for Postgres and the psycopg2 adaptor.) And since the database connection sets the time zone to be the same as set by settings.TIME_ZONE your custom class needs to treat any naive datetime strings as belonging to the time zone set with settings.TIME_ZONE.

So this leads to the important behaviour for a time zone-wise DateTimeField sub-class: always convert naive datetimes to the time zone set in settings.TIME_ZONE.
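The rule can be sketched with nothing but the standard library of a current Python 3 (an assumption on my part, since this post targets Python 2.5). Here a fixed UTC+1 offset stands in for whatever zone settings.TIME_ZONE names, and make_wise is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for the zone named by settings.TIME_ZONE (assumed to be UTC+1).
PROJECT_TZ = timezone(timedelta(hours=1))


def make_wise(value, tz=PROJECT_TZ):
    """Attach tz to naive datetimes; leave None and wise datetimes untouched."""
    if value is None:
        return None
    if value.tzinfo is None or value.tzinfo.utcoffset(value) is None:
        # Fine for a fixed offset; a pytz zone needs tz.localize(value) instead.
        return value.replace(tzinfo=tz)
    return value


naive = datetime(2009, 6, 4, 12, 0)
print(make_wise(naive).isoformat())    # 2009-06-04T12:00:00+01:00
```

The clock reading does not change for a naive value; it only gains a zone. A value that is already wise passes through as it is.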

For convenience my custom field class, the TZDateTimeField, returns a sub-class of Python’s datetime which has an extra method that converts the datetime to the zone defined by the project’s time zone. Therefore whether the field value has been set from a naive or wise datetime instance, or a naive or wise date string you will end up with a time zone-wise value and you can get the value converted to the project’s time zone. This extra method is intended for use in a Django template.

What I was hoping was that the backend would store the datetime as a datetime in an arbitrary zone, potentially a different time zone from one record to the next for the same field. That behaviour would allow one to infer that one datetime value was created in this time zone while another datetime value was created in that time zone. Instead all datetime values are effectively normalized to your Django project’s time zone.

So here is an example of a model class that uses my time zone-aware datetime field. It ought to work just like a regular DateTimeField but always stores a time zone-aware datetime instance:

from django.db import models
from timezones.fields import TZDateTimeField
from datetime import datetime


class Article(models.Model):
    pub_date = TZDateTimeField(default=datetime.now)

And below is my custom field definition, which has a dependency on the pytz module to handle all the difficult stuff. You can grab the complete module over here, including tests in doctest format. The tests are intended to be run by Django’s manage.py test management command, and so one needs to add the module to the list of installed apps.

"""A time zone-aware DateTime field.

Naive datetime objects, whether assigned to the field or loaded from the
database, are assumed to belong to the local time zone (settings.TIME_ZONE).
Strings that carry a time zone offset are converted to UTC.

These field types require database support. MySQL 5 will not work.
"""
from datetime import datetime, tzinfo, timedelta
from django.conf import settings
from django.core.exceptions import ValidationError
from django.db import models
import pytz
import re


# 2009-06-04 12:00:00+01:00 or 2009-06-04 12:00:00 +0100
TZ_OFFSET = re.compile(r'^(.*?)\s?([-\+])(\d\d):?(\d\d)$')


class TZDatetime(datetime):
    def aslocaltimezone(self):
        """Returns the datetime in the local time zone."""
        tz = pytz.timezone(settings.TIME_ZONE)
        return self.astimezone(tz)


class TZDateTimeField(models.DateTimeField):
    """A DateTimeField that treats naive datetimes as local time zone."""
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        """Returns a time zone-aware datetime object.

        A naive datetime is assigned the time zone from settings.TIME_ZONE.
        This should be the same as the database session time zone.
        A wise datetime is left as-is. A string with a time zone offset is
        assigned to UTC.
        """
        try:
            value = super(TZDateTimeField, self).to_python(value)
        except ValidationError:
            match = TZ_OFFSET.search(value)
            if match:
                value, op, hours, minutes = match.groups()
                value = super(TZDateTimeField, self).to_python(value)
                value = value - timedelta(hours=int(op + hours), minutes=int(op + minutes))
                value = value.replace(tzinfo=pytz.utc)
            else:
                raise

        if value is None:
            return value

        # Only force zone if the datetime has no tzinfo
        if (value.tzinfo is None) or (value.tzinfo.utcoffset(value) is None):
            value = force_tz(value, settings.TIME_ZONE)
        return TZDatetime(value.year, value.month, value.day, value.hour,
            value.minute, value.second, value.microsecond, tzinfo=value.tzinfo)


def force_tz(obj, tz):
    """Converts a datetime to the given timezone.

    The tz argument can be an instance of tzinfo or a string such as
    'Europe/London' that will be passed to pytz.timezone. Naive datetimes are
    forced to the timezone. Wise datetimes are converted.
    """
    if not isinstance(tz, tzinfo):
        tz = pytz.timezone(tz)

    if (obj.tzinfo is None) or (obj.tzinfo.utcoffset(obj) is None):
        return tz.localize(obj)
    else:
        return obj.astimezone(tz)
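The offset arithmetic in to_python is easy to get backwards, so here is a worked example that runs without Django or pytz. It re-uses the TZ_OFFSET pattern from the module, but parse_with_offset is a hypothetical stand-in, and it assumes a current Python 3 where datetime.timezone is available.

```python
import re
from datetime import datetime, timedelta, timezone

# Same pattern as in the module above.
TZ_OFFSET = re.compile(r'^(.*?)\s?([-\+])(\d\d):?(\d\d)$')


def parse_with_offset(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' plus an offset and return it in UTC."""
    match = TZ_OFFSET.search(text)
    if match is None:
        raise ValueError('no time zone offset in %r' % text)
    stamp, op, hours, minutes = match.groups()
    local = datetime.strptime(stamp, '%Y-%m-%d %H:%M:%S')
    # The sign applies to both parts: '-05' and '-30' for an offset of -0530.
    offset = timedelta(hours=int(op + hours), minutes=int(op + minutes))
    # Local time minus its offset gives UTC.
    return (local - offset).replace(tzinfo=timezone.utc)


print(parse_with_offset('2009-06-04 12:00:00+01:00'))    # 2009-06-04 11:00:00+00:00
print(parse_with_offset('2009-06-04 12:00:00 -0530'))    # 2009-06-04 17:30:00+00:00
```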

Using plists for site-specific Django settings

I have a Django project that I am going to deploy at several sites, but I need to tweak the project settings slightly for each site. Specifically I need a different EMAIL_HOST address and related settings for sending mail at each site.

The simplest route is to customize the project settings.py as part of the site deployment, but that will drive you insane when you deploy the wrong custom-settings to a site.

Another approach is similar to that used by many for switching between settings when moving between testing / staging / live environments: your settings.py has a few lines something like

try:
    from sitesettings import *
except ImportError:
    pass

so you can over-ride any setting by putting them in a sitesettings.py file, and then make sure your deployment never overwrites that site-specific file.

In my case I want to make it easy for the site administrator to customize the settings, but I am worried that it is too easy for someone who does not know Python syntax to inadvertently break things by writing a sitesettings.py that throws a SyntaxError exception. Given the significance of white-space in Python I feel this would be easy to get wrong.

So I’ve gone for storing the custom settings in Mac OS X’s property list format. Bless Python for it has the plistlib module that reads and writes the simple XML format of property lists.

Here’s my module that imports all properties from a plist straight into the module’s namespace. This then makes it easy to over-ride Django’s settings by doing

from plistsettings import *

A couple bits made my lips move during the writing. The contents of __all__ are updated dynamically because I wanted to use this with from plistsettings import * without worrying that my module’s imports would get clobbered by imports used in the plistsettings module. And working out how to bind keys and values to the module itself is not obvious to me – it feels like one ought to be able to use self within the scope of the module to refer to the module itself. Except you can’t. No biggie.

# plistsettings.py
import os.path
import plistlib
import sys
from xml.parsers.expat import ExpatError


__all__ = []


PLIST_PATH = '/Library/Preferences/com.example.plist'


def read_prefs(plist_path):
    """Import settings from preference file into this module's global namespace.

    Returns a dictionary as returned by plistlib.readPlist(), or None if the
    file is missing or cannot be parsed.
    """
    try:
        if os.path.exists(plist_path):
            prefs = plistlib.readPlist(plist_path)
        else:
            return
    except ExpatError:
        return

    mod = sys.modules[__name__]
    global __all__

    for key, value in prefs.items():
        setattr(mod, key, value)
        __all__.append(key)
    return prefs

read_prefs(PLIST_PATH)

Now if you are the kind of Mac guy who enjoys using defaults you can write out your site-specific settings from the command-line.

defaults write /Library/Preferences/com.example EMAIL_HOST smtp.example.com
plutil -convert xml1 /Library/Preferences/com.example.plist

N.B. Mac OS X 10.5 defaults uses the binary format by default, so you need plutil to convert it back to XML because plistlib does not handle the binary format.
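For anyone on a current Python 3 (an assumption; the module above targets Python 2), plistlib.readPlist has been removed and plistlib.load / plistlib.loads detect the XML and binary formats by themselves, so the plutil step should no longer be needed. Here is a minimal sketch under that assumption. The load_plist_settings name is made up for this example, and a plain dict stands in for the settings module's namespace.

```python
import plistlib


def load_plist_settings(data, namespace):
    """Copy every key of a plist (XML or binary bytes) into namespace."""
    prefs = plistlib.loads(data)
    namespace.update(prefs)
    # Return the names so the caller can assign them to __all__.
    return sorted(prefs)


# Build a binary plist in memory to show that it is read directly.
binary = plistlib.dumps({'EMAIL_HOST': 'smtp.example.com', 'EMAIL_PORT': 25},
                        fmt=plistlib.FMT_BINARY)

settings = {}
names = load_plist_settings(binary, settings)
print(names)                     # ['EMAIL_HOST', 'EMAIL_PORT']
print(settings['EMAIL_HOST'])    # smtp.example.com
```

Inside a real settings module you would pass globals() as the namespace and read the bytes from the plist file.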

Q and operator.or_

I’ve finally settled on a nice syntax for OR-ing Django Q objects.

For a simple site search feature I needed to search for a term across several fields in a model. Suppose the model looks like this:

class BlogPost(models.Model):
    title = models.CharField(max_length=100)
    body = models.TextField()
    summary = models.TextField()

And you have a view method that accepts a parameter q for searching across the title, body and summary fields. I want to find objects that contain the q phrase in any of those fields. I need to build a QuerySet with a filter that is the equivalent of

queryset = BlogPost.objects.filter(
    Q(title__icontains=q) | Q(body__icontains=q) | Q(summary__icontains=q)
)

That’s not too much of a hassle for this simple example, but in cases where the fields you are searching are chosen dynamically, or where you just have an awful lot of fields to search against, I think it is nicer to do it like so:

import operator
from django.db.models import Q

search_fields = ('title', 'body', 'summary')
q_objects = [Q(**{field + '__icontains':q}) for field in search_fields]
queryset = BlogPost.objects.filter(reduce(operator.or_, q_objects))

Nice one! The list comprehension gives me a list of Q objects generated from the names in search_fields, so it is easy to change the fields to be searched. And using reduce and operator.or_ gives me the required OR filter in one line.

I see for Python 3 reduce has been moved to the functools module.
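If the reduce step looks like magic, this sketch shows what it does, with a tiny stand-in class in place of Q so it runs without Django. Term is invented for the example and only records how the pieces were combined.

```python
import operator
from functools import reduce    # the Python 3 home of reduce


class Term(object):
    """Stand-in for Q: only records how terms were OR-ed together."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Term('(%s | %s)' % (self.text, other.text))


search_fields = ('title', 'body', 'summary')
terms = [Term(field + '__icontains') for field in search_fields]

# reduce applies operator.or_ pairwise from the left: (a | b) | c
combined = reduce(operator.or_, terms)
print(combined.text)
# ((title__icontains | body__icontains) | summary__icontains)
```

One thing to watch for: reduce raises TypeError when given an empty list and no initial value, so an empty search_fields needs handling separately.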

This stuff never used to be that obvious to me. It kind of isn’t even now.

P.S. I promise I am not writing a blog engine at this time, it was just for the example.

reverse() chicken and egg problem

I wound up in a chicken and egg situation today using Django’s syndication framework and the reverse helper. The problem was that immediately after starting the development server, Django would throw a NoReverseMatch exception on the first client visit, followed by AttributeError on all subsequent visits.

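It all started so innocently… I had wanted a set of URLs for my application like this: a/ for the list of arrivals, d/ for the list of departures, and a/feed/ and d/feed/ for the matching feeds.

So I put the following in the application’s urls.py: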

# myapp/urls.py
from django.conf.urls.defaults import *
from views import arrivals_list, departures_list
from feeds import LatestArrivals, LatestDepartures


feed_dict = {'a': LatestArrivals, 'd': LatestDepartures}


urlpatterns = patterns('',
    (r'^a/$', arrivals_list, {}, 'arrivals'),
    (r'^d/$', departures_list, {}, 'departures'),
    (r'^(?P<url>[ad])/feed/$', 'django.contrib.syndication.views.feed', {'feed_dict':feed_dict}),
)

That covers my URL wishes, and because I have named the URL patterns I can use that name in templates with the {% url %} template tag and in Python code using the reverse helper.

So naturally the feed classes in feeds.py look like this:

# myapp/feeds.py
from django.contrib.syndication.feeds import Feed
from django.core.urlresolvers import reverse
from django.utils.feedgenerator import Atom1Feed
from models import Tx


class LatestArrivals(Feed):
    """Produces an Atom feed of recent arrival tickets."""
    feed_type = Atom1Feed
    title = 'Arrivals'
    link = reverse('arrivals')
    subtitle = 'Most recent arrivals'

    def items(self):
        return Tx.objects.arrivals()[:10]


class LatestDepartures(Feed):
    """Produces an Atom feed of recent departure tickets."""
    feed_type = Atom1Feed
    title = 'Departures'
    link = reverse('departures')
    subtitle = 'Most recent departures'

    def items(self):
        return Tx.objects.departures()[:10]

Note I used reverse on the link attribute of each class so that I can define the URL in one place, the urls.py module, and a change there will be reflected in the feed’s link too.

But this doesn’t work! When Django imports my urls.py module, it imports LatestDepartures and LatestArrivals, and they in turn use reverse to find the named URL patterns – except those names aren’t defined until after urlpatterns has been defined in urls.py so Django throws an exception and never imports my urls.py module.
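The timing issue is plain Python rather than anything Django-specific, and a toy version makes it visible. In this sketch ROUTES and reverse are invented stand-ins for the URL resolver, with KeyError playing the part of NoReverseMatch.

```python
ROUTES = {}    # stand-in for the resolver's table of named URL patterns


def reverse(name):
    """Toy reverse(): KeyError plays the part of NoReverseMatch."""
    return ROUTES[name]


try:
    class EagerFeed(object):
        link = reverse('arrivals')    # runs now, while the class body executes
except KeyError:
    print('lookup failed at import time')


class LazyFeed(object):
    def link(self):
        return reverse('arrivals')    # runs only when link() is called


ROUTES['arrivals'] = '/a/'            # the pattern is registered later
print(LazyFeed().link())              # /a/
```

A class attribute is evaluated as soon as the module is imported, whereas a method body waits until it is called. If the syndication framework accepts a method in place of the link attribute (I believe it does, but treat that as an assumption to check against your Django version), deferring the lookup that way would be another way out.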

You could work around this by defining your syndication feeds in an entirely different urls.py module. Alternatively you can split up urlpatterns within the same module and import the feed classes after their named URL patterns have been defined.

Here’s the working urls.py module:

from django.conf.urls.defaults import *
from views import arrivals_list, departures_list


urlpatterns = patterns('',
    (r'^a/$', arrivals_list, {}, 'arrivals'),
    (r'^d/$', departures_list, {}, 'departures'),
)


from feeds import LatestArrivals, LatestDepartures
feed_dict = {'a': LatestArrivals, 'd': LatestDepartures}


urlpatterns += patterns('',
    (r'^(?P<url>[ad])/feed/$', 'django.contrib.syndication.views.feed', {'feed_dict':feed_dict}),
)

Using an object for Django’s ChoiceField choices

I had another thought about per-instance choices for forms.ChoiceField. Instead of overriding the __init__ method of your form class, you could use an object with an __iter__ method that returns a fresh iterator each time it is called.

from django import forms


class LetterChoices(object):
    """Return a random list of max_choices letters of the alphabet."""
    def __init__(self, max_choices=3):
        self.max_choices = max_choices

    def __iter__(self):
        import string, random

        return iter((l, l) for l in random.sample(string.ascii_uppercase, self.max_choices))


class LetterForm(forms.Form):
    """Pick a letter from a small, random set."""
    letter = forms.ChoiceField(choices=LetterChoices())

I don’t know if I prefer that style to having a simple function – having to instantiate the class seems wrong to me, I’d much rather use any callable as the choices argument.
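One way to get closer to "any callable" is a small generic wrapper, sketched here in plain Python so that it runs without Django. CallableChoices and three_letters are hypothetical names made up for this example.

```python
import random
import string


class CallableChoices(object):
    """Hypothetical wrapper: each iteration calls func() for a fresh list."""

    def __init__(self, func):
        self.func = func

    def __iter__(self):
        return iter(self.func())


def three_letters():
    """Return three random (value, label) pairs."""
    return [(l, l) for l in random.sample(string.ascii_uppercase, 3)]


choices = CallableChoices(three_letters)
print(list(choices))    # three random pairs
print(list(choices))    # three more, usually different
```

You would then write forms.ChoiceField(choices=CallableChoices(three_letters)). Whether the choices really change for each form instance depends on how ChoiceField stores what it is given. If it copies the choices into a list when the field is created, the object is only iterated once, so test this against your Django version.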