Package installer wish

Mac OS X administrators frequently need to build installer packages to
help deploy and manage software on a network of Macs. The motive for
creating a package is one of:

* Packaging software that does not have a dedicated installer. This applies
to all the nice drag-and-drop applications like [Firefox][firefox] and
[Cyberduck][cyberduck].
* Packaging your own site’s software, whether that is as simple as printer
descriptions or as complex as a full-blown application.
* Re-packaging some miserable piece of shit installer that either totally
denies the harsh reality of Apple’s non-cross platform installer formats or
which manages to make such a balls of an install package that you were
better off before they bothered.
[Most everything by Adobe is in this category][adobeinstallers].

The first of those three is very common, and it ought to be easy to
create packages for existing installed applications.
[Apple’s PackageMaker][packagemaker] application provides a nice interface
for creating packages, but it has two drawbacks:

* PackageMaker is not installed by default on Mac OS X (it gets installed as
part of the Developer Tools).
* Before PackageMaker 3 (part of Xcode 3, which requires Mac OS X 10.5)
there was no *simple* method for quickly packaging an installed application.

What I want is a package creation tool that works on a 10.4 system without
requiring the developer tools and which can be scripted. The `packagemaker`
command-line tool requires an existing `Info.plist` file or `.pmdoc` file
if you want to set a custom default installation directory – not the end
of the world, but tedious.

Useful links
------------

* [Iceberg][iceberg] is an excellent graphical tool for building packages, by
Stéphane Sudre who also wrote up the…
* [PackageMaker how-to][pmhowto] which is a very useful introduction to the
details of `.pkg` files. Bit out-of-date these days.
* Man pages for the command-line [packagemaker][man1pm] and for the
[installer][installer] tools.
* Apple’s [software distribution documentation][softdist], which is very
quiet on the subject of custom installer plug-ins. Xcode has a template
project for an installer plugin, and the `InstallerPlugins.framework` headers
have lots of information.
* [Installer-dev mailing list][installerdev], where the people who wrote
the tools and documentation help out a lot.
* [JAMF Composer][composer] which is part of the Casper management tools.
Version 7 is no longer free.

[firefox]: http://www.mozilla.com/firefox/
[cyberduck]: http://cyberduck.ch/
[packagemaker]: http://developer.apple.com/DOCUMENTATION/DeveloperTools/Conceptual/PackageMakerUserGuide/index.html
[adobeinstallers]: http://blogs.adobe.com/OOBE/
[iceberg]: http://s.sudre.free.fr/Software/Iceberg.html
[pmhowto]: http://s.sudre.free.fr/Stuff/PackageMaker_Howto.html
[man1pm]: http://developer.apple.com/DOCUMENTATION/DARWIN/Reference/ManPages/man1/packagemaker.1.html
[installer]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man8/installer.8.html
[softdist]: http://developer.apple.com/documentation/DeveloperTools/Conceptual/SoftwareDistribution
[installerdev]: http://lists.apple.com/mailman/listinfo/installer-dev
[composer]: http://www.jamfsoftware.com/products/composer.php

Python features for a PHP refugee

These are things that particularly impressed me
when I decided I had had enough of [PHP][php] and I really ought to look at
[the crazy white-space language called Python][python] that was used by
[Plone][plone], [Trac][trac] and [Django][django].

The [Zen of Python][zen] states most of this in 19 lines, for all you
_tl;dr_ types.

Name-spaces and a minimal set of built-ins
------------------------------------------

I like that the set of keywords is small, and that the set of built-in functions
is not much larger. This leaves you with an unpolluted name-space (and if you
enjoy confusing people you can always override the built-ins).

Explicit versus implicit
------------------------

Related to name-spaces is the notion that Python is explicit: there is very
little magic in a Python script. Perhaps the closest thing to magic are the
various special methods that define a behaviour, for example the `__getattr__`
/ `__setattr__` / `__delattr__` methods on a class to control attribute access.
But even then Python makes it obvious those methods have special meaning by
using a double-underscore for the method names.

See [the page on the data model][special] in the Python documentation for
a description of these methods and their purpose.
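
To make the attribute hooks a little more concrete, here is a minimal sketch. The `Recorder` class and its behaviour are made up for illustration: it logs every attribute write and delete, and returns a placeholder string for any attribute that normal lookup cannot find.

```python
class Recorder(object):
    """Hypothetical example: log attribute writes, supply a default for misses."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log itself.
        object.__setattr__(self, 'log', [])

    def __getattr__(self, name):
        # Only called when the normal attribute lookup fails.
        return 'no such attribute: ' + name

    def __setattr__(self, name, value):
        self.log.append((name, value))
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        self.log.append((name, None))
        object.__delattr__(self, name)


r = Recorder()
r.colour = 'red'
del r.colour
```

After this runs, `r.log` holds one entry for the assignment and one for the deletion, and `r.colour` falls through to `__getattr__` again because the real attribute is gone.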

Generators and list comprehensions
----------------------------------

I never realized how much I missed these until I went back to PHP for a small
web project. So much of my code seems to be looping through lists and
accumulating a result or applying a function to each member of the list. I
don’t think the syntax is particularly obvious, but then I can’t think of a
better way to do it. At first glance generators looked to be the same as
list comprehensions, but eventually I began to understand the difference
between needing a finite list of objects ([list comprehension][listcomp])
and consuming a sequence of objects ([generators][generators]).
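
A small sketch of that difference, using made-up data. The list comprehension builds the whole list up front; the generator expression and the `naturals` generator function (a hypothetical name) hand out values one at a time, which is the only sane way to deal with an endless sequence.

```python
import itertools

numbers = [1, 2, 3, 4, 5]

# List comprehension: builds the whole (finite) list immediately.
squares = [n * n for n in numbers]

# Generator expression: same syntax in parentheses, values produced on demand.
lazy_squares = (n * n for n in numbers)
total = sum(lazy_squares)  # consuming it leaves the generator exhausted


def naturals():
    """Generator function: an endless sequence, only safe to consume lazily."""
    n = 1
    while True:
        yield n
        n += 1


# Take the first three even numbers without ever building an infinite list.
evens = (n for n in naturals() if n % 2 == 0)
first_three_even = list(itertools.islice(evens, 3))
```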

Named arguments for methods and functions
-----------------------------------------

Gosh, not having named arguments is painful. For the consumer of a function,
named arguments make it possible to forget the precise argument signature; for
the designer, they make it easy to provide sensible defaults and flexibility.
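
For example, with a hypothetical `connect` function (the name, parameters and defaults are invented for this sketch), the caller can skip everything that has a sensible default and name only what differs, in any order:

```python
def connect(host, port=5432, user='postgres', timeout=30):
    """Hypothetical function: only `host` is required."""
    return '%s@%s:%d (timeout %ds)' % (user, host, port, timeout)


# Rely on every default.
default = connect('db.example.com')

# Override two settings by name, without caring about their position.
custom = connect('db.example.com', timeout=5, user='david')
```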

Dates and times as a native type
--------------------------------

Well, not _native_, but readily available.

[Python’s `datetime` module][datetime] provides representations of calendar
dates and times and a bunch of obvious behaviour for comparing two moments.
PHP 5 introduced a proper DateTime class, but I had jumped ship a while
before then – my affection for Python’s date handling is born of a time
when one had to rely on PEAR for useful date functions. Converting everything
to seconds since the Unix epoch was never fun.
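
A quick sketch of what that obvious behaviour looks like, with made-up dates and no timezone information: moments compare directly, subtracting them gives a `timedelta`, and adding a `timedelta` gives another moment.

```python
from datetime import datetime, timedelta

posted = datetime(2009, 3, 1, 9, 30)
edited = datetime(2009, 3, 3, 17, 0)

is_later = edited > posted          # direct comparison of two moments
gap = edited - posted               # subtraction yields a timedelta
due = posted + timedelta(days=7)    # arithmetic yields another datetime
```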

The greatest annoyance in Python’s date implementation is its shrugging
support for timezones – you nearly always need to resort to a third-party
module ([such as pytz][pytz] or [python-dateutil][dateutil]) to handle
timezones without jeopardizing one’s sanity.

Batteries included
------------------

It is odd that one _does_ need an additional module to handle timezones
seeing as the Python standard library includes so many useful modules for
common tasks.

Need to work with [CSV files][csv]? Or [command-line arguments][getopt]?
Or [Mac OS X-style .plist][plistlib] files? Or configuration files in
[INI format][ini]? Or [tar archives][tar] (with gzip or bzip2 compression)?

Oh golly so much tedious work has been done for you in the Python standard
library. I suppose this reflects PHP’s emphasis as a scripting language for
the Web versus Python’s use as a general purpose language, but I am very
grateful for that distinction.
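
As a taste, here is a sketch that parses a CSV document and an INI-style configuration using nothing but the standard library. The sample data is invented, and the module names are the Python 3 ones (`configparser` was called `ConfigParser` on Python 2).

```python
import configparser
import csv
import io

# Invented sample data, held in memory so the sketch needs no files.
csv_text = "name,version\nFirefox,3.0\nCyberduck,3.1\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

ini_text = "[server]\nhost = example.com\nport = 8080\n"
config = configparser.ConfigParser()
config.read_string(ini_text)

host = config.get('server', 'host')
port = config.getint('server', 'port')  # converted to an int for you
```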

[zen]: http://www.python.org/dev/peps/pep-0020/
[special]: http://docs.python.org/reference/datamodel.html#special-method-names
[datetime]: http://docs.python.org/library/datetime.html
[pytz]: http://pytz.sourceforge.net/
[dateutil]: http://labix.org/python-dateutil
[csv]: http://docs.python.org/library/csv.html
[getopt]: http://docs.python.org/library/getopt.html
[plistlib]: http://docs.python.org/library/plistlib.html
[ini]: http://docs.python.org/library/configparser.html
[php]: http://www.php.net/
[plone]: http://plone.org/
[django]: http://www.djangoproject.com/
[trac]: http://trac.edgewall.org/
[python]: http://www.python.org/
[listcomp]: http://docs.python.org/tutorial/datastructures.html#list-comprehensions
[generators]: http://docs.python.org/tutorial/classes.html#generators
[tar]: http://docs.python.org/library/tarfile.html

Q and operator.or_

I’ve finally settled on a nice syntax for `OR`-ing [Django Q objects][q].

For a simple site search feature I needed to search for a term across several
fields in a model. Suppose the model looks like this:

    class BlogPost(models.Model):
        title = models.CharField(max_length=100)
        body = models.TextField()
        summary = models.TextField()

And you have a view method that accepts a parameter `q` for searching across
the `title`, `body` and `summary` fields. I want to find objects that contain
the `q` phrase in any of those fields. I need to build a [`QuerySet`][queryset]
with a filter that is the equivalent of

    queryset = BlogPost.objects.filter(
        Q(title__icontains=q) | Q(body__icontains=q) | Q(summary__icontains=q)
    )

That’s not too much of a hassle for this simple example, but in cases where
the fields you are searching are chosen dynamically, or where you just have
an awful lot of fields to search against, I think it is nicer to do it like so:

    import operator

    search_fields = ('title', 'body', 'summary')
    q_objects = [Q(**{field + '__icontains': q}) for field in search_fields]
    queryset = BlogPost.objects.filter(reduce(operator.or_, q_objects))

Nice one! The list comprehension gives me a list of `Q` objects generated from
the names in `search_fields`, so it is easy to change the fields to be searched.
And using [`reduce`][reduce] and [`operator.or_`][operator] gives me the
required `OR` filter in one line.
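
To see what `reduce` and `operator.or_` are doing without needing a database, here is a sketch with a hypothetical `Cond` class standing in for `Q`. It only records a description of the condition, but it supports `|` the same way, so the folding is visible in the result.

```python
import operator
from functools import reduce  # a builtin on Python 2, imported on Python 3


class Cond(object):
    """Hypothetical stand-in for Q: records a condition, supports `|`."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Cond('(%s OR %s)' % (self.text, other.text))


search_fields = ('title', 'body', 'summary')
conds = [Cond(field + '__icontains') for field in search_fields]

# reduce applies `a | b` pairwise from the left: ((title | body) | summary)
combined = reduce(operator.or_, conds)
```

One thing to watch: `reduce` raises `TypeError` when given an empty sequence and no initial value, so a dynamically built field list needs a guard for the empty case.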

I see for Python 3 `reduce` has been [moved to the `functools` module][functools].

This stuff never used to be that obvious to me. It kind of isn’t even now.

P.S. I promise I am not writing a blog engine at this time, it was just for
the example.

[q]: http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects
[queryset]: http://docs.djangoproject.com/en/dev/ref/models/querysets/
[operator]: http://docs.python.org/library/operator.html
[reduce]: http://docs.python.org/library/functions.html#reduce
[functools]: http://docs.python.org/library/functools.html

reverse() chicken and egg problem

I wound up in a chicken and egg situation today using [Django’s syndication
framework][syndication] and the [`reverse`][reverse] helper. The problem was
that immediately after starting the development server, Django would throw a
`NoReverseMatch` exception on the first client visit, followed by `AttributeError`
on all subsequent visits.

It all started so innocently… I had wanted a set of urls for my application
like this:

* `http://example.com/a/` (list view of arrivals)
* `http://example.com/d/` (list view of departures)
* `http://example.com/a/feed/` (syndication feed for arrivals)
* `http://example.com/d/feed/` (syndication feed for departures)

So I put the following in the application’s `urls.py`:

    # myapp/urls.py
    from django.conf.urls.defaults import *
    from views import arrivals_list, departures_list
    from feeds import LatestArrivals, LatestDepartures

    feed_dict = {'a': LatestArrivals, 'd': LatestDepartures}

    urlpatterns = patterns('',
        (r'^a/$', arrivals_list, {}, 'arrivals'),
        (r'^d/$', departures_list, {}, 'departures'),
        (r'^(?P<url>[ad])/feed/$', 'django.contrib.syndication.views.feed', {'feed_dict': feed_dict}),
    )

That covers my URL wishes, and because I have named the URL patterns I
can use that name in templates with the [`{% url %} template tag`][urltag]
and in Python code using the `reverse` helper.

So naturally the feed classes in `feeds.py` look like this:

    # myapp/feeds.py
    from django.contrib.syndication.feeds import Feed
    from django.core.urlresolvers import reverse
    from django.utils.feedgenerator import Atom1Feed
    from models import Tx

    class LatestArrivals(Feed):
        """Produces an Atom feed of recent arrival tickets."""
        feed_type = Atom1Feed
        title = 'Arrivals'
        link = reverse('arrivals')
        subtitle = 'Most recent arrivals'

        def items(self):
            return Tx.objects.arrivals()[:10]

    class LatestDepartures(Feed):
        """Produces an Atom feed of recent departure tickets."""
        feed_type = Atom1Feed
        title = 'Departures'
        link = reverse('departures')
        subtitle = 'Most recent departures'

        def items(self):
            return Tx.objects.departures()[:10]

Note I used `reverse` on the link attribute of each class so that I can
define the URL in one place, the `urls.py` module, and a change there will
be reflected in the feed’s link too.

But this doesn’t work! When Django imports my `urls.py` module, it imports
`LatestDepartures` and `LatestArrivals`, and they in turn use `reverse` to
find the named URL patterns – except those names aren’t defined until after
`urlpatterns` has been defined in `urls.py` *so Django throws an exception
and never imports my `urls.py` module*.

You could work around this by defining your syndication feeds in an
entirely different `urls.py` module. Alternatively, you can split up `urlpatterns`
within the same module and import the feed classes after their named URL
patterns have been defined.

Here’s the working `urls.py` module:

    from django.conf.urls.defaults import *
    from views import arrivals_list, departures_list

    urlpatterns = patterns('',
        (r'^a/$', arrivals_list, {}, 'arrivals'),
        (r'^d/$', departures_list, {}, 'departures'),
    )

    from feeds import LatestArrivals, LatestDepartures
    feed_dict = {'a': LatestArrivals, 'd': LatestDepartures}

    urlpatterns += patterns('',
        (r'^(?P<url>[ad])/feed/$', 'django.contrib.syndication.views.feed', {'feed_dict': feed_dict}),
    )
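
The underlying issue is that a class body runs at import time, so a class attribute is computed the moment the module is imported, whereas a method body only runs when it is called. Here is a sketch of that timing difference without Django: `registry` and this `reverse` are hypothetical stand-ins for the table of named URL patterns and the real helper.

```python
registry = {}  # hypothetical stand-in for Django's table of named URL patterns


def reverse(name):
    """Stand-in for Django's reverse(): fails if the name is not registered yet."""
    try:
        return registry[name]
    except KeyError:
        raise LookupError('no reverse match for %r' % name)


# Looked up while the class body runs, i.e. at import time: too early.
try:
    class EagerFeed(object):
        link = reverse('arrivals')
    eager_failed = False
except LookupError:
    eager_failed = True


class LazyFeed(object):
    # Looked up only when link() is called, after the patterns exist.
    def link(self):
        return reverse('arrivals')


registry['arrivals'] = '/a/'   # the equivalent of urlpatterns being defined
lazy_link = LazyFeed().link()
```

Whether a given Django version accepts `link` as a method on a `Feed` class is something to confirm in the syndication documentation before relying on this; the sketch only shows why deferring the lookup avoids the ordering problem.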

[syndication]: http://docs.djangoproject.com/en/dev/ref/contrib/syndication/
[reverse]: http://docs.djangoproject.com/en/dev/topics/http/urls/#reverse
[urltag]: http://docs.djangoproject.com/en/dev/ref/templates/builtins/#url

Using an object for Django’s ChoiceField choices

I had another thought about [per-instance choices for `forms.ChoiceField`][oldpost].
Instead of overriding the `__init__` method of your form class, you could use
[an object with an `__iter__` method][iter] that returns a fresh iterator each time
it is called.

    from django import forms

    class LetterChoices(object):
        """Return a random list of max_choices letters of the alphabet."""
        def __init__(self, max_choices=3):
            self.max_choices = max_choices

        def __iter__(self):
            import string, random

            return iter((l, l) for l in random.sample(string.ascii_uppercase, self.max_choices))

    class LetterForm(forms.Form):
        """Pick a letter from a small, random set."""
        letter = forms.ChoiceField(choices=LetterChoices())

I don’t know if I prefer that style to having a simple function – having to
instantiate the class seems wrong to me, I’d much rather use any callable as
the `choices` argument.
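
The reason the object helps at all is easier to see outside Django. In this sketch (plain Python, no forms involved) a generator can be walked exactly once, while an object with `__iter__` produces a new iterator, and here a new random sample, every time something iterates over it.

```python
import random
import string


def letter_choices(max_choices=3):
    """Generator version: yields the pairs once, then it is spent."""
    for l in random.sample(string.ascii_uppercase, max_choices):
        yield (l, l)


class LetterChoices(object):
    """Object version: every iteration starts from a fresh sample."""

    def __init__(self, max_choices=3):
        self.max_choices = max_choices

    def __iter__(self):
        letters = random.sample(string.ascii_uppercase, self.max_choices)
        return iter([(l, l) for l in letters])


gen = letter_choices()
first_pass = list(gen)
second_pass = list(gen)      # empty: the generator is already exhausted

choices = LetterChoices()
pass_one = list(choices)
pass_two = list(choices)     # __iter__ runs again, so this is a new sample
```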

[oldpost]: http://reliablybroken.com/b/2009/03/per-instance-choices-for-djangos-formschoicefield/
[iter]: http://python.org/doc/current/library/stdtypes.html#typeiter

Django test database runner as a context manager

In my last post I mentioned it might be an idea to [wrap up the Django test
database setup / teardown in a context manager][lastpost] for use with [Python’s
`with` statement][pythonwith]. Here’s my first stab, which seems to work.

    from contextlib import contextmanager

    @contextmanager
    def test_db_connection():
        """A context manager for Django's test runner.

        For Python 2.5 you will need
            from __future__ import with_statement
        """

        from django.conf import settings
        from django.test.utils import setup_test_environment, teardown_test_environment
        from django.db import connection

        setup_test_environment()

        settings.DEBUG = False
        verbosity = 0
        interactive = False

        old_name = settings.DATABASE_NAME
        connection.creation.create_test_db(verbosity, autoclobber=not interactive)

        try:
            yield connection
        finally:
            # Always drop the test database, even if the with-block raised
            connection.creation.destroy_test_db(old_name, verbosity)
            teardown_test_environment()

All of this requires Python 2.5 or later.

So with that snippet you could write a test something like so:

    import unittest

    class MyTestCase(unittest.TestCase):
        def test_myModelTest(self):
            with test_db_connection():
                from myproject.myapp.models import MyModel

                obj = MyModel()
                obj.save()
                self.assert_(obj.pk)

… and just as with Django’s `manage.py test` command the objects would be
created within the test database then destroyed when the
`with test_db_connection()` block is finished.
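
One detail of generator-based context managers matters a lot for tests: code after the `yield` only runs when the `with` block finishes normally, unless it sits inside a `finally` clause. A failing assertion is an exception like any other, so cleanup outside `finally` gets skipped. This sketch shows both behaviours, with a hypothetical `resource` manager standing in for the database setup and teardown.

```python
from contextlib import contextmanager

events = []  # records what happened, in order


@contextmanager
def resource(cleanup_in_finally):
    """Hypothetical stand-in for creating and destroying a test database."""
    events.append('setup')
    if cleanup_in_finally:
        try:
            yield
        finally:
            events.append('teardown')   # runs even when the block raises
    else:
        yield
        events.append('teardown')       # skipped when the block raises


for safe in (False, True):
    try:
        with resource(cleanup_in_finally=safe):
            raise ValueError('test failed')
    except ValueError:
        pass
```

Only the second pass records a teardown, which is the argument for wrapping the `yield` in `try` / `finally` whenever the cleanup must not be skipped.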

Everything’s going to be hunky dory.

[lastpost]: http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/
[pythonwith]: http://docs.python.org/reference/datamodel.html#context-managers

Creating a Django test database for unit testing

I needed to run tests involving a Django application but without using the
`manage.py test` management command. So I need my own test suite that
sets up the test database and drops it after, leaving my real database untouched.

As of Django 1.0.2 the default behaviour for the test runner is the [`run_tests`
function in `django.test.simple`][runtests]. Here are the bones of that function
with the required setup and teardown calls.

    from django.conf import settings
    from django.test.utils import setup_test_environment, teardown_test_environment

    verbosity = 1
    interactive = True

    setup_test_environment()
    settings.DEBUG = False
    old_name = settings.DATABASE_NAME

    from django.db import connection
    connection.creation.create_test_db(verbosity, autoclobber=not interactive)

    # Here you run tests using the test database and with mock SMTP objects

    connection.creation.destroy_test_db(old_name, verbosity)
    teardown_test_environment()

Hmmm… Wouldn’t this be a good candidate to be wrapped up for use with
[Python 2.5’s `with` statement][with]?

[runtests]: http://code.djangoproject.com/browser/django/tags/releases/1.0.2/django/test/simple.py#L102
[with]: http://docs.python.org/reference/datamodel.html#context-managers

sys.exit(1) versus SystemExit(1)

I used to write Python scripts and have the option parsing go something like this…

    if __name__ == "__main__":
        import sys, getopt

        opts, args = getopt.getopt(sys.argv[1:], 'h', ['help'])

        for opt, val in opts:
            if opt in ('-h', '--help'):
                usage(sys.argv)
                sys.exit(1)  # Exit to shell with non-zero result

And then I finally started writing tests for my code, at which point I decided
I needed to [`raise SystemExit(1)`][exception] rather than `sys.exit(1)` because I imagined
[Python’s unittest module][unittest] would get bypassed whenever my code called
`sys.exit(1)`.

Except of course I was wrong. [`sys.exit` throws `SystemExit` in turn][sysexit],
so it comes to the same thing from the point of view of `unittest`. Failing to read
documentation is a very bad habit.
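
Here is a sketch of a test that relies on exactly that. The `main` function is a hypothetical script entry point; because `sys.exit(1)` raises `SystemExit`, `assertRaises` can catch it and the test run carries on. (`assertRaises` as a context manager needs Python 2.7 or later.)

```python
import sys
import unittest


def main(argv):
    """Hypothetical script entry point that exits on --help."""
    if '--help' in argv:
        sys.exit(1)
    return 0


class ExitTest(unittest.TestCase):
    def test_help_exits(self):
        # sys.exit raises SystemExit, so the test can trap it like any exception.
        with self.assertRaises(SystemExit) as caught:
            main(['--help'])
        self.assertEqual(caught.exception.code, 1)

    def test_normal_run(self):
        self.assertEqual(main([]), 0)


suite = unittest.TestLoader().loadTestsFromTestCase(ExitTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```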

But I prefer throwing the exception myself. You don’t have to `import sys` if you
don’t need it, and *it feels prettier* (if I had more Python experience I might say *more Python-ic*).

I used to [smoke, drink and dance the hoochie-coo][saved] too.

[unittest]: http://docs.python.org/library/unittest.html
[sysexit]: http://docs.python.org/library/sys.html#sys.exit
[exception]: http://docs.python.org/library/exceptions.html#exceptions.SystemExit
[saved]: http://www.google.co.uk/search?q=lavern%20baker%20saved

Per-instance choices for Django’s forms.ChoiceField

I keep forgetting the details of how one customizes the choices in a `forms.ChoiceField` per instance. The [`forms.ChoiceField` documentation][choices] says the required argument has to be an iterable but then moves straight to the next section.

Fortunately this was covered long ago by [James Bennett][ubernostrum] when Django’s newforms module was introduced. See [__Getting dynamic model choices in newforms__ on Django Snippets][snippets].

The following form example has a field for picking a letter of the alphabet (works for [Django 1.0][django]). The choices are limited to 3 letters only, picked at random and different for each form instance:

    from django import forms

    def letter_choices(max_choices=3):
        """Return a random list of max_choices letters of the alphabet."""
        import string, random

        for l in random.sample(string.ascii_uppercase, max_choices):
            yield (l, l)

    class LetterForm(forms.Form):
        """Pick a letter from a small, random set."""
        letter = forms.ChoiceField(choices=letter_choices())

        def __init__(self, *args, **kwargs):
            super(LetterForm, self).__init__(*args, **kwargs)
            self.fields['letter'].choices = letter_choices()

So that works. The `LetterForm` class uses the helper function `letter_choices` to provide the random choices, which actually returns a generator object rather than a list or tuple of choice pairs. I am relying on Django’s base `ChoiceField` class [calling `list()` on the choices][fields_py] when each form is instantiated, so having `letter_choices` return a generator is hunky dory.

    >>> f1 = LetterForm()
    >>> f1['letter']
    >>> print f1['letter']

    >>> f2 = LetterForm()
    >>> print f2['letter']

    >>>

Now the only thing is… this example is not practical. Using genuinely random choices means that the valid choices on the form submitted by the user will be different to the valid choices on the form used to validate the user input on the next request, and this will likely raise a ValidationError.

*__Django feature suggestion__: allow choices to be any iterable or callable, calling it as appropriate when instantiating the field. If it is callable you could pass a function which returns an iterable at that point, which would save one having to write an `__init__` method for the form sub-class.*

[choices]: http://docs.djangoproject.com/en/dev/ref/forms/fields/#choicefield
[django]: http://www.djangoproject.com/
[fields_py]: http://code.djangoproject.com/browser/django/tags/releases/1.0.2/django/forms/fields.py#L634
[snippets]: http://www.djangosnippets.org/snippets/26/
[ubernostrum]: http://www.b-list.org/

First steps with South

My first time using a schema evolution tool for a [Django][1] project, and I like it. I chose [South][2] because it had a clear path for devolving a schema and had a database-independent API (I’ve been testing [Postgres][4] with some projects that currently use a [MySQL][3] database since we have a [Trac installation that uses Postgres][5] anyway).

I have a simple application to store serial numbers and purchasing information for software. It has a `License` model, but I wanted to add a comment on each license. Following the instructions for [converting an existing app][6] on the South wiki, here’s what I did to convert my software license application.

First I added `south` to `INSTALLED_APPS` in `settings.py` and ran `manage.py syncdb` to install the South application tables for the first time. This was before using South to actually manage any of the evolutions.

    % ./manage.py syncdb
    Syncing...
    Creating table south_migrationhistory

    Synced:
    > django.contrib.auth
    > django.contrib.contenttypes
    > django.contrib.sessions
    > django.contrib.sites
    > django.contrib.admin
    > myproject.software
    > south

    Not synced (use migrations):

    (use ./manage.py migrate to migrate these)

Then I created an initial migration for my application. This gives me a migration that matches the existing database schema.

    % ./manage.py startmigration software --initial
    Creating migrations directory at '/Users/david/myproject/../myproject/software/migrations'...
    Creating __init__.py in '/Users/david/myproject/../myproject/software/migrations'...
    Created 0001_initial.py.

The next step was to bring this application under South’s control. You have to pretend to apply the initial migration (because it already exists in the database), which is done using South’s `--fake` switch. I discovered a problem with my use of `_` in my `0001_initial.py` migration: the easiest fix was to import it explicitly within the migration (this fix was necessary for the second migration too).

    from south.db import db
    from django.db import models
    from myproject.software.models import *
    from django.utils.translation import gettext_lazy as _

Then I was ready to bring my application under South’s control.

    % ./manage.py migrate software 0001 --fake
    - Soft matched migration 0001 to 0001_initial.
    Running migrations for software:
    - Migrating forwards to 0001_initial.
    > software: 0001_initial
    (faked)

Now I was ready to alter my model definitions and get South to do the tedious work of updating the database schema. I added a new `Note` model in the application’s `models.py` and added a new field to the existing `License` model.

    class License(models.Model):
        """A license for a software title."""

        department = models.CharField(_("Department"), max_length=100, blank=True)

    class Note(models.Model):
        """A note for a license."""
        license = models.ForeignKey(License, editable=False)
        author = models.CharField(_("author"), max_length=100, editable=False)
        created = models.DateTimeField(auto_now_add=True)
        note = models.TextField(_("note text"))

        def __unicode__(self):
            return self.note

        class Meta:
            ordering = ['-created']

With those changes I used the `startmigration` command to generate a migration for the new model and fields.

    % ./manage.py startmigration software notes_department --model Note --add-field License.department
    Created 0002_notes_department.py.

(Afterwards I edited the migration to import `_` as noted above.)

The final step was to use South to apply this migration (and any others) to the database.

    % ./manage.py migrate software
    Running migrations for software:
    - Migrating forwards to 0002_notes_department.
    > software: 0002_notes_department
    = ALTER TABLE "software_license" ADD COLUMN "department" varchar(100) NOT NULL; []
    = CREATE TABLE "software_note" ("id" serial NOT NULL PRIMARY KEY, "license_id" integer NOT NULL, "author" varchar(100) NOT NULL, "created" timestamp with time zone NOT NULL, "note" text NOT NULL); []
    = ALTER TABLE "software_note" ADD CONSTRAINT "license_id_refs_id_61c4291d" FOREIGN KEY ("license_id") REFERENCES "software_license" ("id") DEFERRABLE INITIALLY DEFERRED; []
    = CREATE INDEX "software_note_license_id" ON "software_note" ("license_id"); []
    - Sending post_syncdb signal for software: ['Note']
    - Loading initial data for software.

Sweet!

[1]: http://www.djangoproject.com/
[2]: http://south.aeracode.org/
[3]: http://www.mysql.com/
[4]: http://www.postgresql.org/
[5]: http://trac.edgewall.org/wiki/DatabaseBackend
[6]: http://south.aeracode.org/wiki/ConvertingAnApp