Monthly Archives: November 2009

I am very bad at writing tests

… but I think I might be getting a little better.

At least these days when I am writing some script (almost certainly in Python) I start out by intending to write tests. I usually fail because I haven’t learnt to think in terms of writing code that can be easily tested.

Mark Pilgrim’s Dive Into Python has great stuff on how to approach a problem by defining the tests first and gradually filling in the code that satisfies the test suite. One day I may be able to work like that. Until then I work by writing a concise docstring, then stubbing out the function. Once the function is in a state where it might actually return a meaningful result I can play with it in the Python interpreter and start adding useful doctests to the docstring.

What really helps is to break the logic out into tiny pieces where, ideally, each piece simply returns the result of transforming its input (which I think is known as a functional approach). By doing this I can have tests for most of the code, and the functions with a lot of conditional logic, which are harder to write tests for, will at least rely on subroutines that are themselves well tested.
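As a rough illustration of that workflow (the function name and behaviour here are made up for the example, not taken from any real script), a tiny function that only transforms its input can carry its own doctest in the docstring:

```python
def normalise_name(name):
    """Return name with extra whitespace removed and each word capitalised.

    >>> normalise_name('  joe   bloggs ')
    'Joe Bloggs'
    """
    # split() with no arguments drops leading, trailing and repeated whitespace.
    return ' '.join(word.capitalize() for word in name.split())


if __name__ == '__main__':
    # Run every doctest found in this module; silent when they all pass.
    import doctest
    doctest.testmod()
```

Because the function has no side effects, the session typed into the interpreter can be pasted straight into the docstring and becomes the test.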

I can dream.

The hidden depths of Adobe CS4

Adobe’s installers and updaters for the Creative Suite are amazingly bad. The updaters actually create a hidden directory /Applications/.AdobePatchFiles and store what I assume are the old versions of the files that get updated. Almost a gigabyte of data on my system!

What the fuck is wrong with these guys?

I am not certain which is worse: that the updaters created a folder in /Applications that clearly belongs somewhere in /Library/Application Support (if it should exist at all), or that they made it hidden.

You can delete it.
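Before deleting, it may be worth checking how much space the directory really takes on your own machine. A minimal sketch, assuming nothing beyond the standard library (directory_size is a hypothetical helper name):

```python
import os


def directory_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # Skip symlinks so nothing outside the directory is counted.
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


# Example (path taken from the post above):
# directory_size('/Applications/.AdobePatchFiles') / (1024.0 * 1024.0)
```

The commented call at the end gives the size in megabytes.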

Crazy Acrobat installers love Python

Looking through the updaters for Adobe Acrobat 9 for Mac, I came across a bunch of scripts written in Python. My favourite was called FindAndKill.py:

#!/usr/bin/python
"""
    Search for and kill app. 
"""
import os, sys
import commands
import signal


def main():
    if len(sys.argv) != 2:
        print 'Missing or too many arguments.'
        print 'One argument and only one argument is required.'
        print 'Pass in the app name to find and kill (i.e. "Safari").'
        return 0

    psCmd = '/bin/ps -x -c | grep ' + sys.argv[1]
    st, output = commands.getstatusoutput( psCmd )

    if st == 0:
        appsToKill = output.split('\n')
        for app in appsToKill:
            parts = app.split()
            killCmd = 'kill -s 15 ' + parts[0]
            #print killCmd
            os.system( killCmd )

if __name__ == "__main__":
    main()

(You can download the Acrobat 9.1.3 update and find this script at Acrobat 9 Pro Patch.app/Contents/Resources/FindAndKill.py.)

Was the author not aware of the killall command for sending a signal to a named process? The killall man page says it appeared in FreeBSD 2.1, which was released in November 1995. Adobe CS4 was released about 14 years later. How is it that Adobe’s product managers approve these things for release?
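For comparison, here is a hedged sketch of what the same job might look like if the script simply handed the work to killall. The function names are invented for the example, and the /usr/bin/killall path is an assumption based on where it lives on Mac OS X:

```python
import subprocess
import sys


def killall_command(app_name, signal_name='TERM'):
    """Build the argument list for killall (no shell, so no quoting issues)."""
    return ['/usr/bin/killall', '-' + signal_name, app_name]


def main(argv):
    if len(argv) != 2:
        sys.stderr.write('usage: %s APP_NAME\n' % argv[0])
        return 2
    # killall exits non-zero when no matching process was found.
    return subprocess.call(killall_command(argv[1]))
```

To use it as a script you would call sys.exit(main(sys.argv)) at the bottom. Unlike piping ps through grep, killall matches the process name exactly by default, so asking for "Safari" should not also catch some other process whose name merely contains that string.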

What is particularly galling about Adobe’s Acrobat 9 updaters is that they seem to re-implement so much of what the Apple installer application does, even down to their use of gzipped cpio archives for the payload.

Migrating a Filemaker database to Django

At work we have several Filemaker Pro databases. I have been slowly working through these, converting them to Web-based applications using the Django framework. My primary motive is to replace an overly complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running Apache on FreeBSD.

At some point in the process of re-writing each database for use with Django I have needed to convert all the records from Filemaker to Django. There exist good Python libraries for talking to Filemaker but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.

In my experience Filemaker’s built-in XML publishing interface is too slow when you want to migrate tens of thousands of records. During development of a Django-based application I find I frequently need to re-import the records as the new database schema evolves – doing this by communicating with Filemaker is tedious when you want to re-import the data several times a day.

So my approach has been to export the data from Filemaker as XML using Filemaker’s FMPXMLRESULT format. The Filemaker databases at work are old (Filemaker 5.5) and perhaps things have improved in more recent versions but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better but even then it will emit invalid XML if the original data happens to contain funky characters.

So here is filemaker.py, a Python module for parsing an XML file produced by exporting to FMPXMLRESULT format from Filemaker.

To use it you create a subclass of the FMPImporter class and override the FMPImporter.import_node method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.

A trivial example:

import filemaker

class MyImporter(filemaker.FMPImporter):
    def import_node(self, node):
        node_dict = self.format_node(node)
        print node['RECORDID'], node_dict

importer = MyImporter(datefmt='%d/%m/%Y')
filemaker.importfile('/path/to/data.xml', importer=importer)

The FMPImporter.format_node method converts values to an appropriate Python type according to the Filemaker column type. Filemaker’s DATE and TIME types are converted to Python datetime.date and datetime.time instances respectively. NUMBER types are converted to Python float instances. Everything else is left as strings, but you can customize the conversion by overriding the appropriate methods in your subclass (see the source for the method names).

In the case of Filemaker DATE values you can pass the datefmt argument to your subclass to specify the date format string. See Python’s time.strptime documentation for the complete list of format specifiers.
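To show what a date format string does, here is a small standalone sketch. It is not the code from filemaker.py; parse_fm_date is a hypothetical helper, and returning None for an empty value is an assumption made for the example:

```python
import datetime
import time


def parse_fm_date(value, datefmt='%d/%m/%Y'):
    """Return a datetime.date for a date string, or None if it is empty."""
    if not value:
        return None
    # strptime raises ValueError if value does not match datefmt.
    parsed = time.strptime(value, datefmt)
    return datetime.date(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```

With the default format, '06/11/2009' is read as 6 November 2009. Passing datefmt='%m/%d/%Y' would read the same string as 11 June 2009, which is why the format has to match whatever the Filemaker export used.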

The code uses Python’s built-in SAX parser so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).
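The reason a SAX parser keeps memory flat is that it hands over one element at a time instead of building the whole tree. The sketch below illustrates the idea and is not the actual filemaker.py source. It assumes the usual FMPXMLRESULT layout (FIELD elements inside METADATA, then ROW, COL and DATA elements inside RESULTSET); RowHandler and SAMPLE are names invented for the example, and it keeps only the first DATA value in each COL, so repeating fields are ignored:

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Pass each ROW to a callback as a dict of column name to string."""

    def __init__(self, on_row):
        xml.sax.ContentHandler.__init__(self)
        self.on_row = on_row   # called once per completed row
        self.fields = []       # column names, in METADATA order
        self.values = None     # values collected for the current ROW
        self.col = None        # DATA values collected for the current COL
        self.text = None       # text chunks for the current DATA element

    def startElement(self, name, attrs):
        if name == 'FIELD':
            self.fields.append(attrs['NAME'])
        elif name == 'ROW':
            self.values = []
        elif name == 'COL':
            self.col = []
        elif name == 'DATA':
            self.text = []

    def characters(self, content):
        # Only keep text that sits inside a DATA element.
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == 'DATA':
            self.col.append(''.join(self.text))
            self.text = None
        elif name == 'COL':
            self.values.append(self.col[0] if self.col else '')
            self.col = None
        elif name == 'ROW':
            # The row is handed off here and then forgotten.
            self.on_row(dict(zip(self.fields, self.values)))
            self.values = None


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
  <METADATA>
    <FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="name" TYPE="TEXT"/>
    <FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="joined" TYPE="DATE"/>
  </METADATA>
  <RESULTSET FOUND="2">
    <ROW MODID="0" RECORDID="1"><COL><DATA>Joe</DATA></COL><COL><DATA>06/11/2009</DATA></COL></ROW>
    <ROW MODID="0" RECORDID="2"><COL><DATA>Ann</DATA></COL><COL><DATA></DATA></COL></ROW>
  </RESULTSET>
</FMPXMLRESULT>
"""

rows = []
xml.sax.parseString(SAMPLE, RowHandler(rows.append))
```

Here the callback just appends to a list so the result is easy to inspect. In a real import the callback would save each row and let it go, which is what keeps memory use constant.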

Fortunately I haven’t had to deal with Filemaker’s repeating fields, so I have no idea how the code handles them. Please let me know if it works for you. Or not.

Download filemaker.py. This code is released under a 2-clause BSD license.

Network users and Mac 10.5 archive and install

When upgrading a Mac from Mac OS X 10.4 (Tiger) to 10.5 (Leopard), remember that network accounts are not included if you do an archive and install and choose to migrate existing users. If a network account had its home folder at /Users/jbloggs then it will have been moved to /Previous Systems.localized/2009-11-06_0346/Users/jbloggs (although the date portion will be the date that you did your install).

This applies to network accounts which authenticate against Active Directory and do not have a mobile account.

Why my place of work used to set up Macs with the “create mobile account at login” option turned off is a mystery to me.