At work we have several [Filemaker Pro][fmp] databases. I have been slowly working through these, converting them to Web-based applications using [the Django framework][django]. My primary motive is to replace an overly-complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running [Apache][apache] on [FreeBSD][fbsd].
At some point in the process of re-writing each database for use with Django I have needed to convert all the records from Filemaker to Django. There exist good [Python][python] libraries for [talking to Filemaker][pyfmp] but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.
In my experience [Filemaker’s built-in XML publishing interface][fmpxml] is too slow for migrating tens of thousands of records. During development of a Django-based application I frequently need to re-import the records as the new database schema evolves, and doing that by talking to a running copy of Filemaker is tedious when it happens several times a day.
So my approach has been to export the data from Filemaker as XML using [Filemaker’s FMPXMLRESULT][fmpxmlresult] format. The Filemaker databases at work are _old_ (Filemaker 5.5), and perhaps things have improved in more recent versions, but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better, but even then it will emit invalid XML if the original data happens to contain funky characters.
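The post does not say how filemaker.py copes with those characters. As a rough sketch of one possible pre-processing step, assuming the offending characters are control characters that XML 1.0 forbids, an export could be cleaned before it reaches a parser. The name `strip_invalid_xml_chars` is hypothetical and is not part of filemaker.py.

```python
import re

# Assumption: the "funky characters" are control characters that XML 1.0
# does not allow. Tab (\x09), line feed (\x0a) and carriage return (\x0d)
# are legal and are left alone.
INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_invalid_xml_chars(text):
    """Return text with XML-illegal control characters removed."""
    return INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("Smith\x0bJones\tok\n")
```

Whether dropping the characters or replacing them with something visible is the better choice depends on the data.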
So here is [filemaker.py][dave], a Python module for parsing an XML file exported from Filemaker in FMPXMLRESULT format.
To use it you create a sub-class of the `FMPImporter` class and over-ride the `FMPImporter.import_node` method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.
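A trivial example:

    from filemaker import FMPImporter

    class MyImporter(FMPImporter):
        def import_node(self, node):
            # Convert the row node to a dict keyed by column name
            node_dict = self.format_node(node)
            print node['RECORDID'], node_dict

    importer = MyImporter(datefmt='%d/%m/%Y')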
The `FMPImporter.format_node` method converts values to an appropriate Python type according to the Filemaker column type. Filemaker’s `DATE` and `TIME` types are converted to Python [`datetime.date`][dtdate] and [`datetime.time`][dttime] instances respectively. `NUMBER` types are converted to Python `float` instances. Everything else is left as strings, but you can customize the conversion by over-riding the appropriate methods in your sub-class (see the source for the appropriate method names).
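To make the conversion rules concrete, here is a minimal standalone sketch of the same idea. It is not the code from filemaker.py: `convert_value`, its parameters and the `timefmt` default are hypothetical, and treating an empty string as `None` is my assumption rather than documented behaviour.

```python
import datetime


def convert_value(field_type, raw, datefmt="%d/%m/%Y", timefmt="%H:%M:%S"):
    """Convert a raw string according to a Filemaker column type (sketch)."""
    if raw == "":
        # Assumption: empty fields become None rather than raising an error.
        return None
    if field_type == "DATE":
        return datetime.datetime.strptime(raw, datefmt).date()
    if field_type == "TIME":
        return datetime.datetime.strptime(raw, timefmt).time()
    if field_type == "NUMBER":
        return float(raw)
    # Every other column type is left as a string.
    return raw


converted = [
    convert_value("DATE", "25/12/2007"),
    convert_value("TIME", "14:30:00"),
    convert_value("NUMBER", "3.5"),
    convert_value("TEXT", "hello"),
]
```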
In the case of Filemaker `DATE` values you can pass the `datefmt` argument to your sub-class to specify the date format string. See Python’s [time.strptime documentation][strptime] for the complete list of the format specifiers.
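The format string matters because the same exported text can be read as two different dates. The sketch below uses only the standard library to show the effect; `parse_date` is a hypothetical helper, not something filemaker.py provides.

```python
import datetime


def parse_date(raw, datefmt):
    """Parse a date string using a strptime-style format string."""
    return datetime.datetime.strptime(raw, datefmt).date()


raw = "03/04/2007"
day_first = parse_date(raw, "%d/%m/%Y")    # read as 3 April 2007
month_first = parse_date(raw, "%m/%d/%Y")  # read as 4 March 2007

# A value that does not match the format raises ValueError instead of
# being silently misread.
try:
    parse_date("2007-04-03", "%d/%m/%Y")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```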
The code uses [Python’s built-in SAX parser][pysax] so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).
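For readers who have not used SAX, here is a rough sketch of how a streaming handler can walk an FMPXMLRESULT document one row at a time without holding the whole file in memory. It is not the implementation in filemaker.py: the `RowHandler` class, its `on_row` callback and the sample data are all hypothetical, and the sketch keeps only the first `DATA` value of each column, so repeating fields are ignored.

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Collect each ROW into a dict and hand it to a callback (sketch)."""

    def __init__(self, on_row):
        xml.sax.ContentHandler.__init__(self)
        self.on_row = on_row
        self.fields = []       # column names, in METADATA order
        self.record_id = None
        self.row = None        # values for the ROW being read
        self.col = None        # DATA values for the COL being read
        self.chunks = None     # text pieces for the DATA being read

    def startElement(self, name, attrs):
        if name == "FIELD":
            self.fields.append(attrs["NAME"])
        elif name == "ROW":
            self.record_id = attrs["RECORDID"]
            self.row = []
        elif name == "COL":
            self.col = []
        elif name == "DATA":
            self.chunks = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces.
        if self.chunks is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == "DATA":
            self.col.append("".join(self.chunks))
            self.chunks = None
        elif name == "COL":
            # Assumption: only the first value of a column is kept.
            self.row.append(self.col[0] if self.col else "")
        elif name == "ROW":
            self.on_row(self.record_id, dict(zip(self.fields, self.row)))
            self.row = None


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<METADATA>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="name" TYPE="TEXT"/>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="joined" TYPE="DATE"/>
</METADATA>
<RESULTSET FOUND="2">
<ROW MODID="0" RECORDID="1"><COL><DATA>Alice</DATA></COL><COL><DATA>25/12/2007</DATA></COL></ROW>
<ROW MODID="3" RECORDID="2"><COL><DATA>Bob</DATA></COL><COL><DATA></DATA></COL></ROW>
</RESULTSET>
</FMPXMLRESULT>"""

rows = []
xml.sax.parseString(SAMPLE, RowHandler(lambda rid, data: rows.append((rid, data))))
```

Because each row is passed to the callback and then discarded, memory use stays roughly flat however many rows the file contains, which is the property the paragraph above describes.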
Fortunately I haven’t had to deal with Filemaker’s repeating fields, so I have no idea how the code behaves with them. Please let me know if it works for you. Or not.
[Download filemaker.py][dave]. This code is released under a 2-clause BSD license.