At work we have several Filemaker Pro databases. I have been slowly working through these, converting them to Web-based applications using the Django framework. My primary motive is to replace an overly complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running Apache on FreeBSD.
At some point in the process of rewriting each database for use with Django, I have needed to convert all the records from Filemaker to Django. There are good Python libraries for talking to Filemaker, but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.
In my experience Filemaker’s built-in XML publishing interface is too slow when you want to migrate tens of thousands of records. During development of a Django-based application I find I frequently need to re-import the records as the new database schema evolves – doing this by communicating with Filemaker is tedious when you want to re-import the data several times a day.
So my approach has been to export the data from Filemaker as XML using Filemaker’s FMPXMLRESULT format. The Filemaker databases at work are old (Filemaker 5.5), and perhaps things have improved in more recent versions, but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better, but even then it will emit invalid XML if the original data happens to contain funky characters.
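One way to cope with that kind of export is to clean the file before handing it to an XML parser. The sketch below is not taken from filemaker.py; it assumes the offending characters are control characters that XML 1.0 forbids (the post does not say which characters cause trouble) and that the export uses UTF-8 or another ASCII-compatible encoding. The function name is made up for illustration, and the code is written for Python 3.

```python
import re

# Bytes that XML 1.0 never allows in a document: the C0 control
# characters other than tab (0x09), line feed (0x0A) and carriage
# return (0x0D). Assumption: these are the characters causing trouble.
_INVALID_XML_BYTES = re.compile(rb'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def strip_invalid_xml_bytes(data):
    """Return a copy of the raw export with forbidden control bytes removed.

    Safe for UTF-8 because these byte values never occur inside a
    multi-byte UTF-8 sequence. Not suitable for UTF-16 exports.
    """
    return _INVALID_XML_BYTES.sub(b'', data)


cleaned = strip_invalid_xml_bytes(b'<DATA>a\x0bb\x00c\td\n</DATA>')
```

Stripping is the bluntest option; replacing a given control character with a space or a newline may suit some data better.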
So here is filemaker.py, a Python module for parsing an XML file produced by exporting to FMPXMLRESULT format from Filemaker.
To use it you create a sub-class of the FMPImporter class and over-ride the FMPImporter.import_node method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node with the format_node method to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.
A trivial example:
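from filemaker import FMPImporter

class MyImporter(FMPImporter):
    def import_node(self, node):
        # Convert the row's XML node to a dict of column name -> value
        node_dict = self.format_node(node)
        print node['RECORDID'], node_dict

importer = MyImporter(datefmt='%d/%m/%Y')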
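The last step, turning the row dictionary into a model object, mostly comes down to renaming columns. Here is a minimal sketch of that step; the column names, the field names, the to_model_kwargs helper and the Person model mentioned in the comment are all hypothetical and would need to match your own database and models.

```python
# Hypothetical mapping from Filemaker column names to model field names.
FIELD_MAP = {
    'First Name': 'first_name',
    'Last Name': 'last_name',
    'Date Joined': 'joined',
}


def to_model_kwargs(node_dict, field_map=FIELD_MAP):
    """Pick the mapped columns out of a row dictionary and rename them."""
    return dict(
        (attr, node_dict[column])
        for column, attr in field_map.items()
        if column in node_dict
    )


row = {'First Name': 'Alice', 'Last Name': 'Smith', 'Unused Layout Field': 'x'}
kwargs = to_model_kwargs(row)
# In a Django project, inside import_node, this could become
# Person(**kwargs).save(), where Person is a hypothetical model.
```

Keeping the mapping in one dictionary makes it easy to adjust as the new schema evolves between re-imports.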
The FMPImporter.format_node method converts values to an appropriate Python type according to the Filemaker column type. Filemaker’s DATE and TIME types are converted to Python datetime.date and datetime.time instances respectively. NUMBER types are converted to Python float instances. Everything else is left as strings, but you can customize the conversion by over-riding the appropriate methods in your sub-class (see the source for the appropriate method names).
In the case of Filemaker DATE values you can pass the datefmt argument when creating an instance of your sub-class to specify the date format string. See Python’s time.strptime documentation for the complete list of the format specifiers.
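To make the conversion rules concrete, here is a small standalone sketch of the same idea. It is not the code from filemaker.py: the convert_value function, the timefmt parameter and the choice to return None for empty values are my assumptions for illustration. It uses datetime.datetime.strptime, which accepts the same format specifiers as time.strptime.

```python
import datetime


def convert_value(field_type, text, datefmt='%d/%m/%Y', timefmt='%H:%M:%S'):
    """Convert one exported text value according to its Filemaker field type."""
    if text == '':
        return None  # assumption: empty fields become None
    if field_type == 'DATE':
        return datetime.datetime.strptime(text, datefmt).date()
    if field_type == 'TIME':
        return datetime.datetime.strptime(text, timefmt).time()
    if field_type == 'NUMBER':
        return float(text)
    return text  # TEXT and anything else stay as strings


joined = convert_value('DATE', '25/12/2007')
us_style = convert_value('DATE', '12/25/2007', datefmt='%m/%d/%Y')
started = convert_value('TIME', '14:30:00')
price = convert_value('NUMBER', '3.5')
```

Both date calls produce the same datetime.date; only the format string differs, which is why the date format needs to match whatever your Filemaker export uses.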
The code uses Python’s built-in SAX parser so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).
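For anyone curious why SAX keeps memory use flat, here is a minimal sketch of a streaming handler for the FMPXMLRESULT layout (FIELD elements in METADATA, then ROW elements containing COL and DATA). It is an illustration written for Python 3, not the code in filemaker.py; the RowHandler class, its callback and the sample data are made up, and only the first DATA element of each column is kept, so repeating fields are ignored.

```python
import io
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Stream FMPXMLRESULT rows to a callback, one row at a time."""

    def __init__(self, on_row):
        super().__init__()
        self.on_row = on_row   # called with (record_id, {column name: text})
        self.fields = []       # column names, in METADATA order
        self.record_id = None
        self.values = []       # one list of DATA strings per COL in the row
        self.chunks = None     # text pieces of the DATA element being read

    def startElement(self, name, attrs):
        if name == 'FIELD':
            self.fields.append(attrs['NAME'])
        elif name == 'ROW':
            self.record_id = attrs['RECORDID']
            self.values = []
        elif name == 'COL':
            self.values.append([])
        elif name == 'DATA':
            self.chunks = []

    def characters(self, content):
        # The parser may deliver one text node in several pieces.
        if self.chunks is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == 'DATA':
            self.values[-1].append(''.join(self.chunks))
            self.chunks = None
        elif name == 'ROW':
            # Only the current row is held in memory at this point.
            row = {field: (data[0] if data else '')
                   for field, data in zip(self.fields, self.values)}
            self.on_row(self.record_id, row)


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<ERRORCODE>0</ERRORCODE>
<METADATA>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="Name" TYPE="TEXT"/>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="Joined" TYPE="DATE"/>
</METADATA>
<RESULTSET FOUND="2">
<ROW MODID="0" RECORDID="1"><COL><DATA>Alice</DATA></COL><COL><DATA>25/12/2007</DATA></COL></ROW>
<ROW MODID="3" RECORDID="2"><COL><DATA>Bob</DATA></COL><COL><DATA></DATA></COL></ROW>
</RESULTSET>
</FMPXMLRESULT>
"""

rows = []
xml.sax.parse(io.BytesIO(SAMPLE), RowHandler(lambda rid, row: rows.append((rid, row))))
```

The sample collects rows in a list only so the result can be inspected; a real import would save each row inside the callback and let it go, which is what keeps memory use independent of the file size.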
Fortunately I haven’t had to deal with Filemaker’s repeating fields, so I have no idea whether the code works on repeating fields. Please let me know if it works for you. Or not.
Download filemaker.py. This code is released under a 2-clause BSD license.