Monthly Archives: April 2010

Split a file on any character in Python

I need to split a big text file on a certain character. I expect I am being thick about this, but split doesn’t quite do what I want because it includes the matching line, whereas I want to split right on the matching character.

My Python answer:

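def readlines(fileobj, endings, chunksize=4096):
    """Return a generator that yields pieces of an open file, splitting
    right after each occurrence of the given line ending.
    """
    # Start from an empty value of the same type as the separator, so the
    # function works for both text (str) and binary (bytes) files.
    line = endings[:0]
    while True:
        buf = fileobj.read(chunksize)
        if not buf:
            # End of file: yield whatever is left over, unless it is empty.
            if line:
                yield line
            break

        line = line + buf

        # Emit every complete piece currently held in the buffer.
        while endings in line:
            idx = line.index(endings) + len(endings)
            yield line[:idx]
            line = line[idx:]

if __name__ == "__main__":
    import sys, os

    FORMFEED = b'\x0c' # ASCII 12
    basename = os.path.basename(sys.argv[1])
    # Read and write bytes so the pieces are copied to disk unchanged.
    with open(sys.argv[1], 'rb') as infile:
        for num, data in enumerate(readlines(infile, endings=FORMFEED)):
            filename = basename + '-' + str(num)
            with open(filename, 'wb') as outfile:
                outfile.write(data)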

This is also useful when reading data exported from an old-fashioned Mac application such as FileMaker 5, where the line endings are ASCII 13 (carriage return) rather than ASCII 10 (line feed).
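If the goal is only to read CR-terminated lines as text, Python 3's built-in open() can do the splitting itself through its newline argument. The sketch below assumes Python 3 and plain ASCII data; the helper name read_cr_lines and the sample file contents are made up for illustration.

```python
import os
import tempfile


def read_cr_lines(path):
    """Return the lines of a text file whose lines end in ASCII 13 (CR)."""
    # newline='\r' makes open() treat only CR as a line terminator and
    # leave it untranslated at the end of each line.
    with open(path, newline='\r') as f:
        return list(f)


# Build a small sample file so the sketch runs on its own.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'one\rtwo\rthree')

lines = read_cr_lines(sample_path)
os.remove(sample_path)
print(lines)
```

This only covers the separators that open() accepts for newline, so splitting on something like a form feed still needs a generator like the one above.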

This post was inspired by Lotus Notes version 8.5, which is so advanced that to save a message in a file on disk you have to export it as structured text. And if you want to save a whole bunch of messages as individual files, you must forget that drag-and-drop was introduced with System 7; that would be too obvious.

Django AdminForm objects and templates

I can’t find documentation for the context of a Django admin template. In particular, where is the form and how does one access the fields? This post describes the template context for a generic admin model for Django 1.1.

Django uses an instance of ModelAdmin (defined in django.contrib.admin.options) to handle the request for a model object add / change view in the admin site. ModelAdmin.add_view and ModelAdmin.change_view are responsible for populating the template context when rendering the add object and change object pages respectively.

Here are the keys common to add and change views:

  • title, ‘Add ’ or ‘Change ’ + your model class’ _meta.verbose_name
  • adminform is an instance of AdminForm
  • is_popup, a boolean which is true when _popup is passed as a request parameter
  • media is an instance of django.forms.Media
  • inline_admin_formsets is a list of InlineAdminFormSet objects
  • errors is an instance of AdminErrorList
  • root_path is the root_path attribute of the AdminSite object
  • app_label is your model class’ _meta.app_label attribute

Django renders a form in the admin view by iterating over the adminform instance, which yields Fieldset objects; each Fieldset yields Fieldline objects, which in turn yield AdminField instances. All I want to do is lay out the form fields, ignoring the fieldset groupings that may or may not be defined in the model’s ModelAdmin.fieldsets attribute.

This turns out to be easy once you know how. The regular form is an attribute of the adminform object. So if your model has a field named “king_of_pop” you can refer to the form field in your template like so:

{{ adminform.form.king_of_pop.label_tag }}: {{ adminform.form.king_of_pop }}

Or, if you want to save your fingertips, you can use the with template tag:

{% with adminform.form as f %}
{{ f.king_of_pop.label_tag }}: {{ f.king_of_pop }}
{% endwith %}
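Why does adminform.form.king_of_pop work at all? When the template engine meets a dot it tries a dictionary-style lookup first and falls back to an attribute lookup, and Django forms hand out their bound fields through item access. The following is a simplified model of that behaviour in plain Python, not Django's actual code: StandInForm, StandInAdminForm and resolve are hypothetical names, and the real engine also handles method calls and list indexes.

```python
class StandInForm:
    """Hypothetical stand-in for a Django form; not the real class."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getitem__(self, name):
        # Item access returns the field registered under that name.
        return self._fields[name]


class StandInAdminForm:
    """Hypothetical stand-in for AdminForm, which keeps the form on .form."""

    def __init__(self, form):
        self.form = form


def resolve(obj, dotted_path):
    """Simplified model of a dotted lookup such as 'adminform.form.x'."""
    for bit in dotted_path.split('.'):
        try:
            obj = obj[bit]            # 1. dictionary-style lookup
        except (TypeError, KeyError, IndexError, AttributeError):
            obj = getattr(obj, bit)   # 2. attribute lookup
    return obj


form = StandInForm({'king_of_pop': '<input name="king_of_pop">'})
context = {'adminform': StandInAdminForm(form)}

widget = resolve(context, 'adminform.form.king_of_pop')
print(widget)
```

The first two steps of the path show both branches: context is a dict, so the item lookup succeeds, while the stand-in admin form has no item access, so the lookup falls through to the form attribute.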

Delving through the Django source while trying to understand all of this, I was struck by how Python defines hook methods for iteration and attribute access. Half of Python’s attraction is how easy it is, from the program author’s point of view, to treat objects like built-in types such as lists and dicts; the other half is the responsibility of a module’s author to enable that same ease of use by implementing the related protocols. It is harder to write a good Python module than to write a good Python program that uses a good module.
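As a rough illustration of the iteration hook, here is how nested __iter__ methods let a caller walk from a form down to its fields with ordinary for loops. The classes are hypothetical stand-ins loosely modelled on the admin helpers described above; they are not Django's implementation, and the field names are invented.

```python
class FieldSketch:
    """Hypothetical stand-in for an admin field: just remembers a name."""

    def __init__(self, name):
        self.name = name


class FieldlineSketch:
    """One row of a fieldset; iterating yields its fields."""

    def __init__(self, names):
        self.names = names

    def __iter__(self):
        for name in self.names:
            yield FieldSketch(name)


class FieldsetSketch:
    """A titled group of rows; iterating yields the rows."""

    def __init__(self, title, rows):
        self.title = title
        self.rows = rows

    def __iter__(self):
        for names in self.rows:
            yield FieldlineSketch(names)


class AdminFormSketch:
    """Top-level wrapper; iterating yields the fieldsets."""

    def __init__(self, fieldsets):
        self.fieldsets = fieldsets

    def __iter__(self):
        for title, rows in self.fieldsets:
            yield FieldsetSketch(title, rows)


adminform = AdminFormSketch([
    ('Identity', [['name', 'king_of_pop']]),
    ('Dates', [['birthday']]),
])

# Every level implements __iter__, so plain nested loops reach the fields.
field_names = [field.name
               for fieldset in adminform
               for line in fieldset
               for field in line]
print(field_names)
```

The caller never needs to know how each level stores its children; that knowledge stays with the module author, which is the extra work the paragraph above refers to.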