Tag Archives: python

Working with Active Directory FILETIME values in Python

[How To Convert a UNIX time_t to a Win32 FILETIME or SYSTEMTIME][kb]:

> Under Win32 platforms, file times are maintained primarily in the form of
> a 64-bit FILETIME structure, which represents the number of 100-nanosecond
> intervals since January 1, 1601 UTC (coordinate universal time).

***UPDATED* New version with fixes by Tim Williams for preserving microseconds. See [here for details][updated].**

It just so happens that [Microsoft Active Directory][msad] uses the same 64-bit value to store some time values. For example, [the `accountExpires` attribute][expires] is in this format. Linked below is a Python module with utility functions for converting between [Python’s datetime instances][datetime] and Microsoft’s FILETIME values.

Very handy if you enjoy querying Active Directory for login accounts that are due to expire. And who wouldn’t enjoy that? On a Monday.

[Download filetimes.py module for converting between FILETIME and `datetime` objects.][filetimes] This code is released under a 2-clause BSD license.

Example usage:

    >>> from datetime import datetime
    >>> from filetimes import filetime_to_dt, dt_to_filetime, utc
    >>> filetime_to_dt(116444736000000000)
    datetime.datetime(1970, 1, 1, 0, 0)
    >>> filetime_to_dt(128930364000000000)
    datetime.datetime(2009, 7, 25, 23, 0)
    >>> "%.0f" % dt_to_filetime(datetime(2009, 7, 25, 23, 0))
    '128930364000000000'
    >>> dt_to_filetime(datetime(1970, 1, 1, 0, 0, tzinfo=utc))
    116444736000000000L
    >>> dt_to_filetime(datetime(1970, 1, 1, 0, 0))
    116444736000000000L

I even remembered to write tests for once!
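
For anyone curious about the arithmetic, here is a minimal sketch of the conversion written for Python 3. It is not the code from `filetimes.py`: the function names `filetime_to_datetime` and `datetime_to_filetime` are made up for illustration, and naive datetimes are assumed to be UTC.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch (1970-01-01 UTC) expressed as a FILETIME value,
# i.e. the number of 100-nanosecond intervals since 1601-01-01 UTC.
EPOCH_AS_FILETIME = 116444736000000000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME integer to a timezone-aware UTC datetime."""
    # One microsecond is ten 100-nanosecond intervals.
    microseconds = (filetime - EPOCH_AS_FILETIME) // 10
    return UNIX_EPOCH + timedelta(microseconds=microseconds)


def datetime_to_filetime(moment):
    """Convert a datetime to a FILETIME integer (naive values assumed UTC)."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    # Dividing by a one-microsecond timedelta keeps the arithmetic in integers.
    microseconds = (moment - UNIX_EPOCH) // timedelta(microseconds=1)
    return EPOCH_AS_FILETIME + microseconds * 10
```

Sticking to integer arithmetic avoids the floating-point rounding that can lose microseconds.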

[kb]: http://support.microsoft.com/kb/167296
[msad]: http://www.microsoft.com/windowsserver2008/en/us/active-directory.aspx
[datetime]: http://docs.python.org/library/datetime.html
[expires]: http://msdn.microsoft.com/en-us/library/ms675098(VS.85).aspx
[filetimes]: /b/wp-content/filetimes.py
[updated]: http://reliablybroken.com/b/2011/09/free-software-ftw-updated-filetimes-py/

BBC iCalendar schedules

[Jon Udell][jonudell] recently [wrote about accessing the BBC programming schedules][post] but was put off by the lack of time zone information in the iCalendar feeds, which prompted me to fix the quick-and-dirty script I have that generates [iCalendar files for the BBC][guide]. (I wrote the first, time zone-blind version of my script in England’s Winter and it worked just perfick back then!)

So [I fixed it][fixed]. The updated iCalendar files have events with time zone information.

Everyone’s happy.

Jon Udell’s use of Python to explore data manipulation on the Web was one of the reasons I thought I really ought to get stuck into [Python][python].

[jonudell]: http://blog.jonudell.net/
[post]: http://blog.jonudell.net/2009/08/05/curation-meta-curation-and-live-net-radio/
[guide]: http://reliablybroken.com/guide/
[python]: http://www.python.org/
[fixed]: http://reliablybroken.com/guide/bbcguidetz.py

Tuples to dicts, toot sweet!

Looking through [Trac][trac]’s search internals I came across [a chunk where a list of tuples is converted to a list of dictionaries][searchmodule] for the convenience of the template engine. Each tuple has five fields: *href*, *title*, *date*, *author* and *excerpt*.

    for idx, result in enumerate(results):
        results[idx] = {'href': result[0], 'title': result[1],
                        'date': format_datetime(result[2]),
                        'author': result[3], 'excerpt': result[4]}

This allows the template author to use nice names for the fields in a row, like `${result.href}` etc. Looking at this reminded me of another approach that uses a [list comprehension][list], [`zip`][zip] and [`dict`][dict].

    keys = ('href', 'title', 'date', 'author', 'excerpt')
    results = [dict(zip(keys, row)) for row in results]
    for row in results:
        row['date'] = format_datetime(row['date'])

The second line in this snippet is where the list of dictionaries is created, but one still has to go back and format the datetime values (the third and fourth lines). There’s no advantage in speed (the majority of the execution time is spent in `format_datetime`) but I like it a little better.
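
To check that the two approaches really produce the same thing, here is a self-contained sketch. The sample rows are invented, and `format_datetime` is a hypothetical stand-in for Trac's helper of the same name rather than the real implementation.

```python
from datetime import datetime


def format_datetime(value):
    # Hypothetical stand-in for Trac's helper of the same name.
    return value.strftime("%Y-%m-%d %H:%M")


# Made-up sample rows in (href, title, date, author, excerpt) order.
rows = [
    ("/ticket/1", "Broken link", datetime(2009, 8, 5, 9, 30), "alice", "The link on..."),
    ("/wiki/Start", "Start page", datetime(2009, 8, 6, 14, 0), "bob", "Welcome to..."),
]

keys = ("href", "title", "date", "author", "excerpt")

# Explicit indexing, in the style of the Trac snippet.
by_index = [
    {"href": r[0], "title": r[1], "date": format_datetime(r[2]),
     "author": r[3], "excerpt": r[4]}
    for r in rows
]

# zip pairs each key with the value in the same position of the tuple.
by_zip = [dict(zip(keys, r)) for r in rows]
for row in by_zip:
    row["date"] = format_datetime(row["date"])
```

Both lists end up equal, so the choice is purely one of taste.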

Maybe if Trac used the second approach I would like Trac a little better too.

[trac]: http://trac.edgewall.org
[searchmodule]: http://trac.edgewall.org/browser/tags/trac-0.11.4/trac/search/web_ui.py#L111
[list]: http://docs.python.org/tutorial/datastructures.html#list-comprehensions
[zip]: http://docs.python.org/library/functions.html#zip
[dict]: http://docs.python.org/library/stdtypes.html#typesmapping

os.walk and UnicodeDecodeError

My Python program was raising a `UnicodeDecodeError` when using [`os.walk`][oswalk] to traverse a directory containing files and folders with UTF-8 encoded names. What had me baffled was that the exact same program had been working perfectly on the exact same hardware just minutes earlier.

Turns out the difference was between me starting my program as root from an interactive bash shell, versus the program getting started as part of the boot sequence by init (on a [Debian Lenny][lenny] system). When started interactively, the locale was set to en_GB.UTF-8 and so names on the filesystem were assumed to be UTF-8 encoded. When started by init, the locale was set to ASCII.

The fix, as described in this article [*Python: how is sys.stdout.encoding chosen?*][codemonk], was to wrap my program in a script that set the LC_CTYPE environment variable.

    #!/bin/sh
    export LC_CTYPE='en_GB.UTF-8'
    /path/to/program.py
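
When chasing this kind of problem it helps to log what Python thinks the encoding is in each environment. The following is a small diagnostic sketch, not part of the original fix; `encoding_report` is a name invented here.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that influence how file names are decoded."""
    return {
        # Environment variables that init and an interactive shell may set differently.
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
        # The encoding Python uses for file names.
        "filesystem": sys.getfilesystemencoding(),
        # The encoding implied by the current locale settings.
        "preferred": locale.getpreferredencoding(False),
    }


report = encoding_report()
for name in sorted(report):
    print("%s: %s" % (name, report[name]))
```

Running it once from the interactive shell and once from the init script should show where the two environments differ.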

[codemonk]: http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/
[lenny]: http://www.debian.org/
[oswalk]: http://docs.python.org/library/os.html#os.walk

Python features for a PHP refugee

These are things that particularly impressed me
when I decided I had had enough of [PHP][php] and I really ought to look at
[the crazy white-space language called Python][python] that was used by
[Plone][plone], [Trac][trac] and [Django][django].

The [Zen of Python][zen] states most of this in 19 lines, for all you
_tl;dr_ types.

Name-spaces and a minimal set of built-ins
------------------------------------------

I like that the set of keywords is small, and that the set of built-in functions is
not much larger. This leaves you with an unpolluted name-space (and if you
enjoy confusing people you can always override the built-ins).

Explicit versus implicit
------------------------

Related to name-spaces is the notion that Python is explicit: there is very
little magic in a Python script. Perhaps the closest things to magic are the
various special methods that define a behaviour, for example the `__getattr__`
/ `__setattr__` / `__delattr__` methods on a class to control attribute access.
But even then Python makes it obvious those methods have special meaning by
wrapping the method names in double underscores.

See [the page on the data model][special] in the Python documentation for
a description of these methods and their purpose.

Generators and list comprehensions
----------------------------------

I never realized how much I missed these until I went back to PHP for a small
web project. So much of my code seems to be looping through lists and
accumulating a result or applying a function to each member of the list. I
don’t think the syntax is particularly obvious, but then I can’t think of a
better way to do it. At first glance generators looked to be the same as
list comprehensions, but eventually I began to understand the difference
between needing a finite list of objects ([list comprehension][listcomp])
and consuming a sequence of objects ([generators][generators]).
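
A small sketch of that difference, using made-up names: the list comprehension builds everything up front, while the generator hands over values only as they are asked for.

```python
import itertools

# A list comprehension builds the whole, finite list in memory at once.
squares = [n * n for n in range(5)]


def naturals():
    """Yield 0, 1, 2, ... forever; each value is produced only on demand."""
    n = 0
    while True:
        yield n
        n += 1


# Consume just the first five values of the endless sequence.
first_five = list(itertools.islice(naturals(), 5))

# A generator expression is lazy too, and can only be consumed once.
lazy_squares = (n * n for n in range(5))
total = sum(lazy_squares)
leftover = list(lazy_squares)  # already exhausted, so this is empty
```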

Named arguments for methods and functions
-----------------------------------------

Gosh, not having named arguments is painful. As a consumer, named arguments
allow one to forget a function’s precise argument signature, and as a
designer they allow one to provide sensible defaults and flexibility.
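
For example, with a hypothetical `send_report` function (invented for this illustration), the caller names only the arguments that differ from the defaults, in whatever order is convenient.

```python
def send_report(recipient, subject="Weekly report", retries=3, dry_run=False):
    """Hypothetical function; returns its settings so they can be inspected."""
    return {
        "recipient": recipient,
        "subject": subject,
        "retries": retries,
        "dry_run": dry_run,
    }


# Rely on the defaults entirely...
defaults = send_report("ops@example.com")

# ...or override just the ones that matter, without remembering their position.
custom = send_report("ops@example.com", dry_run=True, retries=1)
```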

Dates and times as a native type
--------------------------------

Well, not _native_, but readily available.

[Python’s `datetime` module][datetime] provides representations of calendar
dates and times and a bunch of obvious behaviour for comparing two moments.
PHP 5 introduced a proper DateTime class, but I had jumped ship a while
before then – my affection for Python’s date handling is born of a time
when one had to rely on PEAR for useful date functions. Converting everything
to seconds since the Unix epoch was never fun.

The greatest annoyance in Python’s date implementation is its shrugging
support for timezones – you nearly always need to resort to a third-party
module ([such as pytz][pytz] or [python-dateutil][dateutil]) to handle
timezones without jeopardizing one’s sanity.
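
Here is a sketch of both halves of that story using only the standard library. The `FixedOffset` class is a hand-rolled illustration (it knows nothing about daylight saving rules), which is exactly the sort of thing the third-party modules save you from writing.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone; real zones also need DST rules."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return timedelta(0)


utc = FixedOffset(0, "UTC")
bst = FixedOffset(60, "BST")

# Two spellings of the same moment compare equal once both carry a zone.
start = datetime(2009, 7, 25, 23, 0, tzinfo=utc)
same_moment = datetime(2009, 7, 26, 0, 0, tzinfo=bst)

# Arithmetic and comparison behave as you would hope.
later = start + timedelta(hours=2)
elapsed = later - start
```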

Batteries included
------------------

It is odd that one _does_ need an additional module to handle timezones
seeing as the Python standard library includes so many useful modules for
common tasks.

Need to work with [CSV files][csv]? Or [command-line arguments][getopt]?
Or [Mac OS X-style .plist][plistlib] files? Or configuration files in
[INI format][ini]? Or [tar archives][tar] (with gzip or bzip2 compression)?
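
As a taste, here is a sketch that parses a CSV snippet and an INI snippet without installing anything. It is written for Python 3 (where the INI module is spelled `configparser`), and the sample data is invented.

```python
import configparser
import csv
import io

# Parse CSV text into one dictionary per row, keyed by the header line.
csv_text = "login,expires\nalice,2009-07-25\nbob,2009-08-01\n"
accounts = list(csv.DictReader(io.StringIO(csv_text)))

# Parse INI-style configuration and read a typed value back out.
ini_text = "[server]\nhost = ldap.example.com\nport = 389\n"
config = configparser.ConfigParser()
config.read_string(ini_text)
host = config.get("server", "host")
port = config.getint("server", "port")
```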

Oh golly so much tedious work has been done for you in the Python standard
library. I suppose this reflects PHP’s emphasis as a scripting language for
the Web versus Python’s use as a general purpose language, but I am very
grateful for that distinction.

[zen]: http://www.python.org/dev/peps/pep-0020/
[special]: http://docs.python.org/reference/datamodel.html#special-method-names
[datetime]: http://docs.python.org/library/datetime.html
[pytz]: http://pytz.sourceforge.net/
[dateutil]: http://labix.org/python-dateutil
[csv]: http://docs.python.org/library/csv.html
[getopt]: http://docs.python.org/library/getopt.html
[plistlib]: http://docs.python.org/library/plistlib.html
[ini]: http://docs.python.org/library/configparser.html
[php]: http://www.php.net/
[plone]: http://plone.org/
[django]: http://www.djangoproject.com/
[trac]: http://trac.edgewall.org/
[python]: http://www.python.org/
[listcomp]: http://docs.python.org/tutorial/datastructures.html#list-comprehensions
[generators]: http://docs.python.org/tutorial/classes.html#generators
[tar]: http://docs.python.org/library/tarfile.html

Q and operator.or_

I’ve finally settled on a nice syntax for `OR`-ing [Django Q objects][q].

For a simple site search feature I needed to search for a term across several
fields in a model. Suppose the model looks like this:

    from django.db import models

    class BlogPost(models.Model):
        title = models.CharField(max_length=100)
        body = models.TextField()
        summary = models.TextField()

Suppose you also have a view function that accepts a parameter `q` for searching across
the `title`, `body` and `summary` fields. I want to find objects that contain
the `q` phrase in any of those fields. I need to build a [`QuerySet`][queryset]
with a filter that is the equivalent of

    queryset = BlogPost.objects.filter(
        Q(title__icontains=q) | Q(body__icontains=q) | Q(summary__icontains=q)
    )

That’s not too much of a hassle for this simple example, but in cases where
the fields you are searching are chosen dynamically, or where you just have
an awful lot of fields to search against, I think it is nicer to do it like so:

    import operator

    from django.db.models import Q

    search_fields = ('title', 'body', 'summary')
    q_objects = [Q(**{field + '__icontains': q}) for field in search_fields]
    queryset = BlogPost.objects.filter(reduce(operator.or_, q_objects))

Nice one! The list comprehension gives me a list of `Q` objects generated from
the names in `search_fields`, so it is easy to change the fields to be searched.
And using [`reduce`][reduce] and [`operator.or_`][operator] gives me the
required `OR` filter in one line.
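
If the `reduce` step looks opaque, this Django-free sketch may help. `Term` is a hypothetical stand-in for `Q` that supports the `|` operator and simply records how the pieces were combined; the import from `functools` is what Python 3 needs.

```python
import operator
from functools import reduce


class Term(object):
    """Hypothetical stand-in for Django's Q: supports | and records the result."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Term("(%s OR %s)" % (self.text, other.text))


search_fields = ("title", "body", "summary")
terms = [Term(field + "__icontains") for field in search_fields]

# reduce applies operator.or_ pairwise from the left: (a | b) | c
combined = reduce(operator.or_, terms)
print(combined.text)
```

So `reduce(operator.or_, items)` is just a way of writing `a | b | c` when you do not know in advance how many items there are.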

I see for Python 3 `reduce` has been [moved to the `functools` module][functools].

This stuff never used to be that obvious to me. It kind of isn’t even now.

P.S. I promise I am not writing a blog engine at this time, it was just for
the example.

[q]: http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects
[queryset]: http://docs.djangoproject.com/en/dev/ref/models/querysets/
[operator]: http://docs.python.org/library/operator.html
[reduce]: http://docs.python.org/library/functions.html#reduce
[functools]: http://docs.python.org/library/functools.html

Using an object for Django’s ChoiceField choices

I had another thought about [per-instance choices for `forms.ChoiceField`][oldpost].
Instead of overriding the `__init__` method of your form class, you could use
[an object with an `__iter__` method][iter] that returns a fresh iterator each time
it is called.

    from django import forms

    class LetterChoices(object):
        """Return a random list of max_choices letters of the alphabet."""
        def __init__(self, max_choices=3):
            self.max_choices = max_choices

        def __iter__(self):
            import string, random

            return iter((l, l) for l in random.sample(string.ascii_uppercase, self.max_choices))

    class LetterForm(forms.Form):
        """Pick a letter from a small, random set."""
        letter = forms.ChoiceField(choices=LetterChoices())

I don’t know if I prefer that style to having a simple function – having to
instantiate the class seems wrong to me, I’d much rather use any callable as
the `choices` argument.
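
The reason the object works where a plain list would not is that every loop over it calls `__iter__` again. This Django-free sketch repeats the class so it can be run on its own; how often Django itself iterates over `choices` is not something the sketch tries to show.

```python
import random
import string


class LetterChoices(object):
    """Return a random list of max_choices letters of the alphabet."""

    def __init__(self, max_choices=3):
        self.max_choices = max_choices

    def __iter__(self):
        # A new random sample is drawn every time iteration starts.
        letters = random.sample(string.ascii_uppercase, self.max_choices)
        return iter((letter, letter) for letter in letters)


choices = LetterChoices()

# Each pass over the same object yields a freshly drawn set of pairs.
first = list(choices)
second = list(choices)
```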

[oldpost]: http://reliablybroken.com/b/2009/03/per-instance-choices-for-djangos-formschoicefield/
[iter]: http://python.org/doc/current/library/stdtypes.html#typeiter

Django test database runner as a context manager

In my last post I mentioned it might be an idea to [wrap up the Django test
database setup / teardown in a context manager][lastpost] for use with [Python’s
`with` statement][pythonwith]. Here’s my first stab, which seems to work.

    from contextlib import contextmanager

    @contextmanager
    def test_db_connection():
        """A context manager for Django's test runner.

        For Python 2.5 you will need
            from __future__ import with_statement
        """

        from django.conf import settings
        from django.test.utils import setup_test_environment, teardown_test_environment
        from django.db import connection

        setup_test_environment()

        settings.DEBUG = False
        verbosity = 0
        interactive = False

        old_name = settings.DATABASE_NAME
        connection.creation.create_test_db(verbosity, autoclobber=not interactive)

        try:
            yield connection
        finally:
            # Tear down even if the body of the with block raises.
            connection.creation.destroy_test_db(old_name, verbosity)
            teardown_test_environment()

All of this requires Python 2.5 or later.

So with that snippet you could write a test something like so:

    import unittest

    class MyTestCase(unittest.TestCase):
        def test_myModelTest(self):
            with test_db_connection():
                from myproject.myapp.models import MyModel

                obj = MyModel()
                obj.save()
                self.assert_(obj.pk)

… and just as with Django’s `manage.py test` command the objects would be
created within the test database then destroyed when the
`with test_db_connection()` block is finished.
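
The same shape can be tried without Django. In this sketch `temporary_resource` is an invented name and the "database" is just a list of events, but it shows the order in which setup, body and teardown run, including when the body raises.

```python
from contextlib import contextmanager

events = []


@contextmanager
def temporary_resource(name):
    """Hypothetical resource: set up before the block, tear down after it."""
    events.append("create " + name)
    try:
        yield name
    finally:
        # Runs whether the block finished normally or raised.
        events.append("destroy " + name)


with temporary_resource("test_db") as db:
    events.append("use " + db)

# Teardown still happens when the body of the block fails.
try:
    with temporary_resource("broken_db"):
        raise ValueError("test failed")
except ValueError:
    events.append("error handled")
```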

Everything’s going to be hunky dory.

[lastpost]: http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/
[pythonwith]: http://docs.python.org/reference/datamodel.html#context-managers

Creating a Django test database for unit testing

I needed to run tests involving a Django application but without using the
`manage.py test` management command. So I needed my own test suite that
sets up the test database and drops it afterwards, leaving my real database untouched.

As of Django 1.0.2 the default test runner is the [`run_tests`
function in `django.test.simple`][runtests]. Here are the bones of that function
with the required setup and teardown calls.

    from django.conf import settings
    from django.test.utils import setup_test_environment, teardown_test_environment

    verbosity = 1
    interactive = True

    setup_test_environment()
    settings.DEBUG = False
    old_name = settings.DATABASE_NAME

    from django.db import connection
    connection.creation.create_test_db(verbosity, autoclobber=not interactive)

    # Here you run tests using the test database and with mock SMTP objects

    connection.creation.destroy_test_db(old_name, verbosity)
    teardown_test_environment()

Hmmm… Wouldn’t this be a good candidate to be wrapped up for use with
[Python 2.5’s `with` statement][with]?

[runtests]: http://code.djangoproject.com/browser/django/tags/releases/1.0.2/django/test/simple.py#L102
[with]: http://docs.python.org/reference/datamodel.html#context-managers

sys.exit(1) versus SystemExit(1)

I used to write Python scripts and have the option parsing go something like this…

    if __name__ == "__main__":
        import sys, getopt

        opts, args = getopt.getopt(sys.argv[1:], 'h', ['help'])

        for opt, val in opts:
            if opt in ('-h', '--help'):
                usage(sys.argv)
                sys.exit(1) # Exit to shell with non-zero result

And then I finally started writing tests for my code, at which point I decided
I needed to [`raise SystemExit(1)`][exception] rather than call `sys.exit(1)` because I imagined
[Python’s unittest module][unittest] would get bypassed whenever my code called
`sys.exit(1)`.

Except of course I was wrong. [`sys.exit` throws `SystemExit` in turn][sysexit],
so it comes to the same thing from the point of view of `unittest`. Failing to read
documentation is a very bad habit.
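
A few lines are enough to confirm the equivalence. The helper `exit_code_of` and the two small functions are invented for this sketch.

```python
import sys


def exit_code_of(func):
    """Call func and return the exit code it asked for, or None if it returned."""
    try:
        func()
    except SystemExit as exc:
        return exc.code
    return None


def via_sys_exit():
    sys.exit(1)


def via_raise():
    raise SystemExit(1)


# Both routes surface as the same exception carrying the same code.
code_from_sys_exit = exit_code_of(via_sys_exit)
code_from_raise = exit_code_of(via_raise)
```

Either way a test can catch `SystemExit` (or use `assertRaises`) and inspect the code.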

But I prefer throwing the exception myself. You don’t have to `import sys` if you
don’t need it, and *it feels prettier* (if I had more Python experience I might say *more Python-ic*).

I used to [smoke, drink and dance the hoochie-coo][saved] too.

[unittest]: http://docs.python.org/library/unittest.html
[sysexit]: http://docs.python.org/library/sys.html#sys.exit
[exception]: http://docs.python.org/library/exceptions.html#exceptions.SystemExit
[saved]: http://www.google.co.uk/search?q=lavern%20baker%20saved