December | 2012 | Reliably Broken

The problem: you want to test a [Django][django] view for results of a search query, but [Haystack][haystack] will be using your real query index, built from your real database, instead of an index built from your test fixtures.

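Turns out you can solve this for any Haystack back-end by swapping the `haystack.backend` module for the simple back-end while the tests run. The simple back-end queries the database directly, so the search results come from whatever your test fixtures loaded.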

    from django.test import TestCase
    import haystack
    from haystack.query import SearchQuerySet
    from myapp.models import MyModel

    class SearchViewTests(TestCase):
        fixtures = ['test-data.json']

        def setUp(self):
            # Remember the configured back-end, then swap in the simple one
            self._haystack_backend = haystack.backend
            haystack.backend = haystack.load_backend('simple')

        def tearDown(self):
            # Put the configured back-end back for everything else
            haystack.backend = self._haystack_backend

        def test_search(self):
            results = SearchQuerySet().all()
            assert results.count() == MyModel.objects.count()

My first attempt at this changed the project settings [to use `HAYSTACK_WHOOSH_STORAGE = "ram"`][ram]. That works, but it was complicated: you have to re-build the index with the fixtures loaded, and the fixtures [don’t get loaded in `TestCase.setUpClass`][setupclass], so the choice was to load the fixtures myself or to re-index for each test. And it was specific to [the Whoosh back-end][whoosh], of course.
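The ordering problem can be sketched with plain `unittest`, no Django or Haystack involved. It assumes, as described above, that fixtures arrive per test rather than per class. The `database` list, the `index_built_from` attribute and the `run_demo` helper are made-up stand-ins for illustration, not anything from either library.

```python
import unittest

events = []    # records the order in which the hooks run
database = []  # hypothetical stand-in for the test database


class OrderingDemo(unittest.TestCase):
    index_built_from = None

    @classmethod
    def setUpClass(cls):
        # Runs once, before any per-test set-up, so the "database" is
        # still empty when the pretend index is built here.
        events.append("setUpClass")
        cls.index_built_from = list(database)

    def setUp(self):
        # Stand-in for per-test fixture loading.
        events.append("setUp")
        database.append("fixture row")

    def tearDown(self):
        events.append("tearDown")
        del database[:]

    def test_index_missed_the_fixtures(self):
        events.append("test")
        self.assertEqual(database, ["fixture row"])
        self.assertEqual(self.index_built_from, [])


def run_demo():
    """Run the demo case from a clean slate; return (result, hook order)."""
    del events[:]
    del database[:]
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderingDemo)
    result = unittest.TestResult()
    suite.run(result)
    return result, list(events)
```

An index built in `setUpClass` sees an empty database here, which is why the choice came down to loading the fixtures by hand or re-indexing inside each test.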

(This is for Django 1.4 and Haystack 1.2.7. In my actual project I get to deploy on Python 2.5. Ain’t I lucky? On a fucking PowerMac G5 running OS X Server 10.5 [for fuck sacks][bug].)

[django]: https://www.djangoproject.com/
[whoosh]: http://bitbucket.org/mchaput/whoosh
[haystack]: http://haystacksearch.org/
[setupclass]: http://docs.python.org/2/library/unittest.html#unittest.TestCase.setUpClass
[ram]: https://django-haystack.readthedocs.org/en/v1.2.7/settings.html#haystack-whoosh-storage
[bug]: http://www.youtube.com/watch?v=XZtpAxDEzl8

My [Adobe software updates][asu] app (which uses [Haystack][haystack] + [Django][django] to provide a search feature) has a very inefficient search results template, where for each search result the template links back to the update’s related product page.

The meat of the search results template looks something like this:

    {% for result in page.object_list %}
        {# … link to the update’s related product page, via a reverse URL lookup … #}
    {% endfor %}

The reverse URL lookup triggers a separate SQL query to find the related product object’s slug field for each object in the results list, and that slows down the page response significantly.
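To make the cost concrete, here is a rough sketch of the same query pattern using `sqlite3` from the standard library instead of the Django ORM. The `product` and `software_update` tables, their columns and all the function names are invented for the example; they are not the app's real schema. It counts the `SELECT` statements issued when each result looks up its product separately, then again when a single `JOIN` fetches everything, which is roughly what fetching the related objects in one go amounts to.

```python
import sqlite3


def build_database():
    """Create a throwaway in-memory database with ten updates, two products."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (id INTEGER PRIMARY KEY, slug TEXT);
        CREATE TABLE software_update (
            id INTEGER PRIMARY KEY,
            name TEXT,
            product_id INTEGER REFERENCES product(id)
        );
    """)
    conn.executemany("INSERT INTO product VALUES (?, ?)",
                     [(1, "photoshop"), (2, "illustrator")])
    conn.executemany("INSERT INTO software_update VALUES (?, ?, ?)",
                     [(i, "update-%d" % i, 1 + i % 2) for i in range(1, 11)])
    conn.commit()
    return conn


def count_selects(conn, work):
    """Run work(conn) and return (its result, number of SELECTs issued)."""
    seen = []

    def trace(sql):
        if sql.lstrip().upper().startswith("SELECT"):
            seen.append(sql)

    conn.set_trace_callback(trace)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    return result, len(seen)


def slugs_one_query_per_result(conn):
    # One query for the result list, then one more per result for its product.
    updates = conn.execute(
        "SELECT id, name, product_id FROM software_update ORDER BY id"
    ).fetchall()
    slugs = []
    for _id, _name, product_id in updates:
        row = conn.execute(
            "SELECT slug FROM product WHERE id = ?", (product_id,)
        ).fetchone()
        slugs.append(row[0])
    return slugs


def slugs_with_join(conn):
    # A single query that pulls in the related product alongside each update.
    rows = conn.execute(
        "SELECT u.id, u.name, p.slug FROM software_update AS u "
        "JOIN product AS p ON p.id = u.product_id ORDER BY u.id"
    ).fetchall()
    return [slug for _id, _name, slug in rows]
```

With ten rows, `count_selects(conn, slugs_one_query_per_result)` should report eleven `SELECT`s and `count_selects(conn, slugs_with_join)` just one, for the same list of slugs.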

For a regular queryset you would [tell Django to fetch the related objects][select-related] in one go when populating the template context in order to avoid the extra queries, but in this case `page.object_list` is generated by Haystack. So how to tell Haystack to use `select_related()` for the queryset?

It is easy. When you register a model to be indexed with Haystack for searching, you have to define a `SearchIndex` subclass, and in that class you can also override the [`read_queryset()` method that is used by Haystack to get a Django queryset][rq]:

    # myapp/search_indexes.py
    from haystack import indexes, site
    from myapp.models import MyModel

    class MyModelIndex(indexes.SearchIndex):
        # Indexed fields declared here
        ...

        def get_model(self):
            return MyModel

        def read_queryset(self):
            # Fetch related objects along with each search result
            return self.model.objects.select_related()

    site.register(MyModel, MyModelIndex)

And that solved it for me. Shaves three quarters off the execution time.

PS This all pertains to Django 1.4 and Haystack 1.2.7.

PPS Also pertains to a version of my Adobe software updates page that I haven’t deployed quite yet.

[haystack]: http://haystacksearch.org/
[django]: https://www.djangoproject.com/
[asu]: http://reliablybroken.com/wavesinspace/adobe/
[select-related]: https://docs.djangoproject.com/en/1.4/ref/models/querysets/#select-related
[rq]: http://django-haystack.readthedocs.org/en/v1.2.7/searchindex_api.html#read-queryset

Testing with Django, Haystack and Whoosh

Optimizing queries in Haystack results