CSV sucks

For future reference: [CSV][csv] is a terrible format for spreadsheet data.

And by spreadsheet data I mean CSV is a terrible format for text data that will be opened in [Microsoft Excel][excel] (unless you only speak English).

My objection to CSV as a spreadsheet format is that it has no way to indicate the text encoding, and Excel (at least Excel 2008 for Mac and Excel 2007 for Windows) has no way of choosing the encoding when opening a CSV file. When your data is text in any number of non-English languages, you will likely be dealing with characters outside the standard ‘latin-1’ character set.
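To make the problem concrete, here is a minimal sketch (Python 3, standard library only; the `csv_bytes` helper and the sample names are made up for illustration). The same bytes on disk decode without error under two different encodings, and nothing in the file says which one was meant:

```python
import csv
import io


def csv_bytes(rows, encoding):
    """Serialise rows as CSV text, then encode that text to bytes."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode(encoding)


# Hypothetical sample data containing non-ASCII characters.
rows = [["name", "city"], ["Zoë", "Kraków"]]
data = csv_bytes(rows, "utf-8")

# A reader that knows the encoding gets the original text back.
as_utf8 = data.decode("utf-8")

# A reader that guesses latin-1 also "succeeds", but gets mojibake.
as_latin1 = data.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

The second decode never raises an error, which is exactly the trouble: the file gives the reader no hint that it guessed wrong.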

Even if Excel did have a little pop-up menu for choosing the encoding when opening a CSV file, it would mean instructing your Excel-happy friends to “open as UTF-8” whenever they wanted to open your CSV data.

Wot you mean you don’t have Excel-happy friends?

[csv]: http://en.wikipedia.org/wiki/Comma-separated_values
[excel]: http://office.microsoft.com/en-us/excel/

Styling your Excel data with xlwt

This post is about how to create styles in [Excel spreadsheets][excel] with [the most excellent xlwt][xlwt] for [Python][python]. The documentation for xlwt (version 0.7.2) is a little sketchy on how to use formatting. So here goes…

To apply formatting to a cell you pass an instance of the `xlwt.XFStyle` class as the fourth argument to the `xlwt.Worksheet.write` method. The best way to create an instance is to use the `xlwt.easyxf` helper, which takes a string that specifies the formatting for a cell.

The other thing about using styles is you should only make one instance of each, then pass that same style object every time you want to apply it to a cell.
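One way to stick to the one-instance-per-style rule is a small cache keyed on the formatting string. This is only a sketch: `make_style_cache` is a name I made up, not part of xlwt, and it takes the factory function as an argument so it does not care what builds the style.

```python
def make_style_cache(factory):
    """Return a lookup that builds each style once and then reuses it."""
    cache = {}

    def get_style(spec):
        # Only call the factory the first time a given spec is requested.
        if spec not in cache:
            cache[spec] = factory(spec)
        return cache[spec]

    return get_style


# With xlwt installed this would be used along these lines:
#   import xlwt
#   get_style = make_style_cache(xlwt.easyxf)
#   sheet.write(0, 0, 'Total', get_style('font: bold 1'))
#   sheet.write(1, 0, 'Again', get_style('font: bold 1'))  # same object
```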

An example which uses a few styles:

    import xlwt

    styles = dict(
        bold = 'font: bold 1',
        italic = 'font: italic 1',
        # Wrap text in the cell
        wrap_bold = 'font: bold 1; align: wrap 1;',
        # White text on a blue background
        reversed = 'pattern: pattern solid, fore_color blue; font: color white;',
        # Light orange checkered background
        light_orange_bg = 'pattern: pattern fine_dots, fore_color white, back_color orange;',
        # Heavy borders
        bordered = 'border: top thick, right thick, bottom thick, left thick;',
        # 16 pt red text
        big_red = 'font: height 320, color red;',
    )

The font height is given in twentieths of a point, so 20 = 1 pt and 320 = 16 pt text.

    book = xlwt.Workbook()
    sheet = book.add_sheet('Style demo')

    for idx, k in enumerate(sorted(styles)):
        # One style object per format string
        style = xlwt.easyxf(styles[k])
        sheet.write(idx, 0, k)
        sheet.write(idx, 1, styles[k], style)

    book.save('Example.xls')
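If you would rather think in points than in units of 20, a tiny helper can do the arithmetic before the string goes to `easyxf`. A sketch only: `points_to_height` and `font_spec` are hypothetical names of mine, and the format string simply mirrors the `big_red` style above.

```python
def points_to_height(points):
    """Convert a font size in points to the height units used above (20 per point)."""
    return int(round(points * 20))


def font_spec(points, color):
    """Build a font format string like the big_red example."""
    return 'font: height %d, color %s;' % (points_to_height(points), color)


big_red = font_spec(16, 'red')
print(big_red)
```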

It isn’t included with [the current distribution on the cheese shop][pypi], but there is [a useful Excel spreadsheet demonstrating cell patterns][sheet] in the source repository.

You can find the complete list of possible cell formats by reading [the source for `xlwt.Style`][styles].

[excel]: http://office.microsoft.com/en-us/excel/
[xlwt]: http://www.python-excel.org/
[python]: http://www.python.org/
[styles]: https://secure.simplistix.co.uk/svn/xlwt/tags/0.7.2/xlwt/Style.py
[pypi]: http://pypi.python.org/pypi/xlwt/0.7.2
[sheet]: https://secure.simplistix.co.uk/svn/xlwt/tags/0.7.2/xlwt/doc/pattern_examples.xls

VCS for Cocoa Programming for Mac OS X

I am working my way through [Aaron Hillegass][aaron]’ [Cocoa Programming for Mac OS X][cocoa] (again). I find it a good book, despite the fact I’ve started it twice before and have abandoned my tuition twice before. I like to think my progression is an awful lot like young [Luke Skywalker recklessly abandoning his Jedi training][luke] on Dagobah in order to save his Web application friends in the cloud city called Bespinternet. Eventually Luke (and I) will go back to complete his (and my) training and will become a truly great Jedi programmer for Mac OS X.

Anyway, for the first few chapters the book leads you through a number of small applications, each exercise starting a new project. Then in the later chapters Hillegass has you adding features to a single project called RaiseMan.

It strikes me that the next edition of the book should introduce using version control to track development.

Many chapters in the book end with challenges that go off on a tangent, getting you to implement alternate approaches to the exercises covered in that chapter. But then the subsequent chapter will expect you to work with the version of the RaiseMan application as it stood before the previous chapter’s programming challenge.

This is a natural fit for revision control (aka [VCS][vcs]). The book would teach you how to tag “release” versions of your code. It would teach you how to create an experimental feature branch for the chapter-end challenges, and then how to resume development from the last “release” version for the next chapter’s main exercises.

This is particularly relevant now that [Xcode 4 includes Git integration][xcodegit]. Of course there is no reason one can’t employ a suitable version control system with the current (third) edition of the book; I just think it makes a lot of sense to get younglings used to this workflow while teaching them all the other stuff about lightsabres and [the Force][rdf] while you’re at it.

You will up-vote this on [Reddit][reddit] / [Hacker News][hn] *(waves hand like her Imperial majesty)*.

[cocoa]: http://www.bignerdranch.com/book/cocoa%C2%AE_programming_for_mac%C2%AE_os_x_3rd_edition
[aaron]: http://www.bignerdranch.com/instructors/hillegass.shtml
[luke]: http://www.youtube.com/watch?v=om6ctZWNw18
[forum]: http://forums.bignerdranch.com/viewforum.php?f=5
[xcodegit]: https://github.com/blog/810-xcode-4-released-with-git-integration
[reddit]: http://reddit.com/
[hn]: http://news.ycombinator.com/
[rdf]: http://www.folklore.org/StoryView.py?project=Macintosh&story=Reality_Distortion_Field.txt
[vcs]: http://en.wikipedia.org/wiki/Revision_control

Adobe Software Updates

What Adobe’s software update site needs is:

- [An RSS / Atom feed of recent updates][feed].

This way you can subscribe to a list of recent updates in your favourite news reader and be informed when a new update is released without having to scan product-specific blogs, etc.

- [Per-product pages listing available updates][product].

When you want to find an update for a product you can find it on the dedicated product update page. It will be there.

- [Nice URLs][acrobat].

URLs make the Web. Putting the product name in the URL makes a human-friendly URL, as opposed to putting an opaque numeric product key in a query parameter. Quick test: which updates do you expect to see on [http://www.adobe.com/support/downloads/product.jsp?product=1&platform=Macintosh][acromac]? And assuming you were so insane as to guess the previous URL correctly, which updates do you expect to see on [http://www.adobe.com/support/downloads/product.jsp?product=1][acrowuh]?
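The nice-URL idea boils down to deriving the path from the product name rather than from a database key. Here is a rough sketch of that idea in Python; `product_slug`, `product_url` and the `/adobe/` base path are illustrative assumptions, not necessarily how any real site builds its URLs.

```python
import re


def product_slug(name):
    """Lower-case the name and collapse anything non-alphanumeric to a hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


def product_url(name, base="/adobe/"):
    """Build a human-readable path for a product's updates page."""
    return base + product_slug(name) + "/"


print(product_url("Acrobat"))
print(product_url("Creative Suite"))
```

A visitor can guess what lives at such a path without knowing that Acrobat happens to be product number 1.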

Anyway, those be my principal beefs with the current Adobe software updates site. So I made a site that tries to satisfy my beeves. Beefs.

It is here: [http://reliablybroken.com/wavesinspace/][wavesinspace]

Please provide feedback to [[email protected]][gm6]

[acromac]: http://www.adobe.com/support/downloads/product.jsp?product=1&platform=Macintosh
[acrowuh]: http://www.adobe.com/support/downloads/product.jsp?product=1
[wavesinspace]: http://reliablybroken.com/wavesinspace/
[gm6]: mailto:[email protected]
[bowie]: http://www.davidbowie.com/
[feed]: http://reliablybroken.com/wavesinspace/adobe/feed/
[product]: http://reliablybroken.com/wavesinspace/adobe/
[acrobat]: http://reliablybroken.com/wavesinspace/adobe/acrobat/

Adobe’s software update site is shit

This is written from the point of view of someone looking to keep abreast of software patches for [Adobe][adobe]’s many excellent products (also Acrobat).

Adobe’s [Downloads page][downloads] is mostly about downloading product demos, although a list on the side of that page has a link to the real product updates page and a very out-dated list of updates.

So then the actual [Product Updates page][updates] has a menu for all their products that takes you to the updates for an individual product, and a list of “featured updates”. What qualifies an update to be featured is a mystery, so that list is not useful either.

They don’t think to mention it on the Downloads or Product Updates pages, but there is also a [New Downloads page][new] which is actually rather handy, although there is no indication what constitutes “new” so it can be difficult to tell if something was released in the time between your last visit and the oldest update mentioned on that page.

My favourite aspect of Adobe’s support pages is the whimsical approach to the page for a product. For example, [the page for Illustrator for Macintosh][illustrator] includes the 15.0.2 update for Illustrator that shipped as part of Creative Suite version 5. Meanwhile [the Creative Suite for Mac updates page][cs] doesn’t admit there have been any updates for CS5 at all.

What Adobe’s software update site needs is…

[adobe]: http://www.adobe.com/
[downloads]: http://www.adobe.com/downloads/
[updates]: http://www.adobe.com/downloads/updates/
[new]: http://www.adobe.com/support/downloads/new.jsp
[illustrator]: http://www.adobe.com/support/downloads/product.jsp?product=27&platform=Macintosh
[cs]: http://www.adobe.com/support/downloads/collection.jsp?collID=1&platform=Macintosh

Building Nginx 0.9.5 on Debian Lenny

[Nginx][nginx] is available in [Debian Lenny][lenny], but the version in stable is the old 0.6.x series. [Perusio maintains a useful repository][perusio] with development versions built for Lenny, but it requires libraries newer than those in stable.

*UPDATED: fixed ‘build-essential’ – thank you Carlos*

But it is easy enough to build a [deb][deb] from the Perusio package which uses the stable libraries. Here are my notes. N.B. Editing the apt sources and installing packages needs root privileges.

First, add the Perusio repository to `/etc/apt/sources.list`:

    cat >> /etc/apt/sources.list <

Caching a Django app with Nginx + FastCGI

I just spent a stupid amount of time trying to figure out why [Nginx][nginx] was failing to cache my [Django][django] app responses. The Django app is running as a FastCGI backend and I have Nginx using the [`fastcgi_cache` directive][cache] to cache responses.

The answer is that [Nginx since version 0.8.44 does not cache backend responses if they have a “Set-Cookie” header][0844]. This makes perfect sense because you don’t want a response which sets a cookie to be cached for subsequent requests, but I was stupid because I had totally forgotten that my Django app was using a POST form on all responses for non-authenticated clients. [Django’s CSRF middleware][csrf] sets a cookie on any response that renders the token for such a form, so every one of those responses carried a “Set-Cookie” header.

The solution was to change the app so that it uses the GET method on the form in question, which in this case is fine from a security point-of-view.

The moral of this story is I should pay attention to my HTTP response headers and that I am badly short-sighted both figuratively and literally. With that fixed the site has gone from 15 requests per second to ~2000 requests per second!
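Paying attention to the headers can be as simple as the check below. It is a sketch that tests only for the “Set-Cookie” condition described above; `sets_cookie` is a made-up helper and the sample header lists are invented for illustration.

```python
def sets_cookie(headers):
    """Return True if any (name, value) pair is a Set-Cookie header.

    Header names are case-insensitive, so compare in lower case.
    """
    return any(name.lower() == "set-cookie" for name, _ in headers)


# Hypothetical responses: one from a page with a CSRF-protected POST form,
# one from the same page after switching the form to GET.
with_form = [
    ("Content-Type", "text/html; charset=utf-8"),
    ("Set-Cookie", "csrftoken=abc123; Path=/"),
]
without_form = [
    ("Content-Type", "text/html; charset=utf-8"),
]

print(sets_cookie(with_form))
print(sets_cookie(without_form))
```

The helper takes a list of `(name, value)` pairs, which is the shape `http.client.HTTPResponse.getheaders()` returns in Python 3, so it can be pointed at a live response too.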

[nginx]: http://nginx.org/
[django]: http://www.djangoproject.com/
[0844]: http://nginx.org/en/CHANGES
[csrf]: http://docs.djangoproject.com/en/dev/ref/contrib/csrf/
[cache]: http://wiki.nginx.org/HttpFcgiModule#fastcgi_cache

Microsoft isn’t totally evil

[Pierre Igot][igot] complains about the font rendering on [Microsoft’s Office for Mac][macoffice] product pages:

> [Why does the web site promoting Microsoft Office for Mac OS X use, by default, a font that no Mac OS X user has on his or her system?][complain]

Well, the branding for Mac Office uses Segoe so it makes sense to keep that brand consistent across all the pages. But it was more of a rhetorical question, wasn’t it?

But Pierre has missed that the text is not displayed using a font specified by the stylesheet (he notes that he doesn’t have Segoe installed). What actually happens is it is rendered as canvas elements, one for each word.

This works using a JavaScript library called [Cufón][cufon] which aims to allow text to be rendered using any font the designer specifies, even if the font is not installed on the visitor’s computer. In the good old days the text would have been rendered as a graphic (as the designer did for the ‘Office:mac’ logo at the top of the page) but modern Web designers are too cool for that.

The advantage of the crazy downloadable fonts and client-side JavaScript approach is that the headline text remains accessible: you can copy and paste it, etc. And at larger sizes it looks pretty good; only the smaller headline text renders poorly.

This screenshot shows how well it works for large size text and poorly for the small text (taken in Google Chrome dev version). I’ve selected part of the text to demonstrate that it is “live”:

Screenshot of Microsoft's Mac site

Pierre is right, the smaller headline text does look terrible for the very same visitors Microsoft wants to sell to. But I don’t mind too much because the whole site is much less of a usability nightmare since the re-design. Microsoft’s Mac Web designers aren’t entirely evil.

[cufon]: http://cufon.shoqolate.com/generate/
[igot]: http://www.betalogue.com
[complain]: http://www.betalogue.com/2011/02/10/microsoft-segoe/
[macoffice]: http://www.microsoft.com/mac/

Weird App Store buttons

Screenshot of `App Store.app` showing title bar buttons out of place.

Why did Apple allow [the new App Store application][appstore] to ignore the [human interface guidelines][buttons]? It isn’t like the toolbar is doing anything radically different to other information browsers, nothing that might warrant exploring new interface ideas.

It just looks odd, an arbitrary inconsistency that would be simple to correct.

[appstore]: http://www.apple.com/mac/app-store/
[buttons]: http://developer.apple.com/library/mac/documentation/UserExperience/Conceptual/AppleHIGuidelines/XHIGWindows/XHIGWindows.html#//apple_ref/doc/uid/20000961-TPXREF51

Running minidlna on Mac

These are my notes on installing [minidlna][minidlna], a [DLNA][dlna] server for Mac OS X. I compiled it from source and installed the supporting libraries from [MacPorts][macports].

Most of this was culled from [a thread on the minidlna forum][forum].

First install each of the following ports. The command for each would be something like `sudo port install libiconv`.

- libiconv
- sqlite3
- jpeg
- libexif
- libid3tag
- libogg
- libvorbis
- flac
- ffmpeg

Then check out the Mac branch of the current minidlna source from the CVS repository.

    cvs -d:pserver:[email protected]:/cvsroot/minidlna checkout -r osx_port minidlna
    cd minidlna

The current build script appears to miss out pulling in libiconv so I had to edit `configure.ac`, inserting a line to bring in `libiconv`.

    AC_CHECK_LIB([iconv], [main],, AC_MSG_ERROR(Cannot find required library iconv.))

Now the build will work, although I found I needed to run `autogen.sh` twice for it to generate all the necessary files.

    source ENVIRONMENT.macports
    sh genconfig.sh
    sh autogen.sh
    sh autogen.sh
    ./configure
    make

This spits out the minidlna executable and a basic configuration file. Copy these to wherever you want them. Edit the `minidlna.conf` file, pointing it at the files you want to serve. There are examples of what to do in that configuration file.

And for testing purposes you can start the server from the build directory.

    ./minidlna -d -f minidlna.conf

Bingo.

I did try using [ushare][ushare], another DLNA server, but I couldn’t figure out how to persuade my Sony telly to successfully connect to it. So I gave up. I feel it is useful to give up quickly when something doesn’t work, until you run out of alternatives; I consider this triage. I also consider my telly’s inability to work with ushare, and the fact that the telly will only play a very limited set of video formats, a mark against the promise of DLNA.

[minidlna]: http://sourceforge.net/projects/minidlna/
[dlna]: http://www.dlna.org/
[forum]: http://sourceforge.net/projects/minidlna/forums/forum/879956/topic/3412747
[macports]: http://www.macports.org/
[bravia]: http://www.sony.co.uk/hub/bravia-lcd-televisions
[ushare]: http://ushare.geexbox.org/