
Gettext and Doctest Together in Python

I recently started playing with Python's doctest module, and since I had been playing with the gettext module just before, I soon stumbled upon a conflict between the two. They both use __builtins__._, and this makes them break each other badly.

Gettext uses _ as the traditional marker for translatable strings. It's a generally accepted way of marking strings across many languages and platforms: you enclose them in a call to a function or macro called _, for example _("Hello world!"), and the various gettext tools can then extract the strings and provide translations. To make the _ function available in your application without importing and initializing gettext in each and every module where you use it, you call the gettext.install() convenience function, which installs _ in Python's builtins, effectively making it globally available. This is one of the few cases where globals make sense.
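To make the effect of gettext.install() concrete, here is a minimal sketch. It is written in Python 3 syntax, where the module holding the builtins is called builtins rather than __builtin__, and it assumes there is no translation catalog for the "hello" domain, so gettext falls back to returning strings unchanged.

```python
import builtins  # this module is named __builtin__ in Python 2
import gettext

# "hello" is the text domain used in this article. No catalog for it is
# assumed to exist, so install() falls back to a pass-through translator.
gettext.install("hello")

# install() stored the translation function in the builtins namespace...
print(callable(builtins._))

# ...so a bare _ resolves in any module without an import.
print(_("hello world"))
```

Nothing in this snippet defines _ itself; the name is found only because the lookup falls through to the builtins.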

On the other hand, the Python console has another convenient feature: when you enter an expression, it evaluates it and displays the result, so you can use it as a simple calculator. It also assigns that result to the _ variable, so that you can use it in the next expression, as is often needed in calculations. As the doctest module effectively simulates the Python console, it also uses that variable. And yes, you guessed it: for reasons unknown to me (but probably more than just the convenience of those who wrote the console), it also uses __builtins__._, and not a global or local variable. This is actually convenient: you can cover it with your own global or local variable named _ and have it behave normally, without its value changing after every expression evaluated in the console.
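The console behaviour can be reproduced without a console. The sketch below (Python 3 syntax again) compiles an expression in "single" mode, which is the mode an interactive prompt uses, and runs it under the default display hook. The "<console>" file name is just a label chosen for this illustration.

```python
import builtins  # this module is named __builtin__ in Python 2
import sys

# In "single" mode the value of an expression statement is handed to
# sys.displayhook instead of being silently discarded.
code = compile("6 * 7", "<console>", "single")

saved_hook = sys.displayhook
sys.displayhook = sys.__displayhook__  # make sure the default hook is active
try:
    exec(code, {})  # the default hook echoes the value
finally:
    sys.displayhook = saved_hook

# The default hook also stored the value in the builtins namespace.
print(builtins._)
```

Note that the namespace passed to exec() stays free of any _ entry: the value lands in the builtins, which is exactly the slot gettext.install() uses.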

As you can guess, putting the two together leads to disaster. A simple function will fail its doctests if it contains a translatable string and any earlier example displays a value. Consider this:

import gettext
import doctest

gettext.install("hello")

def hello():
    """Prints 'hello world!'

    >>> 42
    42
    >>> hello()
    hello world
    """

    print _('hello world')

doctest.testmod()

When you try to run this program, you will get a failed test, with a puzzling exception:

>>>
Traceback (most recent call last):
      File "/usr/lib/python2.5/doctest.py", line 1228, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.hello[1]>", line 1, in <module>
        hello()
      File "a.py", line 15, in hello
        print _('hello world')
TypeError: 'int' object is not callable

What happened? In the first example, __builtins__._ got overwritten with the displayed value, 42, which is an int and cannot be called like the gettext function. In the second example, the body of the function gets executed, but _ in it is already 42, not gettext. Code that works perfectly outside the tests fails under testing.

One way around it is to assign _ = _ at the beginning of the module (and of each and every module in which you use gettext). The module then has its own global _ that covers the builtin one, so whatever the console does to __builtins__._ no longer affects it. But this sucks, especially when you have to modify modules not written by you, and redo the change each time you get a newer version.
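Here is a rough sketch of why the _ = _ trick works, in Python 3 syntax. The module_globals dictionary is a hypothetical stand-in for the global namespace of a module that uses gettext, and the "hello" domain is assumed to have no catalog installed.

```python
import builtins  # this module is named __builtin__ in Python 2
import gettext
import sys

gettext.install("hello")

# Stand-in for a module that starts with `_ = _`: the right-hand side is
# found in the builtins, the left-hand side creates a module global.
module_globals = {}
exec("_ = _", module_globals)

# Evaluate a console-style expression under the default display hook.
saved_hook = sys.displayhook
sys.displayhook = sys.__displayhook__
try:
    exec(compile("42", "<console>", "single"), module_globals)
finally:
    sys.displayhook = saved_hook

# The builtin slot has been overwritten with the displayed value...
print(builtins._)

# ...but code in the module still sees its own global _, the gettext function.
exec("greeting = _('hello world')", module_globals)
print(module_globals["greeting"])
```

The global shadows the builtin only inside that one namespace, which is why the assignment has to be repeated in every module.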

Another way is to disable the use of _ inside doctest. You can do it with code like this:

import sys

def displayhook(value):
    # Display the value like the console does, but don't store it in _.
    if value is not None:
        print repr(value)

sys.displayhook = displayhook

This way the value of the expression only gets displayed, but not remembered. I know that it's fighting magic with more magic, but you only need to put it into your test module, so at least you limit the damage: no need to spread workarounds all over your program. I wonder why the authors of doctest didn't think about it, or didn't at least add a warning; it took me quite some time to even guess what was happening.
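The effect of such a hook can be checked in isolation. The sketch below uses Python 3 syntax and drives the hook directly with an expression compiled in "single" mode instead of going through doctest, since how doctest treats a replaced sys.displayhook may differ between Python versions. As before, the "hello" domain is assumed to have no catalog.

```python
import builtins  # this module is named __builtin__ in Python 2
import gettext
import sys

def displayhook(value):
    """Echo the value like the console does, but leave builtins._ alone."""
    if value is not None:
        print(repr(value))

gettext.install("hello")

saved_hook = sys.displayhook
sys.displayhook = displayhook
try:
    # A console-style expression: its value goes to our hook.
    exec(compile("42", "<console>", "single"), {})
finally:
    sys.displayhook = saved_hook

# The builtin _ was never touched, so it is still the gettext function.
print(callable(builtins._))
print(_("hello world"))
```

The only difference from the default hook that matters here is the missing assignment to the builtin _.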

Many thanks to Berbelek from pl.comp.lang.python for suggesting this solution!