DrProject: Switching to Kid

February 20, 2006 – 1:45 pm

Chris Lenz, Jason Montojo, and I began work on refactoring DrProject in early January. One of the first decisions we made was to replace the Clearsilver templating framework with Kid, an XML-based alternative. Now that the work is done, we’ve learned a few things about Kid that others might find useful.

Why did we abandon Clearsilver? First, its templates are not valid XML documents, which makes maintenance very difficult. If you have ever had to modify someone else’s Clearsilver template, you already know how painful it can be. Second, Clearsilver is not Pythonic: before passing data into the template, you have to preprocess it into a pseudo-dictionary of strings, which means you process your data twice: once in the preparation phase, and again when the template is rendered. Finally, since you cannot access Python functions and objects from within the template, you have to execute many UI-related functions in the controlling layer rather than in the template, which blurs the separation between controller and view.
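
To make the "pseudo-dictionary of strings" point concrete, here is a rough sketch of that kind of preprocessing step. It is written for Python 3 and does not use Clearsilver itself: the `flatten` helper and the `ticket` data are hypothetical, and the dotted key names are only an approximation of the flat, string-only style described above.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a flat {dotted_key: string} mapping.

    Hypothetical helper for illustration; not part of Clearsilver or DrProject.
    """
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    else:
        # Leaf value: everything ends up as a string.
        flat[prefix] = str(value)
        return flat
    for key, child in items:
        child_prefix = "%s.%s" % (prefix, key) if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


# Made-up example data standing in for a real model object.
ticket = {"id": 42, "owner": {"name": "sean"}, "tags": ["ui", "kid"]}
flat_ticket = flatten(ticket, "ticket")

for key in sorted(flat_ticket):
    print(key, "=", flat_ticket[key])
```

The data is walked once here and again when the template consumes the flat keys, and type information (the integer id, the list of tags) is lost along the way. With a template engine that accepts Python objects, `ticket` could be handed over unchanged.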

After looking at a few alternatives, we settled on Kid as a replacement. At first glance, it seemed like a perfect solution: Kid templates are guaranteed to be well-formed XML, and you can pass Python data structures and objects to the template for use in the rendering stage.

Once we eventually finished porting the view layer to Kid (a non-trivial process which I will describe in an upcoming post), the end result was cleaner controlling code and cleaner templates, which will be significantly easier to maintain.

But Kid isn’t perfect (what is?). It has a number of problems and “gotchas”, which I have been documenting. Most of these issues are minor and only ever catch a developer once. Rendering speed, however, is turning out to be a very significant problem. Simply put, Kid is slow. In my tests, the rendering phase of a single web request takes approximately two to three times as long as the processing phase, which includes many database seeks. You can see the difference by running this simple test:

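#!/usr/bin/python
# Compare the time needed to prepare the data with the time Kid needs to render it.
# Python 2 only: relies on the old "timing" module and the print statement.

import timing

import kid

# Processing phase: build 100 small markup fragments as strings.
timing.start()
data = ['Number <em>%s</em>' % x for x in range(100)]
timing.finish()
process_time = timing.milli()

# XML(x) tells Kid to insert each string as markup rather than as escaped text.
source = """
<html xmlns:py="http://purl.org/kid/ns#">
<head>
</head>
<body>
<table>
<tr py:for="x in data">
<td>${XML(x)}</td>
</tr>
</table>
</body>
</html>
"""

# Rendering phase: build the template and serialize it to a string.
timing.start()
template = kid.Template(source=source, data=data)
content = template.serialize()
timing.finish()
render_time = timing.milli()

# Both times are in milliseconds.
print "Processing time: %d, Rendering time: %d" % (process_time, render_time)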

which results in (times in milliseconds): Processing time: 0, Rendering time: 1759

This performance is almost shockingly poor. The problem appears to be a side-effect of guaranteeing that the template is well-formed XML: when you remove the XML(...) call from the template and just display x, the rendering time drops to 129 milliseconds.
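
As a rough way to look at per-fragment parsing in isolation, the sketch below uses only the Python 3 standard library, not Kid. It assumes that the costly part of XML(x) is parsing each string fragment, which is an assumption rather than something confirmed against Kid's source. The function names are made up for illustration, and the timings will vary from machine to machine.

```python
import time
import xml.etree.ElementTree as ET

# The same 100 string fragments as in the test above.
fragments = ["Number <em>%s</em>" % x for x in range(100)]


def cells_by_parsing(fragments):
    """Parse every string fragment into an element (one parser run per item)."""
    return [ET.fromstring("<td>%s</td>" % f) for f in fragments]


def cells_by_building(numbers):
    """Build equivalent elements directly, with no parsing step."""
    cells = []
    for n in numbers:
        td = ET.Element("td")
        td.text = "Number "
        em = ET.SubElement(td, "em")
        em.text = str(n)
        cells.append(td)
    return cells


def elapsed_ms(func, arg):
    """Return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = func(arg)
    return result, (time.perf_counter() - start) * 1000.0


parsed, parse_ms = elapsed_ms(cells_by_parsing, fragments)
built, build_ms = elapsed_ms(cells_by_building, range(100))

# Both approaches should serialize to the same markup.
same = [ET.tostring(c) for c in parsed] == [ET.tostring(c) for c in built]
print("parsed: %.2f ms, built: %.2f ms, identical output: %s"
      % (parse_ms, build_ms, same))
```

Both functions produce the same elements; the only difference is whether a parser runs for every fragment. How closely this mirrors what Kid does internally would need to be checked against Kid itself.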

There has recently been some talk on the Kid mailing list calling for an option to disable the “well-formed XML” check when embedding XML into a template. Hopefully for DrProject, this change gets pushed into Kid very soon. In the meantime, if you have experienced similar performance issues with Kid and have found a workaround, please email me.

6 Responses to “DrProject: Switching to Kid”

  1. That’s not really a good example of Kid usage: using the XML() function should be an exception.

    Ideally, any XML data you put into the template data shouldn’t be strings that need to be parsed again by Kid to be rendered, but rather ElementTree data structures, which would get rendered directly.

    For DrProject this means that the wiki formatting would need to be adapted to return an ElementTree instead of a big string. I had hoped this would be done as part of the wiki formatter rewrite, but unfortunately that didn’t happen. And the few other places that generate markup as strings would of course also need to be changed.

    In any case, generating XML with a proper API is the right thing to do here, and the current practice of just assembling strings should be considered a left-over from “the old times”.

    By Christopher Lenz on Feb 20, 2006

  2. Fair enough, but this is still going to be a problem when we use third-party tools to generate HTML (e.g., enscript or SilverCity for the syntax highlighting).

    In the meantime, I’m not sure what we can do. Most of the embedded HTML comes from the WikiFormatter, which, as you mentioned, would need a rewrite to conform to the API.

    By Sean Dawson on Feb 20, 2006

  3. Don’t know whether you care (as it is beta), but the site does not render correctly in Internet Explorer 7 (beta2 5296). The menu is cut off.

    img363.imageshack.us/img363/9011/ie72nf.png

    By Przemek on Feb 20, 2006

  4. This is probably a place where DOM/XML-based page generation comes into its own. If the fragments you are using or generating are already XML, then the syntactic checks should be optimal (only when loading fragments from text files) and can be cached (cache the generated XML tree).

    By Tim Parkin on Feb 21, 2006

  5. I am glad to hear that you are now using Kid. Some of those gotchas are interesting, and when I have a little extra time I’ll take a closer look. Have you considered maintaining that information on Kid’s official wiki? That way many more people would be able to take advantage of your experience.

    Speed right now is my number one priority. There is a new branch for speed improvements that I created for the PyCon sprints. There are several changes that I have yet to commit, which I think will yield some improvements.

    By David Stanek on Mar 16, 2006

  6. Hi David,

    Thanks for the suggestion! I will definitely add that info to the Kid page when I get a moment. Great work on Kid, we’re looking forward to the speed improvements ;)

    By Sean Dawson on Mar 27, 2006
