Wednesday, March 25, 2009

Django Workshop

When I originally signed up for this workshop, I was envisioning maybe doing some Django for a research client (as a part of a team).

However, FOSS is still somewhat suspect in conservative environments laden with proprietary applications, or maybe that wasn't the problem? Anyway, investing in my future is what I hope I'm doing here. So let's hear what the "real world" is really like then: this is Django in the Real World, by James Bennett and Jacob Kaplan-Moss.

James is Django's release manager and has written Practical Django Projects. He works at the Lawrence Journal-World in Kansas, where Django was invented. Jacob is one of the two Django BDFLs, and is the author of The Definitive Guide to Django, which I've studied on O'Reilly's Safari.

This is about stuff that scares me: tools needed to serve web pages in moderate to high demand environments, the kind of stuff Google App Engine is supposed to handle for us (for a fee). The room is packed, pretty much every seat taken.

"Do one thing, and do it well" -- applications should encapsulate, keeping a tight focus, expressible in two short sentences e.g. "handle storage of users and authentication of their identities." The opposite of coding the equivalent of a run-on sentence.

Good Django applications are small, as they're designed to work together (have lots of INSTALLED_APPS). Extend carefully. Part of this is semantic ("project" versus "application"), but mostly it's advice against the monolithic mindset (you're not writing plug-ins, you're riding herd).

Specify default forms and template names (but the templates themselves should be stubs, or even missing in the distributed version i.e. specifics of look and feel don't really port, so why bother).

Likewise: redirect with a default URL, but let people specify their own (e.g. success.html).
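A rough sketch of that pattern, in plain Python rather than real Django: the view takes its template name and redirect target as keyword arguments with defaults, so a site can override either one without editing the app. The function name, parameter names, and paths below are all made up for illustration.

```python
# Framework-free stand-in for a reusable view; every name here is hypothetical.
def register(form_data, template_name="registration/register.html",
             success_url="/accounts/register/complete/"):
    """Return a (kind, target) pair describing what the view would do."""
    if form_data.get("username"):
        # Valid submission: send the user on to the (overridable) success URL.
        return ("redirect", success_url)
    # Otherwise render the (overridable) template again.
    return ("render", template_name)


# The defaults work out of the box...
default_result = register({"username": "kirby"})
# ...but a site can plug in its own destination without touching the app.
custom_result = register({"username": "kirby"}, success_url="/welcome/")
```

The point is only that the defaults live in the signature, not buried in the function body.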

Use reverse lookups (hadn't seen that before). We're getting pretty esoteric here (the presenters admit).
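As I understand it, a reverse lookup means asking for a URL by name instead of hardcoding the path. Django ships its own reverse function for this; the one below is only a toy stand-in, written in plain Python, and the pattern names and paths are invented for the example.

```python
# Toy URL registry; Django's real resolver is far more capable than this.
URL_PATTERNS = {
    "entry-detail": "/blog/{year}/{slug}/",
    "entry-archive": "/blog/",
}


def reverse(name, **kwargs):
    """Build a URL from a pattern name instead of hardcoding the path."""
    return URL_PATTERNS[name].format(**kwargs)


# If the site later moves the blog, only URL_PATTERNS has to change.
detail_url = reverse("entry-detail", year=2009, slug="django-workshop")
archive_url = reverse("entry-archive")
```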

Use introspection on models. Learn to love managers (these let you encapsulate patterns of behavior behind a nice API). Encourage subclassing and use of subclasses. Good extensible APIs are the name of the game.

Build to distribute, even if you don't plan to, ever. Tight coupling, like hardcoding the site name everywhere, kills flexibility. A settings file and URL configuration file should be enough. Ellington is the LJW CMS. I should check into it.

This "workshop" is encouraging in the sense that we're not doing a lot with running source code. I plan to have some for mine, downloadable for students to play with, but will they have VPython installed? Maybe not. I'm planning to write some more last minute examples this evening.

If you pick trunk, update frequently (meaning be clear about which version you're using). APIs aren't frozen in trunk, new stuff might change (e.g. admin "bulk action").

"ORM is the Vietnam of computer science" -- I'd never heard that one before (we're talking about "model inheritance" which is only a year old and people are still figuring out the design patterns -- letting other people be the canaries in the coal mine is Jacob's approach, at least for production work).

Jacob: Tests are the programmer's stone, transmuting fear into boredom (Kent Beck, author of Test Driven Development). Build in time to write tests. TDD = Test Driven Development. Jacob doesn't do it hardcore, gets annoyed with the dogma. Read Code Complete, published by Microsoft -- one of the best books ever. When you start down the testing road, you're making a contract with yourself to not check in code with broken tests.

In Python we use unittest at the lowest level; however, Django has its own django.test.TestCase, which features fixtures, a test client, email capture, and database management (i.e. it flushes and fills). This all comes at a speed cost. Don't run tests on production servers though. Doctests are also cool.
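For reference, the lowest level looks something like this with the standard library's unittest. The slugify helper is a made-up function to have something to test; it is not Django's, and nothing here touches a database.

```python
import unittest


def slugify(title):
    """Hypothetical helper: lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Django Workshop"), "django-workshop")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Real   World "), "real-world")


def run_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

A Django test case would presumably have the same shape, with the framework's TestCase class in place of unittest.TestCase.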

"Whitebox" versus "blackbox" testing: you know everything about "under the hood" versus treating the application like it's more closed, seeing it more as a user or outsider might (also called functional testing). The latter is sometimes called "BDD" i.e. "behavior driven development". TDD folks tend to look down on BDD folks but it's not either/or. Twill is a good one, plus Django has django.test.Client. Web browser testing is the ultimate BDD approach: Selenium, and the newer Windmill fall into this category. Check Python Testing Tools Taxonomy.
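To make the blackbox idea concrete, here is a small sketch using only the standard library: a tiny WSGI application and a helper that requests a path and looks at nothing but the status and body, the way an outsider would. The app and the get helper are hypothetical stand-ins; in a Django project the test client would fill the helper's role.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Tiny WSGI application standing in for a real site (hypothetical)."""
    if environ["PATH_INFO"] == "/":
        status, body = "200 OK", b"Hello from the front page"
    else:
        status, body = "404 Not Found", b"Not found"
    start_response(status, [("Content-Type", "text/plain")])
    return [body]


def get(path):
    """Request a path with no knowledge of the app's internals."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


home_status, home_body = get("/")
missing_status, _ = get("/nope/")
```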

After break: deployment. This slide always scares me (I keep seeing it)...

Django would fit where mod_perl is sitting. Use mod_wsgi instead (more predictable than mod_python). In the shared hosting world, Webfaction has mod_wsgi -- that's my provider for 4dSolutions.net. I've long thought I should go with Django on that site, but will our Oregon Curriculum Network ever have the energy budget?

Get your media server, database server, and application server onto separate machines. Use connection pooling to talk to the database, e.g. pgpool. Apache is OK as a media server, but maybe nginx or lighttpd. I'm confused by the diagrams, so I asked more about those media servers.

Monitoring: lots of tools.

Performance: the DB is the bottleneck, and most have "slow query" logs; otherwise I/O is a problem. Sometimes slowness is on the front end. Steve Souders's book is good, or check out YSlow. Caching is key. Use Django's? An external cache (Squid, Varnish)? You can direct non-logged-in users to a cache. DB replication is a next step, if caching doesn't do it.
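The "send anonymous users to a cache" idea can be sketched in a few lines of plain Python. This is not Django's cache framework, nor how Squid or Varnish work internally; the function names, the dictionary cache, and the 60 second lifetime are all assumptions made for the example.

```python
import time

CACHE = {}          # path -> (expiry timestamp, rendered page)
CACHE_SECONDS = 60  # arbitrary lifetime chosen for this example


def render_page(path, user=None):
    """Pretend to do the expensive database work behind a page."""
    render_page.calls += 1
    return "<h1>%s for %s</h1>" % (path, user or "guest")


render_page.calls = 0  # counts how often the "expensive" work really ran


def serve(path, user=None, now=None):
    """Serve logged-in users fresh pages and anonymous users cached ones."""
    now = time.time() if now is None else now
    if user is not None:
        return render_page(path, user)  # personalized, so never cached
    hit = CACHE.get(path)
    if hit and hit[0] > now:
        return hit[1]                   # still fresh: skip the database
    page = render_page(path)
    CACHE[path] = (now + CACHE_SECONDS, page)
    return page
```

Calling serve twice for the same path within a minute, with no user, should only do the expensive rendering once.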

So many tools have been mentioned.

We'd need a team of IS types to deal with these scaling issues. I get intimidated imagining trying to learn and implement all this stuff myself.

Fortunately, this is a large and growing memepool, with a growing number of skilled personnel (like Jacob -- performance issues are his bag).

If the newspaper subculture can do all this performance tuning (Django culture grew up there) then I imagine hospitals might too.

In any case, a research database with a small number of users is nothing like a world readable newspaper. Scenario: IS cuts its teeth on something tiny and intimate, gets bolder and more ambitious over time.

The commitment is more to using FOSS and an MVC approach to data serving, less to Django per se. FOSS tools are not only free, but best of breed in many cases.

I sat next to a coffee merchant, Brian Zambrano, who does his business web site in Django. He gave me a quick tour of how he uses a media server, and Yahoo's YUI. I should point him to my Coffee Shops Network blog.