Some days ago I used the wget tool at my Cygwin command line, a *nix in a *doze, on Joe Moore's site. He's spent half a lifetime compiling a bibliography & connecting ideas around Bucky (Bucky being a veritable switchboard of the 20th century), all in rough-and-ready wild west HTML (no rules, or almost none -- though consistently tabular, which is a big help).
So this morning I subclassed HTMLParser in Python and wrote a quick hack, aiming to filter out the HTML noise and republish as strict XML. From there, it'd be easier to head for mature formatting, via XSLT. We wouldn't lose touch with Joe's primary audience, either: souls on the web (so-called "eyeballs" in market research parlance). My Python code globbed through "books By"*.htm and spat back some reasonable XML for a minority of pages (those in a strict 3-column format: id, chapter title, page number). I pasted one success story to GEODESIC c/o SUNY at Buffalo. Here, lemme link to my code.
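For the curious, here's roughly the shape of the idea, as a minimal sketch rather than the actual script. It assumes Python 3's html.parser, and the names RowCollector, rows_to_xml, html_table_to_xml, plus the book/chapter element and id/page attribute names, are all made up for illustration. It gathers cell text row by row, tolerates missing closing tags, and keeps only rows with exactly three cells.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class RowCollector(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row.

    Hypothetical helper: tolerates unclosed <td>/<tr> tags by closing
    the open cell or row whenever a new one starts.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            # Join chunks and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag == "td" and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def rows_to_xml(rows):
    """Emit XML for rows of exactly three cells: id, title, page."""
    lines = ["<book>"]
    for row in rows:
        if len(row) != 3:
            continue  # skip anything not in the strict 3-column format
        ident, title, page = row
        lines.append(
            f"  <chapter id={quoteattr(ident)} page={quoteattr(page)}>"
            f"{escape(title)}</chapter>"
        )
    lines.append("</book>")
    return "\n".join(lines)


def html_table_to_xml(html):
    parser = RowCollector()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return rows_to_xml(parser.rows)
```

To run it over a directory, something like glob.glob("books By*.htm") would feed each file's text to html_table_to_xml; pages that don't fit the three-column mold just come back as an empty book element.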
That was this morning.
This afternoon, I left Dawn with some veterans/alumni of some enlightenment training (Portland has a lot of good schools), while I picked up Tara, who needs something faxed to her teacher (due homework, completed per spec, but left in the printer this morning), and then drove out to Marine Drive to welcome Don back from Guatemala. He spoke with great awe and respect of the Guatemalan experience/people (pueblo). I'll be rejoining him later for the kickoff of the 2005-2006 ISEPP lecture series, our latest in a series of presentations delivered by some Very Big Names (VBNs) -- sometimes with repeat visits (e.g. Roger Penrose is returning in March).
Portland has felt very privileged to be on the receiving end of Terry's circus. I've enjoyed so many of these lectures so very very much. You VBN people are simply amazing and wonderful creatures. Love!
But some of that hasn't finished happening yet. Gotta get back to work.
PS: Lotsa VBN-hoods out there, and you're welcome to be a Big Name in several of them. The Universe is not stingy about letting us be celebrities in cool circles we respect, if that's our goal (which doesn't mean individual humans can't be miserly). Join the club, beat your drum (metaphor), and sometimes hold it down (Quaker voice: ssshhh, keep it quiet), and you'll dimly hear many more drum beats, from far far away (something drums are good for).
Followup (Nov 4): my write-up of last night's lecture (link).