Wednesday, September 08, 2010

DjangoCon Day Two

Sometimes I think anthropology departments must be too full of fluff, as we don't seem to attract many field workers to these tribal events, complete with totems, self-government issues, and exogamous and endogamous challenges.

How to pass the torch? Who gets a commit bit?

The Django community only has 14 people with a commit bit, and really low bus numbers on some aspects of the code.

A "bus number" is how many people getting hit by buses would result in zero people remaining to maintain, enhance, or otherwise work on some feature, application or code. Take Django's object-relational mapper (ORM), for example: few among the inner 14 are conversant with it, ergo "low bus numbers".

James Bennett gave a great keynote yesterday on the need to open up the core to more committers. These sentiments were echoed in Eric Florenzano's keynote of this morning, which focused on weaknesses and downsides of Django and the community.

These kinds of rants are valuable and expected within any group that expects to thrive over the long haul, I should think. One needs to handle criticisms that emerge from deep within the ranks, perhaps with more alacrity than those more casual criticisms coming from outside, from those with less of an investment. Eric is not a core committer, but he does spend many of his waking hours wrangling Django.

Andrew Godwin's talk focused on the non-relational databases out there, of which there are many, and efforts to make Django work as a front end for some of them. Document stores may have some schema-like aspects, which means the concept of a "model" in Django may still be apropos.

Considering electronic medical records again, an underlying structure is the time-line, as each person's "life thread" (as Greek mythology viewed them) is a chronological sequence.

:: the three fates ::

Medical devices committing data to an EMR need to authenticate patient identity and then find the appropriate point on the time-line to place the data, perhaps with clinician annotations.
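
As a minimal sketch of that idea, assuming a hypothetical in-memory `PatientTimeline` (none of these names come from any real EMR system, and the identity check is only a stand-in for real authentication), a device commit might look like this in Python:

```python
import bisect
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(order=True)
class TimelineEntry:
    # Only the timestamp takes part in ordering, so entries sort chronologically.
    timestamp: datetime
    source: str = field(compare=False)      # hypothetical device label
    payload: dict = field(compare=False)    # the device's readings
    annotation: str = field(default="", compare=False)  # optional clinician note


class PatientTimeline:
    """Hypothetical per-patient "life thread", kept in chronological order."""

    def __init__(self, patient_id):
        self.patient_id = patient_id
        self.entries = []

    def commit(self, claimed_patient_id, entry):
        # Stand-in for real authentication: refuse data meant for someone else.
        if claimed_patient_id != self.patient_id:
            raise PermissionError("patient identity mismatch")
        # insort finds the right point on the time-line, even for late arrivals.
        bisect.insort(self.entries, entry)
```

Committing a reading dated yesterday after one dated today still leaves `entries` in chronological order, which is the property a viewer walking the time-line would rely on.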

Somehow, the URL to this data, being consulted at a later time, will trigger the right viewer to load, perhaps in its own window, thereby decoding the data and rendering a readily decipherable visualization, assuming a trained eye. Perhaps a cine will play; a short video clip of a beating heart, before and after angioplasty.

Hospitals currently store these cines on their own servers. To what extent will EMRs be independent of specific health care systems? Each person accumulates some gigabytes of medical data over the course of a lifetime.

Many bureaucrats have already thrown up their hands, saying the problem is too difficult, can't be done. Others say that, however it's done, their job will be to inter-connect the various implementations, not source any specific solution themselves.

However, given how under-served people are, in so many regions, opportunities for greenfield development are myriad. Doctors without borders aren't required to use HL7 or follow legislative guidelines specific to any one country.

Within a culture of open source, opportunities will develop and get shared. Closed-off silos continually reinvent the same wheels, whereas those who share more strongly encourage their peers to keep up to date.

Data migration services will evolve to transform one kind of EMR into another.

Yes, what I'm describing is the "messy scrap book" model at a high level (organized temporally), with specific fields being structured, as worked out among various specialists and vendors.

Might a SQL database serve as an outermost wrapper, perhaps with XML fields for less structured data, with patient identity and chronological sequence serving as primary relational keys?
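
One way to picture that wrapper, purely as a sketch: the table and column names below (`patient`, `timeline_entry`, `detail_xml`) are my own assumptions, and since SQLite has no native XML type the less structured data is simply stored as text.

```python
import sqlite3

# Hypothetical schema: patient identity plus timestamp form the relational key,
# while detail_xml carries whatever less structured data a specialist supplies.
SCHEMA = """
CREATE TABLE patient (
    patient_id TEXT PRIMARY KEY
);
CREATE TABLE timeline_entry (
    patient_id  TEXT NOT NULL REFERENCES patient(patient_id),
    recorded_at TEXT NOT NULL,   -- ISO 8601 text sorts chronologically
    kind        TEXT NOT NULL,   -- e.g. 'note', 'cine'
    detail_xml  TEXT,            -- XML stored as plain text
    PRIMARY KEY (patient_id, recorded_at, kind)
);
"""


def open_store():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_entry(conn, patient_id, recorded_at, kind, detail_xml):
    """Register the patient if needed, then place the entry on the time-line."""
    conn.execute(
        "INSERT OR IGNORE INTO patient (patient_id) VALUES (?)", (patient_id,)
    )
    conn.execute(
        "INSERT INTO timeline_entry VALUES (?, ?, ?, ?)",
        (patient_id, recorded_at, kind, detail_xml),
    )


def history(conn, patient_id):
    """Return one patient's entries in chronological order."""
    return conn.execute(
        "SELECT recorded_at, kind, detail_xml FROM timeline_entry "
        "WHERE patient_id = ? ORDER BY recorded_at",
        (patient_id,),
    ).fetchall()
```

Here `history(conn, "p1")` walks a single life thread in order, regardless of the order in which the entries arrived.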

I'm not looking for a single answer or solution.

Researchers will plow through EMRs seeking to generate clinical research records (CCRs), many of them scrubbed of any traceable identity information, yet still associated by case history. Having fragments of the genome decoded does not imply identifiability.

Additionally, those "donating their bodies to science" (a well-known idiom) might nowadays sign a release allowing unmasked identity information to become available to a wider circle of authorized personnel. Medical literature is already full of case histories of identifiable personae, mixed in with the more anonymous.

The data stays authentic; it is not some arbitrary mix. Fictional identities may be synthesized by algorithm to mask the actual ones, but in ways that support valid correlations and therefore conclusions. Age, sex, weight, and history of smoking should not be arbitrarily changed, as these are medically relevant data points.
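
A rough sketch of such masking, assuming hypothetical field names (`name`, `patient_id`) and a secret key held by whoever runs the anonymizer: a keyed hash gives each patient a stable fictional label, so records stay associated by case history, while the medically relevant fields pass through untouched. This illustrates the idea only; it is not a complete de-identification scheme.

```python
import hashlib
import hmac

# Hypothetical names of the fields that reveal who the patient is.
IDENTITY_FIELDS = ("name", "patient_id")


def pseudonymize(record, secret_key):
    """Return a copy of record with identity replaced by a stable pseudonym."""
    real_id = record["patient_id"].encode("utf-8")
    # Same patient and key always yield the same digest, preserving linkage.
    digest = hmac.new(secret_key, real_id, hashlib.sha256).hexdigest()
    # Age, sex, weight, smoking history and the like are copied unchanged.
    masked = {k: v for k, v in record.items() if k not in IDENTITY_FIELDS}
    masked["pseudonym"] = "case-" + digest[:12]
    return masked
```

Two records from the same patient come out with the same `pseudonym`, so a researcher can still follow one case through time without ever seeing the name behind it.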

Open source data sets, freely downloadable, give researchers shared access to common pools.

Intelligently designed viewers and data harvesters with built-in anonymizers will facilitate extracting patient data minus patient identity. Data sets packed with thousands of true case histories will be gold to researchers, all the more so because loss or theft of this data will not represent a security breach in terms of patient confidentiality.