Tuesday, September 27, 2011

Architecture should be as beautiful as Chinese poetry

I very much enjoy Chinese poetry, especially Song Ci (宋詞). The term ci simply means "word". Ci poetry is in fact written song. Most of the poems do not even have a distinct title but are named after an original melody. Composers and writers used this melody to write a new poem that could be sung to the original famous melody or tune pattern (cipai 詞牌), a technique called contrafactum. This is why we often see the same title on different ci poems, such as Dielianhua 蝶戀花 "Butterflies love blossoms", Mantingfang 滿庭芳 "Scent fills the hall", or Yu meiren 虞美人 "Lady Yu".

Below is one of the most famous Song Ci


One of the main reasons I enjoy this type of poetry is that it is very concise yet leaves so much to the imagination, which is extremely hard to capture in words. In fact, too much description will only destroy the reader's ability to imagine.

For example, in the poem above, "乱石穿空,惊涛拍岸,卷起千堆雪" uses only a few characters, yet it powerfully describes the extremely dangerous Red Cliff site, where the famous battle was fought between the Wei Kingdom (魏国) and the Wu Kingdom (吴国) during the Three Kingdoms era. I am sure you will have this kind of mental image projected in your mind.

In fact, this picture does not really convey the meaning of "乱石穿空", nor does it convey the speed and power of the waves hitting the cliff. In the Chinese version, this is concisely described by one character - 惊 (scared): we can imagine how hard the waves hit the cliff, since they are described as fleeing towards the cliff at full speed. I am sure the mental image is now much more vivid and powerful.

So how is this related to architecture? In my view, there are two takeaways:

1) Architecture should be as concise as possible. This makes it quick and easy to understand for its users (the development team and the maintenance/operations team), and keeps the architecture efficient, without superfluous artifacts hanging around that make the design complicated for the sake of complexity.

2) Architecture should be extensible internally for future ongoing enhancement.

The second objective is harder to achieve if the architecture has a lot of superfluous artifacts. It is just like poetry: describing with too many unnecessary words not only destroys the beauty of the poem but also inhibits the reader's imagination (future extension).

In fact, this principle also applies to the information model: try not to develop a superset model; instead, start small and then expand (see another post about the fallacy of the superset model). One of the architects who worked on the UK NHS described this "start small and expand" approach with a fancier term - developing a just-in-time model.

Every few months, look at the architecture again. If you still enjoy it and discover new potential, e.g. you are able to plug in new capabilities without impacting the external-facing services, then you have achieved the two objectives above.
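One way to picture "plugging in new capabilities without impacting the external-facing services" is a small registry hidden behind a stable facade. This is only a sketch: the RecordService class, its method names and the "lab"/"imaging" kinds are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict


class RecordService:
    """Hypothetical external-facing service; its public method stays the same."""

    def __init__(self) -> None:
        # Internal extension point: capabilities are looked up by kind.
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def register(self, kind: str, handler: Callable[[dict], dict]) -> None:
        """Plug in a new capability without touching process()."""
        self._handlers[kind] = handler

    def process(self, kind: str, payload: dict) -> dict:
        """The only entry point that external callers depend on."""
        handler = self._handlers.get(kind)
        if handler is None:
            raise ValueError(f"unsupported kind: {kind}")
        return handler(payload)


service = RecordService()
service.register("lab", lambda p: {**p, "handled_by": "lab"})
before = service.process("lab", {"id": 1})

# Later enhancement: a new capability is added internally.
service.register("imaging", lambda p: {**p, "handled_by": "imaging"})
after = service.process("lab", {"id": 1})

# Existing callers see exactly the same behaviour as before.
print(before == after)
```

The design choice here is simply that new behaviour is added by registration rather than by changing the signature that outside systems call.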

That's beauty and an art.

Saturday, September 24, 2011

How useful is 13606 RM record component id

I was asked why rc_id in ISO13606 is mandatory. Below is a summary of my answer, for sharing.

Note: the following paragraphs are excerpts from the ISO13606-1 2008 version of the spec.

ISO 13606 is not intended to specify the internal architecture or database design of EHR systems or components. Nor is it intended to prescribe the kinds of clinical application that might require or contribute EHR data in particular settings, domains or specialities. For this reason, the information model proposed here is called the EHR Extract, and might be used to define a message, an XML document or schema, or an object interface.

ISO 13606 may offer a practical and useful contribution to the design of EHR systems but will primarily be realised as a common set of external interfaces or messages built on otherwise heterogeneous clinical systems.

Now let's look at the RM (reference model). As shown below, the RECORD_COMPONENT class is the top-level class for most of the 13606 RM classes - COMPOSITION, SECTION, ENTRY, ITEM and ELEMENT. The ELEMENT class is the object that holds the actual data. (This is an important concept; we will refer to it later.)

The "rc_id" attribute in the RECORD_COMPONENT class is mandatory. Now let's look at the definition of this attribute in the ISO13606 spec:

rc_id: The globally-unique identifier by which this node in the EHR hierarchy is referenced in the EHR system to which the data were first committed. This identifier shall be retained by the EHR Recipient and re-used whenever this RECORD_COMPONENT is subsequently included in another EHR_EXTRACT.

Below was my reply.

First, is it meaningful or useful for every record in a healthcare system to have a globally unique identifier in data exchange? As mentioned above, RECORD_COMPONENT is the top-level class, so if rc_id is mandatory, every subclass instance needs to supply a value for rc_id.

To use a concrete example, a lipid panel lab test has the following test items:

Since in the ISO13606 model the actual data is held in the ELEMENT class, the above test result needs at least 4 ELEMENT object instances and 1 CLUSTER instance for the panel itself.

CLUSTER -- Lipid panel with direct LDL in Serum or Plasma
--> ELEMENT : Cholesterol [Mass/volume] in Serum or Plasma
--> ELEMENT : Triglyceride [Mass/volume] in Serum or Plasma
--> ELEMENT : Cholesterol in HDL [Mass/volume] in Serum or Plasma
--> ELEMENT : Cholesterol in LDL [Mass/volume] in Serum or Plasma by Direct assay

So for this particular report, the system needs to generate 5 GUIDs just to be compliant with the Reference Model. But how useful is that for data exchange? Does the receiving system need to know the GUID of each test item in the report? Is there any value in it for the receiving system? Not at all.
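The count can be illustrated with a minimal Python sketch. The RecordComponent, Element and Cluster classes below are hypothetical simplifications, not the ISO 13606 class definitions; they only model a mandatory rc_id inherited by every node, filled here with a random UUID as an assumed way of producing a GUID.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordComponent:
    """Hypothetical stand-in for RECORD_COMPONENT: just a name and an rc_id."""
    name: str
    # rc_id is mandatory in the RM, so every instance must be given one.
    rc_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Element(RecordComponent):
    """Leaf node that would hold the actual result value (omitted here)."""


@dataclass
class Cluster(RecordComponent):
    """Groups the ELEMENTs that belong to one panel."""
    parts: List[Element] = field(default_factory=list)


def collect_rc_ids(cluster: Cluster) -> List[str]:
    """Return the rc_id of the cluster followed by those of its elements."""
    return [cluster.rc_id] + [element.rc_id for element in cluster.parts]


panel = Cluster(
    name="Lipid panel with direct LDL in Serum or Plasma",
    parts=[
        Element(name="Cholesterol [Mass/volume] in Serum or Plasma"),
        Element(name="Triglyceride [Mass/volume] in Serum or Plasma"),
        Element(name="Cholesterol in HDL [Mass/volume] in Serum or Plasma"),
        Element(name="Cholesterol in LDL [Mass/volume] in Serum or Plasma by Direct assay"),
    ],
)

rc_ids = collect_rc_ids(panel)
print(len(rc_ids))  # 5 identifiers for a single lipid panel report
```

Every additional test item in a report adds one more identifier that the sender must generate and the receiver must carry.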

Second, if rc_id does not serve any purpose for data exchange, does it serve any purpose within the system? Yes, it is possibly required within the system itself, probably as a primary key that uniquely identifies each record in the database tables and their relationships. At the relational data model level, each table definitely needs a primary key; however, this kind of information should not surface at the EHR Extract reference model level. The abstract information model should not be mixed up with the actual data model of the database, otherwise neither model will be able to perform its intended function properly. It also contradicts the stated objective of ISO13606 - "ISO 13606 is not intended to specify the internal architecture or database design of EHR systems or components".

So my conclusion is that it is wrong for the ISO13606 RM to specify rc_id as mandatory.

However, the user was not satisfied and asked a very interesting question: why not let the sending system put whatever pseudo GUID it likes in rc_id to be compliant with the reference model, and let the receiving system simply ignore that value and generate its own internally unique GUID for the rc_id attribute?

That's very, very interesting - and it brings us back to a fundamental of any model design: if something does not serve any purpose, why model it in the first place?
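The exchange the user proposed can be sketched in a few lines of Python. The send_extract and receive_extract functions, the dictionary layout and the result values are all made up for illustration; they are not taken from ISO 13606 or from any real system.

```python
import uuid


def send_extract(results):
    """Hypothetical sender: attaches a throwaway rc_id to satisfy the RM."""
    return [
        {"rc_id": str(uuid.uuid4()), "name": name, "value": value}
        for name, value in results
    ]


def receive_extract(extract):
    """Hypothetical receiver: ignores the incoming rc_id and assigns its own."""
    return [
        {"rc_id": str(uuid.uuid4()), "name": item["name"], "value": item["value"]}
        for item in extract
    ]


def clinical_content(records):
    """Everything in the records except rc_id."""
    return [(record["name"], record["value"]) for record in records]


# Made-up values, for illustration only.
results = [("Cholesterol", 190), ("Triglyceride", 120)]

sent = send_extract(results)
stored = receive_extract(sent)

# The clinical content survives the exchange unchanged...
print(clinical_content(sent) == clinical_content(stored))
# ...while none of the sender's rc_id values is kept by the receiver.
print({r["rc_id"] for r in sent}.isdisjoint({r["rc_id"] for r in stored}))
```

In this sketch the sender's rc_id never influences anything the receiver stores, which is the sense in which the attribute serves no purpose in the exchange.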