Sunday, July 27, 2008

Data Styles and Token Splitting

This week, my mentor Justin, Scott Rosenbaum (a member of the BIRT PMC), and I had a web meeting to show and review the current functionality of the ODA. We got some really good feedback. Some of the main points:
  • The default data set should just show the most recent values for a selected token. This keeps the initial result simple, and the user can switch to a different data style from there if desired.
  • The ODA should support parameters. For instance, the user should be able to provide a parameter as a value in the modifier page. This will most likely be a project for after GSOC.
  • The more data the better. The default behavior should be to split the tokens by all four of the split values we have chosen to initially support.
  • A tree view to select the tokens would be nice. The branches would be the token tags and the leaves under the branches would be the appropriate tokens. This will also more than likely be a task for after GSOC.

As for coding, I've added quite a bit of new functionality. The additions fall into two categories: data styles and token splitting.

Data Styles

There has been a lot of discussion about how to display the data to the user. The patient ID will always be the first column, but how the other columns are organized can vary. Rather than try to come up with the perfect data set, I've allowed the user to toggle between three different styles:

  1. Most recent. This is the default selection. There is a column displayed for every token/split combination that is selected. There is one row per patient displaying the most recent value for each token selection.
  2. Stacked. This is the EAV (entity-attribute-value) style of data, with KEY and VALUE columns plus the appropriate splitter columns. There can be multiple rows per patient/token if more than one value exists for a patient/token combination.
  3. Flat. This style will have the most columns and one row per patient. It provides more information than the "Most recent" style by returning more than just the most recent value. Right now, it's hard-coded to return 5 values per token. Each of these 5 values can be split as well, which produces even more columns. Eventually, when the Logic Service supports FIRST x and LAST x, the user will be able to choose this number instead of the hard-coded 5.
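
To make the difference between the three styles concrete, here is a rough sketch in Python. It is not the ODA code (which lives inside BIRT); the observation tuples, token names, and column names such as WEIGHT_1 are all made up for illustration, and splitter columns are left out to keep it short.

```python
from collections import defaultdict

# Hypothetical observations: (patient_id, token, value, obs_date).
# ISO date strings sort correctly as plain text.
OBSERVATIONS = [
    (1, "WEIGHT", 70.0, "2008-07-01"),
    (1, "WEIGHT", 72.5, "2008-07-20"),
    (1, "CD4", 350, "2008-07-10"),
    (2, "WEIGHT", 65.0, "2008-06-15"),
]


def _by_patient_token(observations):
    """Group values by (patient, token), newest value first."""
    grouped = defaultdict(list)
    newest_first = sorted(observations, key=lambda o: o[3], reverse=True)
    for patient_id, token, value, _obs_date in newest_first:
        grouped[(patient_id, token)].append(value)
    return grouped


def most_recent(observations, tokens):
    """One row per patient, one column per token, newest value only."""
    grouped = _by_patient_token(observations)
    rows = []
    for patient_id in sorted({o[0] for o in observations}):
        row = {"PATIENT_ID": patient_id}
        for token in tokens:
            values = grouped.get((patient_id, token), [])
            row[token] = values[0] if values else None
        rows.append(row)
    return rows


def stacked(observations, tokens):
    """EAV style: one row per value, so patients can repeat."""
    ordered = sorted(observations, key=lambda o: (o[0], o[1], o[3]))
    return [
        {"PATIENT_ID": patient_id, "KEY": token, "VALUE": value}
        for patient_id, token, value, _obs_date in ordered
        if token in tokens
    ]


def flat(observations, tokens, values_per_token=5):
    """One row per patient, a fixed number of columns per token."""
    grouped = _by_patient_token(observations)
    rows = []
    for patient_id in sorted({o[0] for o in observations}):
        row = {"PATIENT_ID": patient_id}
        for token in tokens:
            values = grouped.get((patient_id, token), [])
            for i in range(values_per_token):
                # Pad with None when a patient has fewer values.
                row[f"{token}_{i + 1}"] = values[i] if i < len(values) else None
        rows.append(row)
    return rows
```

Calling each function with the same tokens, for example `most_recent(OBSERVATIONS, ["WEIGHT", "CD4"])`, shows the trade-off: "Most recent" and "Flat" always give one row per patient, while "Stacked" gives one row per stored value.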

Token Splitting

Token splitting allows the user to get more data from a selected token than just the value of that token. The following are the four additional "splits" we are initially supporting:

  1. Observation Date
  2. Observation Location
  3. Encounter Date
  4. Encounter Type

I have added a new page to the ODA that allows the user to select which splitters to use for each token (the default is to include all of the splitters). The interface is basically a grid of check boxes where the splitters make up the columns and the selected tokens make up the rows. This page dynamically builds itself based on the tokens added or removed over time. Here's an example of what it looks like:


Splitting the tokens is supported for all three data styles mentioned above under "Data Styles".
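
As a rough illustration of how the check box grid could translate into columns for the "Most recent" style, here is a small Python sketch. The function names and the "TOKEN (Splitter)" column naming are my own assumptions for the example, not what the ODA actually generates.

```python
# The four splitters listed above, in grid column order.
SPLITTERS = [
    "Observation Date",
    "Observation Location",
    "Encounter Date",
    "Encounter Type",
]


def default_selection(tokens):
    """Build the grid with every splitter checked for every token."""
    return {token: {splitter: True for splitter in SPLITTERS} for token in tokens}


def columns_for(selection):
    """List the columns: patient ID, then each token and its checked splitters."""
    columns = ["PATIENT_ID"]
    for token, checks in selection.items():
        columns.append(token)
        for splitter in SPLITTERS:
            if checks.get(splitter):
                # Hypothetical naming scheme for a split column.
                columns.append(f"{token} ({splitter})")
    return columns
```

With two tokens and the default selection this yields 11 columns (the patient ID plus five per token); unchecking one box in the grid removes exactly one column.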