Sunday, July 27, 2008

Data Styles and Token Splitting

This week, my mentor Justin, Scott Rosenbaum (a member of the BIRT PMC), and I had a web meeting to demo and review the current functionality of the ODA. We got some really good feedback. Some of the main points:
  • The default data set should just show the most recent values for each selected token. This simplifies how the data is first returned, and the data style can be changed from there if desired.
  • The ODA should support parameters. For instance, the user should be able to provide a parameter as a value in the modifier page. This will most likely be a project for after GSOC.
  • The more data the better. The default behavior should be to split the tokens by all four of the split values we have chosen to initially support.
  • A tree view to select the tokens would be nice. The branches would be the token tags and the leaves under the branches would be the appropriate tokens. This will also more than likely be a task for after GSOC.

As far as coding, I've added quite a bit of new functionality. The two basic additions can be categorized under data styles and token splitting.

Data Styles

There has been a lot of discussion about how to display the data to the user. The first column will always be the patient ID, but how the other columns are organized can vary. Rather than try to come up with the perfect data set, I've allowed the user to toggle between three different styles:

  1. Most recent. This is the default selection. There is a column displayed for every token/split combination that is selected. There is one row per patient displaying the most recent value for each token selection.
  2. Stacked. This is the EAV style of data, with a KEY column, a VALUE column, and the appropriate splitter columns. There can be multiple rows per patient/token if more than one value exists for that patient/token combination.
  3. Flat. This style has the most columns and one row per patient. It provides more information than the "Most recent" style by returning more than just the most recent value. Right now, it's hard coded to return 5 values per token, and each of these 5 values can be split, producing even more columns. Eventually, when the Logic Service supports FIRST x and LAST x, the user will be able to choose this number instead of the hard coded 5.
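
To make the difference between the three styles concrete, here is roughly how the result set is shaped for a single selected token (say WEIGHT) split by observation date. The column names are just illustrative, not the exact headers the ODA generates:

    Most recent:  PATIENT_ID | WEIGHT | WEIGHT_OBS_DATE                                   (one row per patient)
    Stacked:      PATIENT_ID | KEY | VALUE | OBS_DATE                                     (one or more rows per patient)
    Flat:         PATIENT_ID | WEIGHT_1 | WEIGHT_1_OBS_DATE | ... | WEIGHT_5 | WEIGHT_5_OBS_DATE   (one row per patient)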

Token Splitting

Token splitting allows the user to get more data from a selected token than just its value. The following are the four additional "splits" we are initially supporting:

  1. Observation Date
  2. Observation Location
  3. Encounter Date
  4. Encounter Type

I have added a new page to the ODA that allows the user to select which splitters to use for each token (the default is to include all of the splitters). The interface is basically a grid of check boxes where the splitters make up the columns and the selected tokens make up the rows. The page dynamically rebuilds itself as tokens are added or removed over time. Here's an example of what it looks like:


Splitting the tokens is supported for all three data styles mentioned above under "Data Styles".
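
For anyone curious about the mechanics, a dynamic check box grid like this is fairly straightforward to put together in SWT. The following is only a rough, hypothetical sketch, not the actual ODA page code (the class, method, and label names are made up):

    import java.util.List;

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.layout.GridLayout;
    import org.eclipse.swt.widgets.Button;
    import org.eclipse.swt.widgets.Composite;
    import org.eclipse.swt.widgets.Label;

    // Hypothetical sketch of the splitter-selection grid: one column per splitter,
    // one row per selected token, every box checked by default.
    public class SplitterGridSketch {

        private static final String[] SPLITTERS =
                { "Obs Date", "Obs Location", "Encounter Date", "Encounter Type" };

        public void buildGrid(Composite parent, List<String> selectedTokens) {
            Composite grid = new Composite(parent, SWT.NONE);
            grid.setLayout(new GridLayout(SPLITTERS.length + 1, false));

            // Header row: a token column plus one label per splitter.
            new Label(grid, SWT.NONE).setText("Token");
            for (String splitter : SPLITTERS) {
                new Label(grid, SWT.NONE).setText(splitter);
            }

            // One row per token currently selected on the first wizard page.
            for (String token : selectedTokens) {
                new Label(grid, SWT.NONE).setText(token);
                for (String splitter : SPLITTERS) {
                    Button check = new Button(grid, SWT.CHECK);
                    check.setSelection(true); // default: include every splitter
                    check.setData(token + ":" + splitter);
                }
            }
        }
    }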

Sunday, July 20, 2008

Lots of Changes to ODA and Logic Web Service

Wow, there was a lot going on this week with the project. There was a lot of discussion about the Logic Service with everyone (especially Burke and Tammy). Tammy cleared up a lot of questions I had about the Logic Service, and Burke created the beginning of a LogicCriteria parser, which I was able to integrate with the ODA. Allow me to summarize all the changes and enhancements made to both the ODA and the Logic Web Service:

BIRT ODA

  • Removed the filter page and reintegrated the filter drop-down into the first page, so the user selects a filter first and then chooses tokens using the tag and search feature.
  • Changed the way the data set wizard pages are presented to the user. Instead of having to go through all of the pages when initially creating a data set, only the token selection page (the first page) is shown. The user can access the more advanced pages once the edit data set dialog comes up, or later by reopening and editing an existing data set.
  • Added a helper class for easily tearing down and building back up queries, extracting certain pieces of the query, etc.
  • Changed all data set pages to generate queries in the format SELECT {token} optionalModifier x {token2} optionalModifier y ... FROM cohortID.
  • Added a new data set page that lets the user see the actual query that will be sent to the Logic Web Service as they change their selections on the various data set wizard pages. As an example, I created a query by selecting one of my cohorts and several tokens, indicating that I wanted WEIGHT values that were less than 50 and TEMPERATURE values that were greater than 40.
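
In that format, a query like the one just described comes out looking roughly as follows (the cohort ID here is made up for illustration):

    SELECT {WEIGHT} < 50 {TEMPERATURE} > 40 FROM 2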

Logic Web Service
  • Added latest jars from logic refactoring branch.
  • Changed the data resource to accept the new query format: SELECT {token} optionalModifier x {token2} optionalModifier y ... FROM cohortID.
  • Changed the call that populates the filter list to a single Context.getCohortService().getAllCohortDefinitions() call.
  • Used the new LogicCriteria.parse() method that Burke put together this week: each token and its modifiers are passed to the parser, the resulting LogicCriteria is handed to the logic service for evaluation, and the results are passed back to the ODA (a rough sketch of this flow follows the list).
  • Added a helper class for getting information out of the URL request.
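
Here is the rough sketch promised above of how the parse-and-evaluate step hangs together. It is a minimal illustration only, assuming LogicCriteria.parse() accepts the token-plus-modifier string from the query and that LogicService.eval(Cohort, LogicCriteria) returns one Result per patient; the exact signatures and package names may differ from what is in the refactoring branch:

    import java.util.Map;

    import org.openmrs.Cohort;
    import org.openmrs.api.context.Context;
    import org.openmrs.logic.LogicCriteria;
    import org.openmrs.logic.result.Result;

    // Minimal sketch: evaluate one "token + modifiers" chunk of the query for a cohort.
    public class TokenEvaluationSketch {

        public Map<Integer, Result> evaluate(Cohort cohort, String tokenWithModifiers) {
            // e.g. tokenWithModifiers = "{WEIGHT} < 50" (illustrative; the exact string
            // the parser expects depends on Burke's grammar)
            LogicCriteria criteria = LogicCriteria.parse(tokenWithModifiers);

            // One Result per patient id in the cohort; these get marshalled back to the ODA.
            return Context.getLogicService().eval(cohort, criteria);
        }
    }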

The next step is adding a page, plus modifications to the query format, so the user can split the tokens into more than just the value (date, location, and so on). I'm going to use colons in the query after the token and its modifiers to specify how to split each individual token.
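
Purely as an illustration of the idea (the exact syntax isn't settled yet), a single token entry might end up looking something like:

    {WEIGHT} < 50 : date : location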

Sunday, July 13, 2008

Midterm Review and Update

Thursday (7/10/08), I joined the developers' call to review the current status of my project. I got a lot of really good feedback from everyone and a new sense of direction for the project. The following is a basic list of some of the major points from the call and the follow-up mailing list discussions:
  • Keep it simple and introduce complexity later if time allows. It's better to have a simple solution that actually works than a really feature rich solution that doesn't do anything.
  • As we add more functionality to the ODA, just creating an initial data set is a lot to throw at the user. When a user initially creates the data set, we just want to provide the first page that allows for token selection and the rest of the pages should not be seen. Then, the user should be able to go to the other pages via the edit data set interface to further refine the query.
  • The Modifier page is going to be redone. The top piece will basically allow a user to add an aggregate to the beginning (just FIRST and LAST for now). LAST will be the default. Then, the user will use the bottom half of the page to choose conditions to add to the query if desired.
  • A new query format is needed to support all of these new additions. We're moving toward a more SQL-looking query where the SELECT chooses the tokens and the FROM is the cohort. For each token in the SELECT, there is first an aggregate (LAST), then the token name in curly brackets, followed by an optional condition. After the "aggregate {token} condition", pipe delimiters indicate how to split the token (date, location, etc.). Finally, the desired cohort goes in the FROM clause (see the example just after this list). More will probably be required later, but this is the simple format for now.
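
As an example of that format for a single token (everything here is illustrative, including the cohort ID and the split names):

    SELECT LAST {WEIGHT} < 50 | date | location FROM 2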

A lot of other details about adding aggregates and modifiers to tokens were discussed. Check out the latest mock-up of the interface. The modifier page (2) still needs more work to provide a better way to indicate how the conditions are applied.

One of the great breakthroughs this week was moving away from the Mock Logic Web Service to the "real" Logic Web Service. The main problem was that some of the concepts in my sample data set were missing names (more details in the bug report). There is also another problem where the dynamic cohorts don't have an identifiable name, so I've removed those types of cohorts from the Logic Web Service for now and just the static cohorts are available. Anyway, it's great to be using the ODA and getting back real lists of tokens and actual data instead of the hard coded values I was working with.

As far as coding this week, I added support on both the Logic Web Service and BIRT ODA sides to add four columns to every token that is chosen:
  1. Observation Time
  2. Observation Location
  3. Encounter Time
  4. Encounter Type
So far the user has no choice: all four of these columns are added no matter what is chosen through the interface (this will be worked on when the split selection page is added to the data set wizard). There's a major problem, though, so it doesn't work quite like it should. Casting the Result object from the data query to an Obs object does not work, so I can't actually get any of the times, the location, or the type. Right now it just returns the Result's date, and the rest of the values are null. There's also a problem getting datetime data types to work (I had to use Text for now).
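
For reference, here is roughly what those four columns are meant to pull out once the Obs behind a Result is actually reachable. This is only a sketch of the intent; getting from the Result to its Obs is exactly the part that is broken right now:

    import java.util.Date;

    import org.openmrs.Encounter;
    import org.openmrs.EncounterType;
    import org.openmrs.Location;
    import org.openmrs.Obs;

    // Sketch of the intended split-column extraction, given the Obs that produced a result.
    public class SplitColumnSketch {

        public Object[] extractSplits(Obs obs) {
            Date obsDatetime = obs.getObsDatetime();      // Observation Time
            Location obsLocation = obs.getLocation();     // Observation Location

            // A real implementation would need to handle obs with no encounter.
            Encounter encounter = obs.getEncounter();
            Date encounterDatetime = encounter.getEncounterDatetime();  // Encounter Time
            EncounterType encounterType = encounter.getEncounterType(); // Encounter Type

            return new Object[] { obsDatetime, obsLocation, encounterDatetime, encounterType };
        }
    }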

I also fixed a nasty bug where the underlying class that holds the token, filter, and other information was never flushed. The problem isn't noticeable until you create a brand new data set and see that all of the selections from the previous data set are still selected. I added cleanup() methods to all of the data set pages that destroy the shared InformationHolder, and added logic to reload the InformationHolder when saving a page.
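
The fix boils down to a shared singleton that every wizard page reads and writes, plus an explicit teardown hook. Here is a rough sketch of the idea; the fields are illustrative and the real InformationHolder holds more state than this:

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative version of the shared holder used by the data set wizard pages.
    public class InformationHolder {

        private static InformationHolder instance;

        private String filter;
        private List<String> tokens = new ArrayList<String>();

        public static synchronized InformationHolder getInstance() {
            if (instance == null) {
                instance = new InformationHolder();
            }
            return instance;
        }

        // Called from each page's cleanup() so a brand new data set starts empty.
        public static synchronized void destroy() {
            instance = null;
        }

        // getters and setters for filter, tokens, etc. omitted
    }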

In the immediate future, I'll be working on all the refactoring required to use the new query format. Hopefully soon, I can also create a page for choosing how and which columns to split.

Sunday, July 6, 2008

Persisting the Token Modifiers

This week I worked on persisting the user's token modifier selections on the back end so that they can be passed as part of the data query and reloaded later for further modification. The ODA interface hasn't changed from a graphical point of view, but it has changed such that all changes made to the modifier table are recorded, saved, and used to build the data query.

Right now, the new URL request format for data looks something like this but will most likely change after further discussion:

(URL removed -> didn't show up correctly in HTML)

Here's one of the examples that I was testing with:

(URL removed -> see example in JUnit test at http://svn.openmrs.org/openmrs-modules/odamocklogicws/test/org/openmrs/module/odamocklogicws/TestMockWebService.java)

I added some support to the mock logic web service so it would know how to handle such requests. Since the actual logic service API isn't ready, I just tacked on the user's modifier requests to the beginning of the hard coded data. Here's a sample data preview after selecting some modifiers for a few given tokens: