Saturday, December 27, 2008

Window Sizing Bug Fixed in BIRT 2.3.1

The annoying bug described here, which made the ODA window size behave unpredictably, no longer occurs in my recent testing with BIRT 2.3.1. The User and Developer Guide has been updated to require BIRT 2.3.1. This is the latest version of BIRT and the direction we want to take with the plugin anyway.

Tuesday, August 19, 2008

Sunday, August 17, 2008

Wrapping Things Up

Since the GSOC deadline is today, I spent the last week polishing things up, uploading/committing the latest versions of everything, and working on documentation.

The following OpenMRS wiki pages have all been updated to reflect the latest status and implementation of the OpenMRS ODA and Logic Web Service:


I recorded an instructional video that walks through the creation of a data source and data set using the OpenMRS ODA, showing the different wizard pages and options available. However, I'm having issues getting it published as Flash on blip.tv. I'll create another post when I get this squared away.


I also put together three simple reports that illustrate how the three different data styles can be used:
  • Most Recent - Patient summary pages with their most recent data
  • Stacked - Graphs for each patient that track weight, height, temperature, and CD4 per patient over time
  • Flat - XLS fact sheet with many columns where there is one row per patient


Any and all feedback is appreciated!

Sunday, August 10, 2008

Fixing Bugs, Javadocs, JUnit Tests, User Conference, and Enhanced Modifier Interface

Wow, that's a long title :). Things have been pretty hectic so I actually missed my blog update last week. Allow me to catch up!

Bug Fixes

Tammy has been great with helping get the Logic Service up to speed. I identified five or so bugs in how the Logic Service was returning data, and Tammy promptly fixed each of them.

Javadocs

I went through all of the BIRT ODA and Logic Web Service classes and added Javadocs. I used the JAutodoc Eclipse plugin to help with adding all of the Javadocs and OpenMRS headings to each class. We plan on initially hosting them on Justin Miranda's development machine: http://www.justinmiranda.com/.

JUnit Tests

With the developers on the verge of a junit-test-a-thon, I went through the BIRT ODA code and added 33 JUnit 4 tests that use the "should" keyword at the beginning of each test name. The tests mainly cover the back-end functionality of the BIRT ODA, such as building up and breaking down the Logic Service query. The plan is to add more self-contained tests that cover the Logic Service and Logic Web Service (right now, the Logic Service and Logic Web Service tests I've created require a running OpenMRS instance with specific data).

Actuate International User Conference

Last Monday (8/4/08) through Wednesday (8/6/08), I was at the Actuate International User Conference in Las Vegas. My mentor, Justin Miranda, was invited to present as part of BIRT Live Day. It was great because I finally got to meet Justin and discuss things face-to-face. Although the presentation Justin gave was meant for those not familiar with OpenMRS, I learned a lot as well. We were also able to demo the current ODA during the presentation.

Scott Rosenbaum of Innovent Solutions was also there so we were able to chat with him. He has been a great resource for this ODA project.

Enhanced Modifier Interface

Although the modifier interface allows the user to add multiple modifiers to any token, one of the key components missing from the modifier interface was the ability to specify an aggregate for a given token. I've been delaying this since the Logic Service Parser only supports AGGREGATE {TOKEN} and not AGGREGATE X {TOKEN} style queries right now. However, I decided to go ahead and build this into the modifier page and thus the enhanced interface:


The selected tokens are still listed at the top of the modifier page. You can still click the individual token names to see any current modifiers in the bottom of the page and add modifiers to the token as desired. To the left of each token are two drop downs. The first drop down is the aggregate (FIRST, LAST, MAX, and MIN) and the second drop down is the value (1-10) for the aggregate. For instance, one may wish to get the last 8 weights recorded for patients (LAST 8 {WEIGHT (KG)}). The default aggregate setting when a token is first selected is LAST 1 (gets just one value, which is the most recent). If a user selects or removes tokens from the token selection page, the next time the modifier page is visited, the user will see more or fewer token rows based on their selection.
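To make the AGGREGATE X {TOKEN} form concrete, here is a minimal sketch of how the two drop-down values could be combined with a token name. The class and method names are hypothetical and are not taken from the actual ODA code; only the aggregate names, the 1-10 range, and the LAST 1 default come from the description above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the actual ODA code.
final class AggregateClauseSketch {

    // The four aggregates offered by the first drop down.
    static final List<String> AGGREGATES = Arrays.asList("FIRST", "LAST", "MAX", "MIN");

    // Builds "AGGREGATE X {TOKEN}" from the two drop-down values and a token name.
    static String build(String aggregate, int count, String token) {
        if (!AGGREGATES.contains(aggregate)) {
            throw new IllegalArgumentException("Unsupported aggregate: " + aggregate);
        }
        if (count < 1 || count > 10) {
            throw new IllegalArgumentException("Aggregate value must be between 1 and 10");
        }
        return aggregate + " " + count + " {" + token + "}";
    }

    // Default for a newly selected token: LAST 1 (just the most recent value).
    static String buildDefault(String token) {
        return build("LAST", 1, token);
    }

    public static void main(String[] args) {
        System.out.println(build("LAST", 8, "WEIGHT (KG)")); // LAST 8 {WEIGHT (KG)}
        System.out.println(buildDefault("WEIGHT (KG)"));     // LAST 1 {WEIGHT (KG)}
    }
}
```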

In order to change the aggregate and aggregate value, you must select a data style that is not the default, most recent. The ODA builds the aggregates and values into the Logic Service query, but right now it doesn't change how the data is returned very much. Since the Logic Service does not yet support these aggregate queries, I'm handling things differently for each data style:

  • Most Recent - The aggregate and aggregate value drop down boxes are disabled (they are greyed out and cannot be selected). By definition, the most recent data style will just get the most recent data for a token so there is no point in applying an aggregate.
  • Stacked - Does absolutely nothing. All of the data will still be returned for the stacked data style. This will be changed when the Logic Service is ready.
  • Flat - The aggregate is not considered at all, but the aggregate value is. So, if a user constructs a query with FIRST 4 WEIGHT and LAST 3 HEIGHT, the FIRST and LAST aggregates won't affect how the data is returned, but there will be 4 expanded columns for WEIGHT and 3 expanded columns for HEIGHT. Again, this will be changed to present the data as expected when the Logic Service is ready.
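
The current flat-style behavior can be sketched roughly as follows: only the aggregate value decides how many columns a token expands into. The class name and the column naming scheme (PATIENT_ID, WEIGHT_1, WEIGHT_2, ...) are assumptions made for this example, not the names the ODA actually produces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of flat-style column expansion; not the ODA's code.
final class FlatColumnsSketch {

    // Expands each token into as many columns as its aggregate value.
    // The aggregate itself (FIRST, LAST, ...) is deliberately ignored here,
    // mirroring the interim behavior described above.
    static List<String> expand(Map<String, Integer> aggregateValueByToken) {
        List<String> columns = new ArrayList<>();
        columns.add("PATIENT_ID"); // the patient ID is always the first column
        for (Map.Entry<String, Integer> entry : aggregateValueByToken.entrySet()) {
            for (int i = 1; i <= entry.getValue(); i++) {
                columns.add(entry.getKey() + "_" + i); // assumed naming scheme
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, Integer> selection = new LinkedHashMap<>();
        selection.put("WEIGHT", 4); // FIRST 4 WEIGHT
        selection.put("HEIGHT", 3); // LAST 3 HEIGHT
        System.out.println(expand(selection));
    }
}
```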

When the Logic Service supports AGGREGATE X {TOKEN} queries, the Logic Web Service should need only minor modifications to start using them.

Now I'm off to more ODA polishing for the GSOC deadline that is closing in on me :)

Sunday, July 27, 2008

Data Styles and Token Splitting

This week, my mentor Justin, Scott Rosenbaum (a member of the BIRT PMC), and I had a web meeting to show and review the current functionality of the ODA. We got some really good feedback. Some of the main points:
  • The default data set should just show the most recent values for a selected token. This simplifies how the data is first returned and the data style can be changed from this if desired.
  • The ODA should support parameters. For instance, the user should be able to provide a parameter as a value in the modifier page. This will most likely be a project for after GSOC.
  • The more data the better. The default behavior should be to split the tokens by all four of the split values we have chosen to initially support.
  • A tree view to select the tokens would be nice. The branches would be the token tags and the leaves under the branches would be the appropriate tokens. This will also more than likely be a task for after GSOC.

As far as coding, I've added quite a bit of new functionality. The two basic additions can be categorized under data styles and token splitting.

Data Styles

There has been a lot of discussion regarding how to display the data to the user. There will always be the patient ID for the first column, but how the other columns are organized can vary. Rather than try to come up with the perfect data set, I've allowed the user to toggle between three different styles:

  1. Most recent. This is the default selection. There is a column displayed for every token/split combination that is selected. There is one row per patient displaying the most recent value for each token selection.
  2. Stacked. This is the EAV style of data where there is a KEY, VALUE, and appropriate splitter columns. There is the potential for multiple rows per patient/token if more than one value exists for a patient/token combination.
  3. Flat. This style will have the most columns and one row per patient. This style provides more information than the "Most recent" style by getting more than just the most recent value. Right now, it's hard-coded to return 5 values per token. Each of these 5 values can be split, producing even more columns. Eventually, when the Logic Service supports FIRST x and LAST x, the user will be able to choose what this value is instead of the hard-coded 5.
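
To illustrate how the styles relate, the sketch below collapses stacked (EAV) rows into the "Most recent" shape: one row per patient, one value per token. The row class, its field names, and the use of a numeric timestamp are assumptions made for the example; the real ODA result handling may look quite different.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the "Stacked" and "Most recent" styles.
final class DataStyleSketch {

    // One stacked (EAV) row: patient, KEY, VALUE, plus an observation timestamp.
    static final class StackedRow {
        final int patientId;
        final String key;
        final String value;
        final long observedAt;

        StackedRow(int patientId, String key, String value, long observedAt) {
            this.patientId = patientId;
            this.key = key;
            this.value = value;
            this.observedAt = observedAt;
        }
    }

    // Keeps only the latest value per patient/token combination.
    static Map<Integer, Map<String, String>> mostRecent(List<StackedRow> rows) {
        Map<Integer, Map<String, String>> result = new LinkedHashMap<>();
        Map<String, Long> latestSeen = new HashMap<>();
        for (StackedRow row : rows) {
            String cell = row.patientId + "|" + row.key;
            Long seen = latestSeen.get(cell);
            if (seen == null || row.observedAt > seen) {
                latestSeen.put(cell, row.observedAt);
                result.computeIfAbsent(row.patientId, id -> new LinkedHashMap<>())
                      .put(row.key, row.value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<StackedRow> stacked = Arrays.asList(
                new StackedRow(1, "WEIGHT", "70", 100L),
                new StackedRow(1, "WEIGHT", "72", 200L),
                new StackedRow(1, "HEIGHT", "180", 50L),
                new StackedRow(2, "WEIGHT", "60", 10L));
        System.out.println(mostRecent(stacked));
    }
}
```

The stacked input can hold several rows per patient/token pair, while the collapsed output never does, which is the trade-off between the two styles.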

Token Splitting

Token splitting allows the user to get more data from a selected token than just the value of that token. The following are the four additional "splits" we are initially supporting:

  1. Observation Date
  2. Observation Location
  3. Encounter Date
  4. Encounter Type

I have added a new page to the ODA that allows the user to select which splitters to use for each token (the default is to include all of the splitters). The interface is basically a grid of check boxes where the splitters make up the columns and the selected tokens make up the rows. This page dynamically builds itself based on the tokens added or removed over time. Here's an example of what it looks like:


Splitting the tokens is supported for all three data styles mentioned above under "Data Styles".

Sunday, July 20, 2008

Lots of Changes to ODA and Logic Web Service

Wow, there was a lot going on this week with the project. A lot of discussion revolved around the Logic Service with everyone (especially Burke and Tammy). Tammy cleared up a lot of questions I had about the Logic Service and Burke created the beginning of a LogicCriteria parser which I was able to integrate the ODA with. Allow me to summarize all the changes and enhancements made to both the ODA and the Logic Web Service:

BIRT ODA

  • Removed the filter page and reintegrated the filter drop down back into the first page so that the user has to select a filter and then tokens can be chosen using the tag and search feature.
  • Changed the way that the data set wizard pages are presented to the user. Now, instead of having to go through all of the pages when initially creating a data set, the token selection page, the first page, is the only page shown. The user can access the more advanced pages after the edit data set dialogue comes up or by later reopening and editing an existing data set.
  • Added a helper class for easily tearing down and building back up queries, extracting certain pieces of the query, etc.
  • Changed all data set pages to generate the query in the format SELECT {token} optionalModifier x{token2} optionalModifier y... FROM cohortID.
  • Added a new data set page that allows the user to see the actual query that will be sent to the Logic Web Service as they keep changing their queries using the various data set wizard pages. Here's a query I created by selecting one of my cohorts and then various tokens including indicating that I wanted WEIGHT values that were less than 50 and TEMPERATURE that was greater than 40:

Logic Web Service
  • Added latest jars from logic refactoring branch.
  • Changed the data resource to accept the new query format: SELECT {token} optionalModifier x{token2} optionalModifier y... FROM cohortID.
  • Changed call that populates filter to single Context.getCohortService().getAllCohortDefinitions() call.
  • Used the new LogicCriteria.parse() method that Burke put together this week so that tokens and their modifiers are passed to this parser. The appropriate LogicCriteria is created and passed on to the logic service for evaluation and the results are passed back to the ODA.
  • Added helper class to help with getting information out of the URL request.
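
As a rough sketch of what a helper for building queries up and tearing them down might look like, the example below assembles a query in the SELECT ... FROM cohortID shape and extracts the token names and cohort again. The class and method names, the whitespace separator between token clauses, and the "LT 50" / "GT 40" modifier spelling are all assumptions made for illustration; they are not the actual helper or the Logic Service's modifier syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical query helper for illustration; not the ODA's real helper class.
final class QueryHelperSketch {

    // Matches a token name wrapped in curly brackets, e.g. {WEIGHT (KG)}.
    private static final Pattern TOKEN = Pattern.compile("\\{([^}]*)\\}");

    // Builds "SELECT <clause> <clause> ... FROM <cohortId>".
    static String build(List<String> tokenClauses, String cohortId) {
        return "SELECT " + String.join(" ", tokenClauses) + " FROM " + cohortId;
    }

    // Extracts every token name that appears in curly brackets.
    static List<String> tokens(String query) {
        List<String> names = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(query);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    // Extracts the cohort identifier that follows the last FROM keyword.
    static String cohort(String query) {
        int from = query.lastIndexOf(" FROM ");
        if (from < 0) {
            throw new IllegalArgumentException("Query has no FROM clause");
        }
        return query.substring(from + " FROM ".length()).trim();
    }

    public static void main(String[] args) {
        String query = build(
                Arrays.asList("{WEIGHT (KG)} LT 50", "{TEMPERATURE (C)} GT 40"), "12");
        System.out.println(query);
        System.out.println(tokens(query));
        System.out.println(cohort(query));
    }
}
```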

Next step is adding a page and modifications to the query to allow the user to split the tokens into more than just the value like date and location. I'm going to use colons in the query after the token and modifiers to specify how to split the individual tokens.

Sunday, July 13, 2008

Midterm Review and Update

Thursday (7/10/08), I joined the developer's call to review the current status of my project. I got a lot of really good feedback from everyone and a new sense of direction for the project. The following is a basic list of some of the major points from the call and followup mailing discussions:
  • Keep it simple and introduce complexity later if time allows. It's better to have a simple solution that actually works than a really feature rich solution that doesn't do anything.
  • As we add more functionality to the ODA, just creating an initial data set is a lot to throw at the user. When a user initially creates the data set, we just want to provide the first page that allows for token selection and the rest of the pages should not be seen. Then, the user should be able to go to the other pages via the edit data set interface to further refine the query.
  • The Modifier page is going to be redone. The top piece will basically allow a user to add an aggregate to the beginning (just FIRST and LAST for now). LAST will be the default. Then, the user will use the bottom half of the page to choose conditions to add to the query if desired.
  • A new query format is needed to support all of these new additions. We're moving more towards a SQL-like query where the SELECT chooses the tokens and the FROM is the cohort. For each token in the SELECT, there will first be an aggregate (LAST), then the token name in curly brackets, followed by an optional condition. After the "aggregate {token} condition", there are pipe delimiters to indicate how to split the token (date, location, etc.). Finally, the desired cohort is in the FROM clause. More will probably be required later, but this is the simple format for now.
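
One way to picture a single "aggregate {token} condition" entry with its pipe-delimited splits is the small parser below. It is only a sketch: the class name, the exact placement of the pipes, the split names, and the "LT 50" condition spelling are assumptions for this example, since the format was still being worked out at this point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for one token entry of the proposed format.
final class TokenClauseSketch {
    final String aggregate;
    final String token;
    final String condition;    // empty when no condition was given
    final List<String> splits; // e.g. date, location

    private TokenClauseSketch(String aggregate, String token,
                              String condition, List<String> splits) {
        this.aggregate = aggregate;
        this.token = token;
        this.condition = condition;
        this.splits = splits;
    }

    // Parses text such as "LAST {WEIGHT (KG)} LT 50|date|location".
    static TokenClauseSketch parse(String clause) {
        String[] parts = clause.split("\\|");
        String head = parts[0].trim();
        int open = head.indexOf('{');
        int close = head.indexOf('}', open);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("Token must be wrapped in curly brackets");
        }
        String aggregate = head.substring(0, open).trim();
        String token = head.substring(open + 1, close);
        String condition = head.substring(close + 1).trim();
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            splits.add(parts[i].trim());
        }
        return new TokenClauseSketch(aggregate, token, condition, splits);
    }

    public static void main(String[] args) {
        TokenClauseSketch parsed = parse("LAST {WEIGHT (KG)} LT 50|date|location");
        System.out.println(parsed.aggregate + " / " + parsed.token + " / "
                + parsed.condition + " / " + parsed.splits);
    }
}
```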

A lot of other details behind adding aggregates and modifiers to tokens were discussed. Check this out: it is the latest mock-up of the interface. The modifier page (2) still needs more work to provide a better way to indicate how the conditions are applied.

One of the great breakthroughs this week was moving away from the Mock Logic Web Service to the "real" Logic Web Service. The main problem was that the sample data set I had contained some concepts that were missing names (more details in the bug here). There is also another problem where the dynamic cohorts don't have an identifiable name, so I've removed these types of cohorts from the Logic Web Service for now and only the static cohorts are available. Anyway, it's great to be using the ODA and getting back real lists of tokens and actual data instead of the hard-coded values I was working with.

As far as coding this week, I added support on both the Logic Web Service and BIRT ODA side to add four columns to every token that is chosen:
  1. Observation Time
  2. Observation Location
  3. Encounter Time
  4. Encounter Type
So far, the user has no choice and all four of these columns are added no matter what is chosen through the interface (this will be worked on when adding the split selection page to the data set wizard). There's a major problem with this so it doesn't work quite like it should. Casting the Result object from the data query to an Obs object does not work so I can't actually get any of the times, location, or type. Right now, it's just returning the Result's date and the rest of the values are null. There's also a problem with datetime data types working (had to use Text for now).

I also fixed a nasty bug where the underlying class that holds the information regarding the token, filter, etc. would never be flushed. This problem isn't noticeable unless you create a brand new data set and notice that all your selections from the previous data set are selected. I added cleanup() methods to all the data set pages that destroy the shared InformationHolder and added logic to reload the InformationHolder when saving the page.
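
The shape of that fix can be sketched as below: a holder shared between the wizard pages keeps its selections until something explicitly discards it. Only the InformationHolder name and the idea of a cleanup() step come from the description above; the static accessor, the field, and the class name used here are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the shared InformationHolder; not the ODA's code.
final class InformationHolderSketch {

    // Single instance shared by all data set wizard pages.
    private static InformationHolderSketch shared;

    private final List<String> selectedTokens = new ArrayList<>();

    // Returns the shared holder, creating a fresh one if none exists.
    static synchronized InformationHolderSketch get() {
        if (shared == null) {
            shared = new InformationHolderSketch();
        }
        return shared;
    }

    // Discards the shared holder so the next data set starts empty.
    static synchronized void cleanup() {
        shared = null;
    }

    List<String> selectedTokens() {
        return selectedTokens;
    }

    public static void main(String[] args) {
        get().selectedTokens().add("WEIGHT (KG)"); // first data set
        cleanup();                                  // wizard page disposed
        System.out.println(get().selectedTokens()); // new data set starts empty
    }
}
```

Without the cleanup() call, the second get() would return the same instance and the earlier selection would reappear in the new data set.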

In the immediate future, I'll be working on all the refactoring required to use the new query format. Hopefully soon, I can also create a page for choosing how and which columns to split.