Database schema for a variable-level hierarchy with variable types


A client is asking for a database structure for a medical translation app that involves narrowing down a health issue and then asking specific questions on it. The tricky part of representing this in a database is that there may be anywhere from 2-5 hierarchical "filters" to narrow down the health issue, then a hierarchy of questions to ask the patient. The same filters and questions can be re-used in any level of the hierarchy. The questions must also be translated into several languages if dealing with a non-English speaking patient. For example:

-> Symptoms


    -> Chest Pain

      "Have you experienced any chest pain recently?"
        "On a scale of one to ten, how severe is the pain?"
        "Is the onset of the pain sudden or gradual?"
        "Is the pain related to exertion?"

      "Do you ever suffer from shortness of breath?"

    -> Palpitations

      "Are you ever aware of your heartbeat? What is it like?"
        "Can you tap out the rhythm with your finger?"

-> Discharge

  --> Chest Pain

    "We recommend the following lifestyle changes to improve your condition."
      "Avoid taking too much aspirin for pain."

I can use a string hack to make this work, but am looking for something that can scale better and is searchable. Also it will be maintained by a non-technical person in the future, so preference to something simple to comprehend vs. better performance like nested sets. (Sorry for the formatting, everything I tried kept auto-block quoting stuff.) Thank you to anybody who can assist!

Best Answer

A good way to do this in a relational database is with statement, category, and category_structure tables. The statement represents the line of text. Assume that each statement resides in a single category. A group of statements in a category have to appear in a certain order. Categories include the text groupings including Cardiovascular and Palpitations. The category_structure relates the hierarchy of categories. Each structure has a child_Category, parent_category, and sort_order. A structure is a unique combination of a child and a parent. Here are the table definitions.

PK   statement_ID         number   An autogenerated sequence
FK   category_ID          number   A foreign key to category table
     statement_text       varchar  The long description On a scale of one...
     statement_sort_order number   The order within a category

PK   category_ID          number   An autogenerated sequence
  UK category_unique      varchar  Uniquely identifies the purpose of a category.
                                   Values could be Discharge. Also, Chest Pain Symptom, 
                                   Chest Pain Discharge. 
     category_description varchar  The visible text description of a category. The
              values would be Symptoms, Cardiovascular,
              Discharge, Chest Pain, Palpitations. 

PK   structure_ID         number  An autogenerated sequence
FKUK child_category_ID    number  foreign key to category
FKUK parent_category_ID   number  foreign key to category. Optional. 
     category_sort_order  number  The order of categories or the order within 

Note the Symptoms category would be the parent of all symptoms. The Discharge category would be the parent of all discharge. The structure table does not prevent a category from having both symptom and discharge as parents.

I see from the reformatting edit that Chest Pain is both a symptom category and a discharge category. This would still work fine. One option is to make two categories for Chest Pain, one as a symptom, one as a discharge. They just happen to have the same name, but they contain different sets of statements. The other option (reflected in the table design) is to add a field to the statement table describing whether the statement is a symptom-style statement or a discharge-style statement. So the Chest Pain category has both symptom statements and discharge statements. The query would only show the appropriate subset based on the need.