Category Archives: Ontology

Ontology: How to use RadLex Ontology to Index Content?

In this post, I will demonstrate how you can text mine your own content with a Protege Ontology.  This is a high level post about using ontologies.

Get the name and unique identifier of every concept within the structured ontology.  Then  parse various digital formats (word documents, pdfs, text files, or database columns) for each concept name.  Lastly, record the concept name found, concept id, its frequency of occurrence, and an identifier for which content I found it in.

The following diagram parses a Microsoft Word doc for ontology concepts.

Now you can start playing with the datasets!  Have Fun!  :D

Android : Radiology Quiz

Part 3 of series on gamification of a dataset.

Part 1 Part 2

A randomized multiple choice quiz which tests the players knowledge of different relationships from the RadLex Ontology.

Download and play for free on GooglePlay

https://play.google.com/store/apps/details?id=MantasCode.RadQuiz

Relationship sets used:

{“anatomical_site”, “blood_supply_of”, “branch_of”, “constitutional_part_of”, “contained_in”, “contains”, “drains_into”, “has_blood_supply”, “has_branch”, “has_constitutional_part”, “has_innervation_source”, “has_member”, “lymphatic_drainage_of”, “member_of”, “projects_from”, “projects_to”, “receives_drainage_from”, “receives_input_from”, “receives_projection_from”, “related_condition”, “related_modality”, “segment_of”, “sends_output_to”};

^ each one of these sets is the same as the Character to Location set from Part 1. However, these are much larger.

over 12,000 questions .

update:

Added count down timer, added correct answer display if incorrect answer chosen, reformatted button layouts, changed some colors, and added some annoying sounds  XD

 

LOL! Big Data Analytics!

So what’s the best way to compare data?


Say you have a bunch of different data, of all different types and sizes.
You’ve got Square Data and Circle Data.

 

And Square Data is Different from Circle Data.
You know they are both Shapes, but one has 4 corners and the other has 0 corners.
“and thats ridiculous!” from either perspective.

Also, if we look closer we can see that circle data looks like this.

And Square Data looks like this.

Well, I can’t compare this “J” with “l” ! >:O


WHAT AN OUTRAGE!  This is very frustrating! >:<

. . .

But WHAT IF!  I had even more granular data about each J and l?

And ‘J’ looked like this.

And  ‘l’ looked like this.

:D!   Fantastic!  They have something in common!  COLORS !

“boring!”

meh, so now what? Well, we can record the frequency of each color related to each individual ‘J’ or ‘l’.

“so what?”

And then we can sort the color data of each J or l from most frequent to least.

“lame…”

“._.”

Then we can take the top x most frequent colors and call that a set (set of colors for each J and l).  So now, by putting emphasis on frequency we can attempt to make relevance!

“Whatever!”

We can now play with Data!

We can figure out and see what the most optimal algorithm for DataSet Comparison is!  Is it top 5 most frequent terms of “J” compared with the top 5 most frequent terms of “l” with at least 2 matches make a relation between “J” and “l”? Maybe its top 15 compared with the top 50 with x matches?  Maybe 5 vs. 15 with x matches?

If the colors were “special” words I find that the  top 10 vs. top 10 with 4 matches or more, works best.  But this could change at any moment!  I could wake up tomorrow and decide differently.  There is no absolute truth here,  teh absolute truth is in teh data!

:D

 

Java: Protege Frames: How to get a distinct list of all possible Slots within a Project.

If for whatever reason you might need this list. All you have to do is loop through the Cls’s, then loop through the Slots of the Cls’s, using the method .getOwnSlots()

More Protege Frames how-to’s here http://mantascode.com/?p=507

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
 
import edu.stanford.smi.protege.model.Cls;
import edu.stanford.smi.protege.model.KnowledgeBase;
import edu.stanford.smi.protege.model.Project;
 
//This Program will output a complete list of distinct slots within a Protege Frames Project file.
 
public class jokke
{
      private static final String PROJECT_FILE_NAME = "C:\\MC\\RadLex.pprj";
      public static void main(String[] args)
      {
            HashSet idsLoaded = new HashSet();
            Collection errors = new ArrayList();
        Project project = new Project(PROJECT_FILE_NAME, errors);
        KnowledgeBase kb = project.getKnowledgeBase();
        ArrayList distinctSlotList = new ArrayList();
        //get Class iterator
        Iterator radClsIter = kb.getClses().iterator();
        //loop through each class
        while ( radClsIter.hasNext())
        {
            //get Cls object
            Cls currentClass = (Cls) radClsIter.next();
            //String for a comma deliminated list of slots
            String slotsString = currentClass.getOwnSlots().toString();
            String [] clsSlotComponents = slotsString.split(",");
            //loop through each slot
            for ( int i = 0 ; i &lt;  clsSlotComponents.length; i++ )
            {
                  //check to see if particular slot already exists in ArrayList
                  if ( distinctSlotList.contains(clsSlotComponents[i].trim().replaceAll("\\[|\\]", "")))
                  {}
                  else
                  {
                        distinctSlotList.add(clsSlotComponents[i].trim().replaceAll("\\[|\\]", ""));
                  }
            }
        }
 
        Iterator iterator = distinctSlotList.iterator();
        while(iterator.hasNext())
        {
            System.out.println(iterator.next());
        }
      }
}

JAVA: How to programmatically manipulate a Protégé-Frames lexicon / ontology / dictionary using Protege API and Java.

This tutorial will give a brief overview of how to modify a Protégé-Frames dictionary using Java.

Download Protege:

protege.stanford.edu/

Reference the API:

protege.stanford.edu/protege/3.4/docs/api/core/

The Following examples will require these imports:

import edu.stanford.smi.protege.model.Cls;
import edu.stanford.smi.protege.model.Instance;
import edu.stanford.smi.protege.model.KnowledgeBase;
import edu.stanford.smi.protege.model.Project;
import edu.stanford.smi.protege.model.Slot;

Assuming you’ve successfully associated the appropriate Protege libraries with your Eclipse, lets get started with some simple HOW-TOs in programmatical Protege ontology manipulation;

For this tutorial I will be using the RadLex Lexicon/Dictionary.

The first thing you will need is the actual project files.
Mine are called : RadLex.pins, RadLex.pont, and RadLex.pprj
These files are created when you save a Lexicon/Dictionary in Frames format using Protege.

Create a pointer to them in your java code. I have mine in in C:\smelo\RadLex.pprj
You only have to point to the *.pprj file.

//Get the project file of the lexicon you want to manipulate
private static final String PROJECT_FILE_Pointer = "C:\\smelo\\RadLex.pprj";

Next, you will need to get the KnowledgeBase object.

//errors object is required in getting the project
Collection errors = new ArrayList();
//Get Project project
Project project = new Project(PROJECT_FILE_Pointer, errors);
//Get Knowledgebase kb
KnowledgeBase kb = project.getKnowledgeBase();

In Protege Frames, each Term/Concept/Lexicon Entity/Thingy is considered a “class”(Cls). The entire RadLex dictionary is nothing more then a bunch of terms(classes) with attributes(slots), and the relationships amongst them.

So lets iterate through all the classes in the KnowledgeBase.

To do this we need to get a class iterator:

//RadLex Class iterator
Iterator radClsIter = kb.getClses().iterator();

Loop through every class

In RadLex, the “name” of every class is its Radlex Id number, or “RID###..”
such as “RID1099” “RID35707” “RID3874” ….

Every RadLex class contains a number of “slots” or attributes that you can populate:

Using the method .getOwnSlots() on a Cls Object will show you a list of all the slots and their names

These are all different attributes(slots) of a single concept(Cls) in RadLex:

Slot(Related_Condition), Slot(Anatomical_Site), Slot(Related_modality), Slot(ACR_ID), Slot(UMLS_Term), Slot(Is_A), Slot(UMLS_ID), Slot(Misspelling of term), Slot(Has_Subtype), Slot(Non-English_name), Slot(Member_Of), Slot(Preferred_name), Slot(Source), Slot(Non-Sanctioned Synonym), Slot(Synonym), Slot(Version_Number), Slot(Term_Status), Slot(SNOMED_ID), Slot(Acronym), Slot(SNOMED_Term), Slot(ACR_Term), Slot(Comment), Slot(Image_URL), Slot(Definition), Slot(:ROLE), Slot(:DOCUMENTATION), Slot(:SLOT-CONSTRAINTS), Slot(:DIRECT-INSTANCES), Slot(:DIRECT-SUPERCLASSES), Slot(:DIRECT-SUBCLASSES), Slot(:DIRECT-TEMPLATE-SLOTS), Slot(:NAME), Slot(:DIRECT-TYPE)

Some slot types are simple strings, while others are more complex Instances.

while( radClsIter.hasNext() )
{
	Cls currentClass = radClsIter.next();
}

So now that we have a brief scattered overview of the RadLex Lexicon/Dictionary,
and we know how to iterate through each Concept/Term/Class within it. Lets do some practical HOW-TOs.

HOW TO get the value of “Preferred_name” slot from a class object(currentClass)?

One way of retrieving of the content of a slot of a class would be by the slot’s name.
So all you have to do is, get another slot iterator this time for the class(term/concept), and compare its name with the one you want.
Then, iterate through the content of the slot and display its value.(You might not have to do this last step; it depends of the type of slot)

In RadLex the Preferred_name slot of a class is populated with an Instance type.
So we have to get the instance of a slot then get the name of the Instance.

String slotName = "Preferred_name"; 
 
Iterator slotIter = currentClass.getOwnSlots().iterator();
 
//Iterate through the slots
while( slotIter.hasNext() )
{
	Slot currentSlot = slotIter.next();
 
	//Compare the slot name with your slotName String
	if( slotName.equals( currentSlot.getName() ) )
    	{
		Object slotContentObject = null;
 
		//Create another iterator for the content of the slot
    		Iterator iterSlotContent = currentClass.getOwnSlotValues(currentSlot).iterator();
 
    		if( iterSlotContent.hasNext() )
    		{
    			slotContentObject = iterSlotContent.next(); 
 
			//if the content of a slot is of type Instance
    			if(slotContentObject instanceof Instance)
  			{
				//cast to Instance
    				Instance instSlotValue = (Instance) slotContentObject;
 
				//use the method getBrowserText() on the Instance to get its String content
    				System.out.println(instSlotValue.getBrowserText());
    			}
    		}
    	}
}

HOW TO change the Definition slot of a Class/Concept/Term by specific “name” (RID) and save to project file?

Say we want to change the content of the “Definition” slot in Cls “RID35712”:

String slotName = "Definition"; 
 
//get the Cls object of the Term/Concept/Class
Cls myConcept = kb.getCls( "RID35712" );
 
Iterator myConceptClsIter = myConcept.getOwnSlots().iterator();
while( myConceptClsIter.hasNext() )
{
	Slot currentSlot = myConceptClsIter.next();
	if(slotName.equals(currentSlot.getName()) )
	{
		//set new Definition
		myConcept.setOwnSlotValue(currentSlot, "This is my new Definition for Term RID35712");
	}
}
 
//save changes
project.save(errors);

HOW TO display ALL synonyms of the entire project?

What if we want to list the names of ALL synonyms throughout the entire dictionary

Iterator radClsIter = kb.getClses().iterator();
 
while( radClsIter.hasNext() ){
        Cls currentClass = radClsIter.next();
        String prefName = findPreferredName(currentClass);
 
        String slotName = "Synonym";
        Iterator slotIter = currentClass.getOwnSlots().iterator();
 
        while( slotIter.hasNext() )
        {
    		Slot currentSlot = slotIter.next();
    		if(slotName.equals(currentSlot.getName()) )
    		{
    			Object slotContentObject = null;
    			Iterator iterSlotContent = currentClass.getOwnSlotValues(currentSlot).iterator();
    			if( iterSlotContent.hasNext() )
    			{
    				slotContentObject = iterSlotContent.next();
    				if(slotContentObject instanceof Instance)
    				{
    					Instance instSlotValue = (Instance) slotContentObject;
    					System.out.println(instSlotValue.getBrowserText());
    				}
    			}
    		}
    	}
}

HOW TO create a new Class/Concept/Term and Add it as a child of RID35712?

We will be assigning the “name” as new unique “RID12345” and adding it as child of the term “RID35712”

Collection myCol =   kb.getClsNameMatches("RID35712",1);
kb.createCls("RID12345", myCol);

That covers a few basics, if you would like an example of something I havent covered, feel free to leave a comment.

GL & HF :D

Android: RadLex – How to HTTP post to a URL, retrieve and parse xml, format and display results in a ListView, override back button, Async Tasks, Loading Dialogs, Webservices and more…

RadLex for android is a program that uses Stanford’s BioPortal Webservices to retrieve data about the RadLex Ontology / Lexicon / Dictionary.  It currently allows the user to navigate the tree structure from concept to concept starting at the root term.

This application requires full Internet access permissions and is dependent on Stanford BioPortal’s webservices http://bioportal.bioontology.org being operational.

RadLex is available for free download at the Android Market: https://market.android.com/details?id=radlex.wave

User Instructions: [Short Click] or tap on a concept, will navigate to its children or sub-terms.  [Long Click] or hold on a concept, will popup a dialog with a definition and other term details (if it has any).


Continue reading