JAVA: How to programmatically manipulate a Protégé-Frames lexicon / ontology / dictionary using Protege API and Java.

This tutorial will give a brief overview of how to modify a Protégé-Frames dictionary using Java.

Download Protege:

protege.stanford.edu/

Reference the API:

protege.stanford.edu/protege/3.4/docs/api/core/

The following examples require these imports:

import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

import edu.stanford.smi.protege.model.Cls;
import edu.stanford.smi.protege.model.Instance;
import edu.stanford.smi.protege.model.KnowledgeBase;
import edu.stanford.smi.protege.model.Project;
import edu.stanford.smi.protege.model.Slot;

Assuming you’ve successfully added the appropriate Protege libraries to your Eclipse project, let’s get started with some simple HOW-TOs on manipulating a Protege ontology programmatically.

For this tutorial I will be using the RadLex Lexicon/Dictionary.

The first thing you will need is the actual project files.
Mine are called RadLex.pins, RadLex.pont, and RadLex.pprj.
These files are created when you save a Lexicon/Dictionary in Frames format using Protege.

Create a pointer to them in your Java code. I have mine in C:\smelo\RadLex.pprj
You only have to point to the *.pprj file.

//Get the project file of the lexicon you want to manipulate
private static final String PROJECT_FILE_Pointer = "C:\\smelo\\RadLex.pprj";
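
Even though the code only points at the *.pprj file, all three project files need to be found, so it can be handy to check that they are where you expect before loading anything. Here is a minimal sketch, assuming the .pins and .pont files sit in the same folder as the .pprj (as mine do). The ProjectFiles class and its missing() method are made-up names for illustration; they are not part of the Protege API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Protege API.
class ProjectFiles {

    // Returns the names of the expected project files that are NOT present in dir.
    static List<String> missing(Path dir, String baseName) {
        List<String> missing = new ArrayList<>();
        for (String ext : new String[] {".pprj", ".pont", ".pins"}) {
            String fileName = baseName + ext;
            if (!Files.isRegularFile(dir.resolve(fileName))) {
                missing.add(fileName);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Folder taken from the tutorial; pass another folder as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "C:\\smelo");
        System.out.println("Missing project files: " + missing(dir, "RadLex"));
    }
}
```

An empty list means all three files were found in that folder.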

Next, you will need to get the KnowledgeBase object.

//errors object is required in getting the project
Collection errors = new ArrayList();
//Get Project project
Project project = new Project(PROJECT_FILE_Pointer, errors);
//Get Knowledgebase kb
KnowledgeBase kb = project.getKnowledgeBase();
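
As far as I understand it, Protege adds any problems it hits while loading to that errors collection, so it is worth looking inside before you go on. A minimal sketch of such a check follows; LoadErrors and report() are hypothetical names of my own, and the error text in main is made-up example data rather than real Protege output.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical helper, not part of the Protege API.
class LoadErrors {

    // Prints every entry in the collection and returns true if there was at least one.
    static boolean report(Collection<?> errors) {
        for (Object error : errors) {
            System.err.println("Protege load problem: " + error);
        }
        return !errors.isEmpty();
    }

    public static void main(String[] args) {
        // Stand-in for the collection passed to new Project(PROJECT_FILE_Pointer, errors).
        Collection<Object> errors = new ArrayList<>();
        errors.add("example problem (made-up text)");
        if (report(errors)) {
            System.out.println("Stopping: the project did not load cleanly.");
        }
    }
}
```

In the tutorial code you would call LoadErrors.report(errors) right after creating the Project and bail out if it returns true.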

In Protege Frames, each Term/Concept/Lexicon Entity/Thingy is considered a “class” (Cls). The entire RadLex dictionary is nothing more than a bunch of terms (classes) with attributes (slots), and the relationships among them.

So let’s iterate through all the classes in the KnowledgeBase.

To do this we need to get a class iterator:

//RadLex Class iterator
Iterator radClsIter = kb.getClses().iterator();

Before we loop through every class, a few notes on what a RadLex class contains.

In RadLex, the “name” of every class is its RadLex ID number, or “RID###..”,
such as “RID1099”, “RID35707”, “RID3874”, …

Every RadLex class contains a number of “slots”, or attributes, that you can populate.

Calling the method getOwnSlots() on a Cls object returns a collection of all its slots, which you can print to see their names.

These are all the different attributes (slots) of a single concept (Cls) in RadLex:

Slot(Related_Condition), Slot(Anatomical_Site), Slot(Related_modality), Slot(ACR_ID), Slot(UMLS_Term), Slot(Is_A), Slot(UMLS_ID), Slot(Misspelling of term), Slot(Has_Subtype), Slot(Non-English_name), Slot(Member_Of), Slot(Preferred_name), Slot(Source), Slot(Non-Sanctioned Synonym), Slot(Synonym), Slot(Version_Number), Slot(Term_Status), Slot(SNOMED_ID), Slot(Acronym), Slot(SNOMED_Term), Slot(ACR_Term), Slot(Comment), Slot(Image_URL), Slot(Definition), Slot(:ROLE), Slot(:DOCUMENTATION), Slot(:SLOT-CONSTRAINTS), Slot(:DIRECT-INSTANCES), Slot(:DIRECT-SUPERCLASSES), Slot(:DIRECT-SUBCLASSES), Slot(:DIRECT-TEMPLATE-SLOTS), Slot(:NAME), Slot(:DIRECT-TYPE)

Some slot values are simple strings, while others are more complex Instances.

Now the loop over every class:

while( radClsIter.hasNext() )
{
	//the iterator is untyped, so each element has to be cast to Cls
	Cls currentClass = (Cls) radClsIter.next();
}
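
As far as I can tell, kb.getClses() also hands back Protege’s built-in system classes (such as :THING), not just the RadLex terms. Since RadLex names follow the “RID###” pattern, one way to skip everything else is to test the class name. This is only a sketch: RadLexIds and isRadLexId() are hypothetical names, and the pattern assumes every RadLex ID is “RID” followed by digits only.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Protege API.
class RadLexIds {

    // Assumption: a RadLex class name is "RID" followed by one or more digits.
    private static final Pattern RID = Pattern.compile("RID\\d+");

    static boolean isRadLexId(String clsName) {
        return clsName != null && RID.matcher(clsName).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"RID1099", ":THING", "RID35707"}) {
            System.out.println(name + " -> " + isRadLexId(name));
        }
    }
}
```

Inside the loop you would wrap your work in if( RadLexIds.isRadLexId( currentClass.getName() ) ) { ... }.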

So now that we have a brief, scattered overview of the RadLex Lexicon/Dictionary,
and we know how to iterate through each Concept/Term/Class within it, let’s do some practical HOW-TOs.

HOW TO get the value of the “Preferred_name” slot from a class object (currentClass)?

One way of retrieving the content of a slot of a class is by the slot’s name.
All you have to do is get a slot iterator for the class (term/concept) and compare each slot’s name with the one you want.
Then iterate through the content of the slot and display its value. (You might not have to do this last step; it depends on the type of slot.)

In RadLex the Preferred_name slot of a class is populated with an Instance.
So we have to get the Instance stored in the slot, then get the text of that Instance.

String slotName = "Preferred_name";

Iterator slotIter = currentClass.getOwnSlots().iterator();

//Iterate through the slots
while( slotIter.hasNext() )
{
	//the iterator is untyped, so cast each element to Slot
	Slot currentSlot = (Slot) slotIter.next();

	//Compare the slot name with your slotName String
	if( slotName.equals( currentSlot.getName() ) )
	{
		Object slotContentObject = null;

		//Create another iterator for the content of the slot
		Iterator iterSlotContent = currentClass.getOwnSlotValues(currentSlot).iterator();

		if( iterSlotContent.hasNext() )
		{
			slotContentObject = iterSlotContent.next();

			//if the content of a slot is of type Instance
			if( slotContentObject instanceof Instance )
			{
				//cast to Instance
				Instance instSlotValue = (Instance) slotContentObject;

				//use the method getBrowserText() on the Instance to get its String content
				System.out.println(instSlotValue.getBrowserText());
			}
		}
	}
}

HOW TO change the Definition slot of a Class/Concept/Term by specific “name” (RID) and save to project file?

Say we want to change the content of the “Definition” slot in Cls “RID35712”:

String slotName = "Definition"; 
 
//get the Cls object of the Term/Concept/Class
Cls myConcept = kb.getCls( "RID35712" );
 
Iterator myConceptClsIter = myConcept.getOwnSlots().iterator();
while( myConceptClsIter.hasNext() )
{
	Slot currentSlot = (Slot) myConceptClsIter.next();
	if(slotName.equals(currentSlot.getName()) )
	{
		//set new Definition
		myConcept.setOwnSlotValue(currentSlot, "This is my new Definition for Term RID35712");
	}
}
 
//save changes
project.save(errors);

HOW TO display ALL synonyms of the entire project?

What if we want to list ALL synonyms throughout the entire dictionary?

Iterator radClsIter = kb.getClses().iterator();

while( radClsIter.hasNext() )
{
	Cls currentClass = (Cls) radClsIter.next();
	//findPreferredName is not shown in this tutorial; assume it wraps the Preferred_name lookup from the first HOW-TO
	String prefName = findPreferredName(currentClass);

	String slotName = "Synonym";
	Iterator slotIter = currentClass.getOwnSlots().iterator();

	while( slotIter.hasNext() )
	{
		Slot currentSlot = (Slot) slotIter.next();
		if( slotName.equals( currentSlot.getName() ) )
		{
			Iterator iterSlotContent = currentClass.getOwnSlotValues(currentSlot).iterator();
			//a class can have several synonyms, so visit every value, not just the first
			while( iterSlotContent.hasNext() )
			{
				Object slotContentObject = iterSlotContent.next();
				if( slotContentObject instanceof Instance )
				{
					Instance instSlotValue = (Instance) slotContentObject;
					System.out.println(instSlotValue.getBrowserText());
				}
			}
		}
	}
}
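
A slot such as Synonym can hold more than one value, and as mentioned earlier some slots hold plain strings rather than Instances. If you would rather collect the values than print them, something along these lines could work. It is a sketch only: SlotValues and toText() are hypothetical names, and main uses stand-in strings instead of a real slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of the Protege API.
class SlotValues {

    // Converts every value in the collection to text using the supplied renderer.
    static List<String> toText(Collection<?> values, Function<Object, String> renderer) {
        List<String> result = new ArrayList<>();
        for (Object value : values) {
            result.add(renderer.apply(value));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data. With Protege you would pass currentClass.getOwnSlotValues(currentSlot)
        // and a renderer such as:
        //   v -> v instanceof Instance ? ((Instance) v).getBrowserText() : String.valueOf(v)
        Collection<?> values = Arrays.asList("first synonym", "second synonym");
        System.out.println(toText(values, String::valueOf));
    }
}
```

Passing the renderer in keeps the helper free of Protege types, so the same method handles both string-valued and Instance-valued slots.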

HOW TO create a new Class/Concept/Term and Add it as a child of RID35712?

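We will assign the new class the unique “name” “RID12345” and add it as a child of the term “RID35712”.

//look up the parent class by name, asking for at most 1 match
Collection myCol = kb.getClsNameMatches("RID35712", 1);
//create the new class with the matched class as its direct superclass
kb.createCls("RID12345", myCol);
//as before, call project.save(errors) to write the change to the project files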

That covers a few basics. If you would like an example of something I haven’t covered, feel free to leave a comment.

GL & HF :D

6 thoughts on “JAVA: How to programmatically manipulate a Protégé-Frames lexicon / ontology / dictionary using Protege API and Java.”

  1. sam123

    Hi there,

    Thanks for this tutorial. It is really hard these days to find tutorials about Protege-frames.
    I am new to this field and I am starting a new project which involves ontologies. The problem I am facing is that I cannot choose which API to use. In Protege, we have the core API and Protege-OWL API. Honestly, I did not get the difference between them. I read the “Frames and OWL Side by Side” report, but the question I could not answer yet is “Is Protege-OWL usually used for web applications only??”. Because I am planning to use a standalone software which has a knowledge base built using ontology.
    I also read somewhere that the Protege-OWL API is much easier to deal with.

    So my questions are;
    – Is Protege-OWL usually used for web applications only??
    – What do you think is the best API to use in my situation where a standalone software is required with no web interaction (i.e. ontology will be used to have a structured knowledge base)??

    Thank you..

    1. MantasCode Post author

      I believe that owl is a slightly simpler structure than frames, I’m not too familiar with owl. Maybe you can do something with Frames that you can’t do with OWL? One might be more friendly towards the web, but I don’t think either are exclusive to it. I think you can use both for whatever you want, web or standalone.

      I would use the Protege Frames or Owl API within your program, to do whatever you want with the ontology. Or, I would use the Protege API to extract the ontology into my own data structure (full or partial ontology). It depends on the application and the ontology.

      :D

      1. sam123

        Thank you so much. That is helpful. If you allow me, I have one more question:
        Does Protege-frames supports reasoning ??

        And I hope to see more posts about ontology in your blog =)
        Thank You..

  2. linda

    hello,please i want to connect my application j2ee with protege to extract some knowledge (push/pull) have you idee plz?

  3. Subrangshu

    Hi Mantas code,

    I need to query radlex based on keyword and get all matches/synonyms using java code. Can you kindly help in this regard on how to achieve this. Would appreciate a quick reply.

    Regards
