In this post, I will show how you can text-mine your own content with a Protege ontology. This is a high-level post about using ontologies.
The approach has three steps. First, get the name and unique identifier of every concept in the structured ontology. Then parse your content, in whatever digital format it comes (Word documents, PDFs, text files, or database columns), for each concept name. Lastly, for every match, record the concept name, the concept ID, its frequency of occurrence, and an identifier for the content it was found in.
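A minimal sketch of those steps, written in Python for brevity. It assumes the concept names and IDs have already been pulled out of the ontology into a dictionary and that each document has already been converted to plain text. The function name, the concept IDs, and the sample report are placeholders for illustration, not real RadLex data.

```python
import re

def count_concepts(concepts, documents):
    """Count whole-word, case-insensitive matches of each concept name.

    concepts:  dict mapping concept ID -> concept name
    documents: dict mapping content identifier -> plain text
    Returns a list of (concept name, concept ID, frequency, content identifier).
    """
    # One pattern per concept; the lookarounds stop "lung" matching inside "lungs".
    patterns = {
        cid: re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
        for cid, name in concepts.items()
    }
    rows = []
    for doc_id, text in documents.items():
        for cid, pattern in patterns.items():
            frequency = len(pattern.findall(text))
            if frequency:
                rows.append((concepts[cid], cid, frequency, doc_id))
    return rows

# Placeholder data: these IDs are made up and not checked against any ontology.
concepts = {"ID0001": "lung", "ID0002": "pneumothorax"}
documents = {
    "report_001.txt": "Small left pneumothorax. The left lung is partially "
                      "collapsed. Right lung is clear."
}

for row in count_concepts(concepts, documents):
    print(row)
```

Exact whole-word matching is the simplest choice here; it will miss plurals and misspellings, so a real run would probably add synonyms or stemming on top.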
The following diagram shows a Microsoft Word document being parsed for ontology concepts.
Now you can start playing with the datasets! Have Fun! :D
Hello: I am working on a Spanish translation of RadLex. At the beginning I could download the list of terms from another blog in xls format. This really helped me. Now it is possible to download the latest RadLex version directly from the website in the same format. As radiologists, my colleagues and I were interested in using this ontology to get the codes of the terms used in reports, by parsing the report text for each term. Your description of how to do it is very useful. We have already translated nearly 25% of RadLex. I would like to ask whether you know or think it is possible to recover the ontology architecture from a list in Excel format where I only have the RID codes and the corresponding translated terms? At the same time, what would you suggest doing with the new Spanish list of terms to be able to query the code of a term? Should I ask someone to make a program with a browser and a database engine? It would be interesting to have something like the RadLex browser on the RadLex website.
Thanks for your help. Your blog is a very interesting source of information.
Dario
Hello: I am working on a Spanish translation of RadLex.
Dr Daniel Rubin would love to hear about it!
At the beginning I could download the list of terms from another blog in xls format. This really helped me.
Yeah, things change!
Now it is possible to download the latest RadLex version directly from the website in the same format.
As radiologists, my colleagues and I were interested in using this ontology to get the codes of the terms used in reports, by parsing the report text for each term.
I am confused by what you mean by this.
Your description of how to do it is very useful.
We have already translated nearly 25% of RadLex.
Awesome! (RadLex is the heartbeat of artificial intelligence)
I would like to ask whether you know or think it is possible to recover the ontology architecture from a list in Excel format where I only have the RID codes and the corresponding translated terms?
It sounds to me like you want a Microsoft Excel spreadsheet with one column showing the RID code of a particular concept in RadLex, maybe another column with the English term, and a third with its Spanish translation, if one exists. (If a Spanish translation doesn’t exist, I trust you will submit it to the right people.)
The only way to get a list like this is to extract it from the Protege Frames format of RadLex. The version of RadLex that is uploaded to and browsable on BioPortal loses its foreign-language translation structure: all translations of a particular term are simply appended as synonyms.
As Radiologists who are interested in expanding and using RadLex, you should be able to request this list.
At the same time, what would you suggest doing with the new Spanish list of terms to be able to query the code of a term?
Here, again, I don’t know what you mean. :(
What would I, personally, do with a list of all the new Spanish RadLex terms?
I would parse every form of digital data in the Spanish medical imaging domain!
Should I ask someone to make a program with a browser and a database engine?
Yes, Based on your questions, Yes! You probably should ask for a programmer too!
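For simply querying the code of a term, a small script may be enough before any browser or database engine is built. A minimal sketch in Python, assuming the Excel list has been exported as CSV with columns named `rid`, `english`, and `spanish`. The column names, the function names, and the sample rows are hypothetical.

```python
import csv
import io
import unicodedata

def normalize(term):
    """Lower-case and strip accents so 'Pulmón' and 'pulmon' compare equal."""
    decomposed = unicodedata.normalize("NFD", term.strip().lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def build_index(csv_file):
    """Map every normalized English or Spanish term to the set of RID codes."""
    index = {}
    for row in csv.DictReader(csv_file):
        for column in ("english", "spanish"):
            term = row.get(column) or ""
            if term.strip():
                index.setdefault(normalize(term), set()).add(row["rid"])
    return index

def lookup(index, term):
    """Return the sorted RID codes recorded for a term, or an empty list."""
    return sorted(index.get(normalize(term), ()))

# Placeholder rows: the codes below are made up, not real RIDs.
sample = io.StringIO(
    "rid,english,spanish\n"
    "RID0001,lung,pulmón\n"
    "RID0002,pregnancy,embarazo\n"
)
index = build_index(sample)
print(lookup(index, "Pulmon"))
```

Stripping accents makes lookups forgiving of typing habits, but it also folds ñ into n, so terms that differ only by that letter would share an entry.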
It would be interesting to have something like the RadLex browser on the RadLex website.
Thanks for your help. Your blog is a very interesting source of information.
Dario
THANK YOU DARIO, YOUR COMMENT WAS A BLAST OF PLEASANT NOSTALGIA !!!
Hello Mantas: thanks for your answer. Until you posted your comments I hadn't realized that translations are appended as synonyms. That's the way the German translation was done. Our Excel file contains three columns: one for the RID number, one for the English term and one for the Spanish translation of the term. Is there a way to append the Spanish translated terms all together to the Protege RadLex file, or should I search one by one, typing the Spanish synonym? A hard and time-consuming task, indeed.
Thanks,
Dario
Until you posted your comments I hadn't realized that translations are appended as synonyms.
They are appended as synonyms in BioPortal (bioportal.bioontology.org).
RadLex exists, and is worked on, in Protege Frames format.
When enough changes are made, the overseers of RadLex publish it on BioPortal.
BioPortal lets the world browse the ontology in a web browser, and also provides an API for programmatically querying the structured ontology however you want, through URIs (REST, XML, JSON, SPARQL).
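As a rough illustration of the REST side, the sketch below only builds a search URL and does not send it. It assumes BioPortal's search endpoint at data.bioontology.org with the `q`, `ontologies`, and `apikey` query parameters, and the ontology acronym `RADLEX`; check all of these against the current BioPortal API documentation. The function name and the API key are placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current BioPortal documentation.
BIOPORTAL_SEARCH = "https://data.bioontology.org/search"

def build_search_url(term, api_key, ontology="RADLEX"):
    """Build a term-search URL restricted to one ontology."""
    query = urlencode({"q": term, "ontologies": ontology, "apikey": api_key})
    return f"{BIOPORTAL_SEARCH}?{query}"

# "YOUR_API_KEY" is a placeholder; a real key comes from a BioPortal account.
print(build_search_url("pneumothorax", "YOUR_API_KEY"))
```

Fetching that URL with any HTTP client should return JSON describing the matching classes, which a script can then walk for IDs and labels.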
In order for the “True” format of RadLex to be published onto BioPortal, a “Flatten-er” script must be run on the entire ontology.
This Script makes the RadLex Frames files swallowable by BioPortal. It makes the structure simpler. At the same time, certain details of the ontology are lost in the process.
That's the way the German translation was done.
The German translations were (probably) done the same way you are doing yours. In the Protege Frames format of RadLex, the German translation of a term is stored with the term and can be queried by language. (Java API example: asking “Give me the German translation of the term ‘pregnancy’” yields “Schwangerschaft”.)
You cannot ask BioPortal for this, only the Frames format of RadLex.
Is there a way to append the Spanish translated terms all together to the Protege RadLex file
Yes!
or should I search one by one typing the spanish synonym?
NO!
A hard and time consuming task, indeed.
It is a Very Easy Simple Task, and NOT hard or time consuming at all!
I used to write scripts that absorb all types of different data structures into RadLex using Java + Protege Frames API.
Here is another post which demonstrates simple things you might want to do when trying to Programmatically Manipulate a Protege Frames Ontology:
http://mantascode.com/java-how-to-programmatically-manipulate-a-protege-frames-lexicon-ontology-dictionary-using-protege-api-and-eclipse/
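The bulk append itself comes down to a loop keyed on the RID code. A minimal sketch in Python, using a plain dictionary as a stand-in for the ontology; a real script would make the equivalent calls through the Protege Frames API instead. The data layout and the function name are assumptions for illustration.

```python
def append_translations(ontology, translations, language="es"):
    """Attach translated terms to concepts by RID code.

    ontology:     dict mapping RID -> {"name": str, "translations": {lang: [terms]}}
    translations: iterable of (rid, translated term) pairs, e.g. spreadsheet rows
    Returns the list of RID codes that were not found in the ontology.
    """
    unknown = []
    for rid, term in translations:
        concept = ontology.get(rid)
        if concept is None:
            unknown.append(rid)  # report it rather than silently dropping the row
            continue
        term = term.strip()
        if not term:
            continue  # row has no translation yet
        terms = concept.setdefault("translations", {}).setdefault(language, [])
        if term not in terms:  # keep re-runs from creating duplicates
            terms.append(term)
    return unknown

# Stand-in ontology with one concept; the RID values here are placeholders.
ontology = {"RID0002": {"name": "pregnancy",
                        "translations": {"de": ["Schwangerschaft"]}}}
rows = [("RID0002", "embarazo"), ("RID9999", "término sin concepto")]
print(append_translations(ontology, rows))
print(ontology["RID0002"]["translations"])
```

Returning the unmatched RID codes gives a quick check that the spreadsheet and the ontology version agree before anything is saved.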
Thanks,
Mantas
Update: BioPortal no longer supports Frames format ontologies.
Hi!!!
I am interested in the Spanish translation of RadLex… Did you make any progress?
Regards!
Alexander Ramos, M.D.
Hi! I am interested in helping with the Spanish translation of RadLex. Who is in charge?
Thank you,
Alexander Ramos, M.D.
axramos at hot mail dot com
Hello to everyone interested in a Spanish translation of the RadLex terminology.
I have created a radiology interpretation program, and I use the RadLex codes for anatomy, pathology, etc. to include them as metadata in the report. The translation is already complete (version 3.9, 41,603 terms) and I handle it in XML format, which preserves the hierarchical structure.
Cuauhtémoc
Greetings from Spain, Dr. Rossell.
I am developing a radiology archive from our PACS system in order to store all of our interesting daily cases and archive them in a simple way, with a view to building a teaching file for our students. I would be very interested in the Spanish translation of the RadLex terminology, and in your radiology interpretation program and its use of RadLex codes. Would there be some way to try it out?
Kind regards
Good afternoon, Dr. Rossell:
We are very interested in being able to use the Spanish translation of RadLex to code our patients' radiology reports. Could we have access?
Regards,