About
CuiTools
is a freely available package of Perl programs for unsupervised and
supervised word sense disambiguation (WSD) experiments. The name
CuiTools comes from the Concept Unique Identifiers (CUIs) found in
the Unified Medical Language System (UMLS). This package allows
users to perform supervised or unsupervised word sense disambiguation
using information extracted from the UMLS, such as CUIs, semantic types,
and semantic relations, as well as general English features such as
unigrams, bigrams, and part-of-speech information.
This package has also been used to perform classification for other
medical tasks outside of WSD, such as assigning ICD9-CM codes to medical
records, determining the co-morbidities of a patient based on their
medical record, and identifying relations in biomedical text.
If you use CuiTools, please cite the following paper:
Software
Download the current version (v0.29): SourceForge
Publications
Using PharmGKB to Train Text Mining Approaches for Identifying
Potential Gene Targets for Pharmacogenomic Studies.
Serguei Pakhomov, Bridget T. McInnes, Jatinder Lamba, Ying Liu,
Genevieve B. Melton, Yogita Ghodke, Neha Bhise, Vishal Lamba and
Angela K. Birnbaum. Journal of Biomedical Informatics. 2012 Oct;
45(5):862-9.
Exploiting MeSH Indexing in MEDLINE to Generate a Data set For
Word Sense Disambiguation.
Antonio Jimeno-Yepes, Bridget T. McInnes and Alan R. Aronson.
BMC Bioinformatics. 2011 Jun 2;12(1):223.
Using Second-order Vectors in a Knowledge-based Method for Acronym
Disambiguation.
Bridget T. McInnes, Ted Pedersen, Ying Liu, Serguei Pakhomov, and
Genevieve B. Melton. Appears in the Proceedings of the Fifteenth
Conference on Computational Natural Language Learning (CoNLL 2011),
June 23-24, 2011, pp. 145-153, Portland, Oregon.
Collocation Analysis for UMLS Knowledge-based Word Sense Disambiguation.
Antonio Jimeno-Yepes, Bridget T. McInnes and Alan R. Aronson.
BMC Bioinformatics. 2011, 12(Suppl 3):S4.
Supervised and Knowledge-based Methods for Disambiguating
Terms in Biomedical Text using the UMLS and MetaMap.
Bridget T. McInnes. Doctor of Philosophy Dissertation,
Department of Computer Science, University of Minnesota,
Twin Cities, September, 2009.
Using CuiTools to Identify Obesity and its Co-morbidities in
Discharge Summaries. Bridget T. McInnes. In the
Proceedings of the Second i2b2 Workshop on Challenges in
Natural Language Processing for Clinical Data, Nov 7-8, 2008,
Washington, DC.
An Unsupervised Vector Approach to Biomedical Term Disambiguation:
Integrating UMLS and Medline. Bridget T. McInnes.
In Proceedings of the Association for Computational Linguistics Student
Research Workshop (ACL-SRW) 2008.
(poster: pdf)
Using UMLS Concept Unique Identifiers (CUIs) for Word Sense
Disambiguation in the Biomedical Domain. Bridget T. McInnes,
Ted Pedersen, and John Carlis. In Proceedings of the Annual
Symposium of the American Medical Informatics Association (AMIA),
pages 533-37, Nov. 2007, Chicago, IL.
(slides: pdf, ppt)
Using Domain Specific Information for Word Sense Disambiguation.
Bridget T. McInnes, Ted Pedersen and John Carlis. Grace Hopper
Conference for Women in Computing, October 2007, Orlando, Florida.
National Library of Medicine Research Participation Report.
Bridget T. McInnes. 2008.