Textpresso is a text-mining system for scientific
literature. Textpresso's two major elements are (1) access to full text, so
that entire articles can be searched, and (2) the introduction of categories of
biological concepts and classes that relate two objects (e.g.,
association, regulation) or describe one (e.g., methods).
A search engine lets the user query an entire
literature for any one of these categories or keywords, or for a
combination of them.
Textpresso is useful as a search engine for researchers as well as
a curation tool. It was developed as a part of WormBase and is used extensively by C. elegans curators.
Textpresso is currently implemented for 17
different literatures and can readily be extended to other
corpora of text.
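
The combined category-and-keyword search described above can be illustrated with a small sketch. This is not Textpresso's implementation: the category lexicon, the gene-like names (abc-1, xyz-2), the toy sentences, and the functions tokenize, categories_of, and search are all hypothetical and exist only to show the idea of marking up text with categories and then querying for categories and keywords together.

```python
import re

# Hypothetical, hand-made category lexicon (NOT Textpresso's real categories).
CATEGORY_TERMS = {
    "gene": {"abc-1", "xyz-2"},
    "regulation": {"regulates", "represses", "activates"},
    "method": {"rnai", "microinjection"},
}

def tokenize(sentence):
    """Lowercase a sentence and split it into word-like tokens (hyphens kept)."""
    return re.findall(r"[a-z0-9][a-z0-9\-]*", sentence.lower())

def categories_of(sentence):
    """Return the set of categories that have at least one term in the sentence."""
    tokens = set(tokenize(sentence))
    return {cat for cat, terms in CATEGORY_TERMS.items() if tokens & terms}

def search(sentences, categories=(), keywords=()):
    """Return sentences that contain every requested category and every keyword."""
    wanted = set(categories)
    hits = []
    for sentence in sentences:
        tokens = set(tokenize(sentence))
        has_categories = wanted <= categories_of(sentence)
        has_keywords = all(k.lower() in tokens for k in keywords)
        if has_categories and has_keywords:
            hits.append(sentence)
    return hits

# Toy sentences invented for illustration; they are not real findings.
SENTENCES = [
    "abc-1 represses xyz-2 in the toy example.",
    "We used RNAi to knock down abc-1.",
    "The worms were grown on plates.",
]

if __name__ == "__main__":
    print(search(SENTENCES, categories=["gene", "regulation"]))
    print(search(SENTENCES, categories=["method"], keywords=["abc-1"]))
```

Matching at the sentence level is a choice made for this sketch; the same idea applies to any unit of full text, and a real system would draw its category terms from a curated vocabulary rather than a hard-coded dictionary.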