Information Extraction

Information extraction (IE) is a type of information retrieval whose goal is to automatically extract structured or semistructured information from unstructured machine-readable documents. A typical application of IE is to scan a set of documents written in a natural language and populate a database with the extracted information. Current approaches to IE use natural language processing techniques that focus on very restricted domains. For example, the Message Understanding Conference (MUC) was a competition-based conference series that focused on the following domains:
  • MUC-1 (1987), MUC-2 (1989): Naval operations messages.
  • MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries.
  • MUC-5 (1993): Joint ventures and microelectronics.
  • MUC-6 (1995): News articles on management changes.
  • MUC-7 (1998): Satellite launch reports.
Typical subtasks of IE are:
  • Named Entity Recognition: recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions.
  • Coreference: identification of chains of noun phrases that refer to the same object. For example, anaphora is a type of coreference.
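
As a rough illustration of the named entity recognition subtask and of populating a database with the results, the following minimal Python sketch recognizes a few temporal and numerical expressions with regular expressions and stores them in an in-memory SQLite table. The patterns, the labels (DATE, MONEY, PERCENT), the function names, and the table layout are all assumptions made for this example; practical IE systems rely on much richer linguistic models and also handle names of people, organizations, and places.

```python
import re
import sqlite3

# Hypothetical, deliberately simple patterns chosen for illustration only.
PATTERNS = {
    "DATE": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December) \d{1,2}, \d{4}\b"
    ),
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?(?: (?:million|billion))?"),
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?%"),
}


def extract_entities(text):
    """Return (label, matched_text, start_offset) tuples sorted by position."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda entity: entity[2])


def populate(conn, doc_id, text):
    """Store the entities found in one document; return the number of rows added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entity "
        "(doc_id TEXT, label TEXT, value TEXT, start INTEGER)"
    )
    rows = [(doc_id, label, value, start)
            for label, value, start in extract_entities(text)]
    conn.executemany("INSERT INTO entity VALUES (?, ?, ?, ?)", rows)
    return len(rows)


if __name__ == "__main__":
    # Made-up sentence used only to exercise the sketch.
    sample = "On March 3, 1995, Acme Corp. paid $2.5 million for a 12% stake."
    connection = sqlite3.connect(":memory:")
    populate(connection, "doc1", sample)
    for row in connection.execute("SELECT label, value FROM entity ORDER BY start"):
        print(row)
```

The sketch keeps recognition (extract_entities) separate from storage (populate), so the patterns could be replaced by a statistical recognizer without changing the database step.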

 
