Entity Extraction
The goal of Entity Extraction is to find the locations in the text that contain entity information. In other words, it tries
to find occurrences of person names, locations, VAT numbers, etc.
The overall process is split into submodules that run in a pipeline-like fashion.
The system performs the following extraction steps:
1. Matches name variants from the name variant database.
2. Matches geo locations (countries, regions, cities).
3. Guesses further entities according to built-in rules.
4. Combines similar name variants into a single entity profile and provides unique IDs to entities across the document set.
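
The steps above can be illustrated with a minimal sketch. It is not the product's implementation: the lookup tables, the VAT pattern, and all function names (`match_name_variants`, `match_geo`, `guess_by_rules`, `build_profiles`) are hypothetical stand-ins chosen for illustration. It only shows how matching modules might be chained and how variants of the same name could end up in one profile with an ID shared across documents.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    start: int      # character offset where the match begins
    end: int        # character offset just past the match
    text: str       # matched surface text
    kind: str       # entity type, e.g. "name", "location", "vat"
    canonical: str  # normalized form used to group variants


# Hypothetical stand-ins for the name variant database and a geo gazetteer.
NAME_VARIANTS = {"J. Smith": "John Smith", "John Smith": "John Smith"}
GEO_NAMES = ["Germany", "Bavaria", "Munich"]
# Hypothetical built-in rule: two capital letters plus 8 to 12 digits.
VAT_RULE = re.compile(r"\b[A-Z]{2}\d{8,12}\b")


def find_all(text, needle):
    """Yield (start, end) for each whole-word occurrence of needle."""
    pattern = r"(?<!\w)" + re.escape(needle) + r"(?!\w)"
    for m in re.finditer(pattern, text):
        yield m.start(), m.end()


def match_name_variants(text, matches):
    for variant, canonical in NAME_VARIANTS.items():
        for start, end in find_all(text, variant):
            matches.append(Match(start, end, variant, "name", canonical))
    return matches


def match_geo(text, matches):
    for name in GEO_NAMES:
        for start, end in find_all(text, name):
            matches.append(Match(start, end, name, "location", name))
    return matches


def guess_by_rules(text, matches):
    for m in VAT_RULE.finditer(text):
        matches.append(Match(m.start(), m.end(), m.group(), "vat", m.group()))
    return matches


# The modules run one after another, each adding to the shared match list.
PIPELINE = [match_name_variants, match_geo, guess_by_rules]


def extract(text):
    matches = []
    for module in PIPELINE:
        matches = module(text, matches)
    return sorted(matches, key=lambda m: m.start)


def build_profiles(documents):
    """Group matches by (kind, canonical) and assign one ID per group."""
    profiles = {}
    for doc_id, text in documents.items():
        for m in extract(text):
            key = (m.kind, m.canonical)
            profile = profiles.setdefault(
                key, {"id": len(profiles) + 1, "mentions": []}
            )
            profile["mentions"].append((doc_id, m.start, m.end))
    return profiles


documents = {
    "a": "John Smith lives in Munich.",
    "b": "J. Smith, VAT DE123456789, moved to Germany.",
}
profiles = build_profiles(documents)
```

In this sketch, "John Smith" in document `a` and "J. Smith" in document `b` land in the same profile because both map to the same canonical name. A real system would likely use fuzzier similarity than an exact lookup.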