Editing the current Name Variant Database

The EMM OSINT Suite uses a large database of named entities containing mainly persons and organizations (see the Name Variant Matching concept for more information).

The suite allows you to edit its Name Variant Database in order to add new entities or modify existing ones, while keeping the keys currently assigned to each entity.

The Name Variant Database in OSINT is composed of four columns: key, pid, type and variant.

The following is an excerpt from the Name Variant Database in OSINT:

key    pid    type    variant
2      11     p       Aaron Albert
3      11     p       A. Albert
4      11     p       A. M. Albert
5      21     o       Chad Calvin Christian
6      21     o       CCC
7      21     o       C.C. Christian
8      41     t       Milano
9      61     u       Harold Hugh
10     61     u       Henry Hugh

In this example, the entity Aaron Albert (a person) has a PID of 11. The first row with a given PID is the canonical (original) form of that entity, whereas the following rows with the same PID (A. Albert, A. M. Albert) are variants of that canonical form. All of these rows represent the same real-world entity, the person Aaron Albert. Similarly, the organization Chad Calvin Christian (PID 21) has one canonical form and two variants (CCC, C.C. Christian). Finally, the entity Milano (PID 41) has only one variant (the canonical form), and the entity of unknown type with PID 61 has two variants (Harold Hugh, the canonical form, and Henry Hugh).
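The relationship between PIDs, canonical forms and variants described above can be illustrated with a short Python sketch. It is not part of the EMM OSINT Suite; the function name group_by_pid is hypothetical, and the sketch assumes the data is tab-separated text without a header row, with the rows of each entity in the order shown in the excerpt.

```python
# Rows from the excerpt above, as tab-separated text.
# ASSUMPTION: no header row; the first row seen for a pid is its canonical form.
SAMPLE_TSV = (
    "2\t11\tp\tAaron Albert\n"
    "3\t11\tp\tA. Albert\n"
    "4\t11\tp\tA. M. Albert\n"
    "5\t21\to\tChad Calvin Christian\n"
    "6\t21\to\tCCC\n"
    "7\t21\to\tC.C. Christian\n"
    "8\t41\tt\tMilano\n"
    "9\t61\tu\tHarold Hugh\n"
    "10\t61\tu\tHenry Hugh\n"
)


def group_by_pid(tsv_text):
    """Map each pid to its type, canonical form and further variants."""
    entities = {}
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, pid, entity_type, variant = line.split("\t")
        entry = entities.get(pid)
        if entry is None:
            # First row for this pid: treat it as the canonical form.
            entities[pid] = {
                "type": entity_type,
                "canonical": variant,
                "variants": [],
            }
        else:
            # Later rows with the same pid are variants of that form.
            entry["variants"].append(variant)
    return entities


entities = group_by_pid(SAMPLE_TSV)
for pid, entry in entities.items():
    print(pid, entry["type"], entry["canonical"], entry["variants"])
```

Grouping by pid rather than by key reflects the structure of the table: the key identifies a single row, while the pid ties all rows of one entity together.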

Note that the database file must be a UTF-8 flat file in TSV (tab-separated values) format.
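Before importing an edited file, it may help to check both requirements (valid UTF-8, four tab-separated columns per row). The following Python sketch shows one possible check. The function name find_format_problems is hypothetical, it is not a feature of the suite, and the decision to tolerate blank lines is an assumption.

```python
def find_format_problems(raw_bytes):
    """Return a list of problems; an empty list means no problem was found."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        # The file was probably saved in another encoding.
        return ["not valid UTF-8: %s" % exc]

    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # ASSUMPTION: blank lines are tolerated
        fields = line.split("\t")
        if len(fields) != 4:
            problems.append(
                "line %d: expected 4 columns, found %d" % (number, len(fields))
            )
    return problems


# Example with in-memory data; for a real file, read it in binary mode,
# e.g. open(path, "rb").read(), where path is your exported file.
good = "2\t11\tp\tAaron Albert\n3\t11\tp\tA. Albert\n".encode("utf-8")
bad = "2\t11\tp\n".encode("utf-8")
print(find_format_problems(good))
print(find_format_problems(bad))
```

Reading the file as bytes and decoding it explicitly makes an encoding problem visible as an error, instead of letting it pass silently as garbled characters.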

The procedure of editing the current Name Variant Database should be done in three steps:

  1. Export the current Name Variant Database to a flat file

  2. Open the database file and add new entities or modify existing ones. The file must be saved in UTF-8 and must always keep its four-column structure, with the columns separated by the tab character.

  3. Import the updated database file into OSINT

Exporting the current name variant database to a flat file

  • Open EMM OSINT Suite and, in the main menu, click File > Export > Entity Extraction > Export Name Variant Database

images_download\attachments\2588680\menu-file-export.png

images_download\attachments\2588680\edit-database-01.png

  • Click Next and then Browse to select the file on your computer to which the current Name Variant Database will be exported. You can use any name for this file.

images_download\attachments\2588680\export-file-selection.png

  • Finally, click Finish. A progress bar is shown at the bottom right of the window. The export might take a few seconds, depending on the size of the current database. The export can also be followed in the Progress view.

images_download\attachments\2588680\edit-database-02.png

The export is complete when this progress bar disappears.

Opening the database file and adding new entities or modifying existing ones

The generated export file is a flat file composed of four columns separated by the tab character (see above).

Any text editor can be used to modify the database file (TextPad for Windows, WordPad, ...).

Once the entities have been added or modified, the database file must, as explained above, be saved in UTF-8 and must keep its four-column format.
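As an alternative to a text editor, the edit can be scripted. The following Python sketch adds a variant to an existing entity and saves the file in UTF-8 with four tab-separated columns. It is only an illustration under stated assumptions: the helper names (read_rows, write_rows, add_variant), the file name and the added variant are hypothetical, the file is assumed to have no header row, and giving a new row the next unused integer key is an assumption that this document does not confirm. Existing rows and their keys are left untouched.

```python
import os
import tempfile


def read_rows(path):
    """Read the file as a list of [key, pid, type, variant] rows."""
    with open(path, encoding="utf-8", newline="") as handle:
        return [
            line.rstrip("\r\n").split("\t") for line in handle if line.strip()
        ]


def write_rows(path, rows):
    """Save rows as UTF-8, one row per line, four tab-separated columns."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for row in rows:
            if len(row) != 4:
                raise ValueError("expected 4 columns, got %r" % (row,))
            handle.write("\t".join(row) + "\n")


def add_variant(rows, pid, variant):
    """Return rows plus one new variant for an existing pid."""
    matching = [row for row in rows if row[1] == pid]
    if not matching:
        raise ValueError("unknown pid: %s" % pid)
    # ASSUMPTION: a new row receives the next unused integer key.
    new_key = str(max(int(row[0]) for row in rows) + 1)
    entity_type = matching[0][2]  # reuse the type of the existing entity
    return rows + [[new_key, pid, entity_type, variant]]


# Demo on a temporary file standing in for the exported database.
demo_path = os.path.join(tempfile.mkdtemp(), "name_variants.tsv")
write_rows(
    demo_path,
    [["2", "11", "p", "Aaron Albert"], ["3", "11", "p", "A. Albert"]],
)
updated = add_variant(read_rows(demo_path), "11", "Albert, Aaron")
write_rows(demo_path, updated)
print(read_rows(demo_path)[-1])
```

Writing with an explicit encoding="utf-8" avoids depending on the platform default encoding, which is the most likely way for a script to break the UTF-8 requirement.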

Importing the updated database file into OSINT

See Importing a new Name Variant Database to import the updated database file.