Creating a custom Report
Reports are a way to export analysis data from a Case Project into flat files. These files can be HTML documents or machine-readable formats such as TSV files (tab-separated values, which can be imported into MS Excel).
When you open the application, the Reports view is in the lower left corner; it may be hidden behind the Search Tool view:
This view shows the reports currently available in the system. For example, if you double-click the AllEntities.ort report, the system creates an HTML report in the current project which shows a list of all found entities.
However, some reports need to be customised for your project in order to perform as desired, or you may need to create a custom report from scratch. This tutorial gives an introduction to creating a custom report.
The following tasks are described:
- Setting up an example case project
- Creating a configuration project
- Modifying the report "RelatedDocuments_Entity.ort" to list documents related to "Barack Obama"
- Modifying the report and its output template to list documents related to "Barack Obama" and similar named entities
Setting up an example Case Project
To have some data for this tutorial, we set up an example project about "Barack Obama". If you already have a Case Project whose reports you need to change, please skip this step.
Procedure:
- Create a Case Project named "Barack Obama"
- Search Google for "Barack Obama"
- Download the bookmarks we gathered from the Google search result pages
- Run the entity extraction to generate analysis data
See also: Quick Start Guide for descriptions of the individual tasks.
Creating a Configuration Project
The first step is to create a configuration project, which allows you to modify the existing report templates (.ort stands for OSINT Report Template) or create new ones.
In order to create a configuration project do the following:
- In the main menu click on File > New Wizards > Configuration Project
- Fill in "Configuration" as project name and click on "Finish"
The system now creates a new project which is marked with a small red "c" to show that this is not a normal Case Project but rather contains configuration data and templates.
The following screenshot shows this configuration project with the "Reporting" folder expanded:
A configuration project contains templates and configurations which apply to all case projects in the system. It is possible to create more than one configuration project, but in this case only one of them is active. (You can choose the active one in the Preferences.) In practice, however, we recommend creating only one configuration project.
Open a report template for editing
Inside the "Reporting" folder you see different types of files. Files with the *.ort extension are the report template files for individual reports. A report template contains the definitions the system needs to generate a report. These report template files are also accessible from the Reports View in the user interface. The other files are referenced by the *.ort files; they define output templates (for example to create HTML output) and data selection scripts used by the system to create the final report.
The system provides a report template editor to edit the definitions of the template. We open the "RelatedDocuments_Entity.ort" file by double-clicking it. It opens in an editor window:
The fields define the required template file and the optional data preparation script used to compile a report:
| Field | Description |
|---|---|
| Description | A short description of the kind of report that is generated |
| Output File Extension | Defines which file extension the generated report will have. This can be, for example, html if the report creates an HTML file, or csv for a file readable by MS Excel. |
| Template Path | The template which is used to generate the report. During report generation the template is filled with the analysis data from the selected case project. |
| Script Path | An optional JavaScript file which can be used to pre-process the data before it is filled into the template. For example, a list of entities could be sorted or filtered before it is used in the template. |
The default report template creates a report showing all documents related to "Franz Marc" (which is the example name we use in the quick start guide). We want to change this and generate a report which shows all documents related to "Barack Obama".
Create a custom report based on an existing example
The first step to create a custom report for "Barack Obama" is to copy the existing RelatedDocuments_Entity.ort file to a new file:
- Close the editor window showing RelatedDocuments_Entity.ort
- Right click on "RelatedDocuments_Entity.ort" in the Workspace Navigator View and select "Copy"
- Right click on the "Reporting" folder and select "Paste"
- The system opens a dialog; we change the name of the copied file to "RelatedDocuments_BarackObama.ort"
As a second step, we do the same for the JavaScript file belonging to the report template:
- Right click on "SelectEntity.js" in the Workspace Navigator View and select "Copy"
- Right click on the "Reporting" folder and select "Paste"
- The system opens a dialog; we change the name of the copied file to "SelectBarackObama.js"
The JavaScript file selects the data needed for the report. It defines that our main entity is "Barack Obama". (In the next major version of the software this will be done by using a data selection dialog.)
As a third step, we edit the new report template file (double click on "RelatedDocuments_BarackObama.ort") and insert the following data:
| Field | Input |
|---|---|
| Description | Generates a list of documents where Barack Obama is found |
| Output File Extension | html |
| Template Path | RelatedDocuments_Entity.html |
| Script Path | SelectBarackObama.js |
We save the report template file (Main Menu > File > Save).
As a final step, we modify the javascript file "SelectBarackObama.js" to select Barack Obama as main entity for our report:
- Double click on "SelectBarackObama.js"
Now, the javascript editor opens and we change the script to the following:
/**
* JavaScript to select an entity and store it into context of
* report template.
*/
var selectedEntity = project.getEntityByName("Barack Obama");
templateContext.put("selectedEntity", selectedEntity);
This selects the entity named "Barack Obama" from the selected project and stores it into the template context. During generation the system replaces the $selectedEntity variable in the html template "RelatedDocuments_Entity.html" with the entity for Barack Obama.
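To make the hand-off between script and template concrete, the following sketch runs the same two script lines against stand-in objects and then fills a one-line template. The `project` and `templateContext` stubs, the `name` property and the `render` helper are assumptions made for illustration only; in the application these objects are provided by the report generator and the substitution is performed by the templating engine, not by code like this.

```javascript
// Stand-in for the predefined project object (assumption, illustrative data only).
const project = {
  entities: [{ name: "Barack Obama" }, { name: "Franz Marc" }],
  getEntityByName(name) {
    return this.entities.find((entity) => entity.name === name) || null;
  },
};

// Stand-in for the template context: a simple key/value store.
const templateContext = {
  values: {},
  put(key, value) { this.values[key] = value; },
  get(key) { return this.values[key]; },
};

// The two lines from SelectBarackObama.js, unchanged.
var selectedEntity = project.getEntityByName("Barack Obama");
templateContext.put("selectedEntity", selectedEntity);

// Hypothetical, greatly simplified substitution of $variable.property references.
function render(template, context) {
  return template.replace(/\$(\w+)\.(\w+)/g, (match, variable, property) => {
    const value = context.get(variable);
    return value ? String(value[property]) : match;
  });
}

const html = render("<h1>Documents related to $selectedEntity.name</h1>", templateContext);
console.log(html);
```

The point of the sketch is the order of events: the script runs first and fills the context, and only then is the template evaluated against that context.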
After saving the JavaScript file, we can test it by right-clicking "RelatedDocuments_BarackObama.ort" and selecting "Generate Report".
The new custom report also shows up in the Reports View:
Note: This complicated procedure of manually editing the JavaScript file to select data will go away in the next major version of the software. Instead, a selection dialog (similar to the one used to select the source project) will be used to select the entity (or other data) for the report.
(Advanced topic) Improve the report to include similar entities
Our custom report uses the "RelatedDocuments_Entity.html" output template to embed the project's entity data into an HTML page. Now we want to improve our report with the following goal:
- List all documents related to the entity named "Barack Obama" and similar named entities
The entity extraction engine has a normalisation step which tries to match similar names and combine these name variants into a single entity. This way we avoid having too many entities which represent the same person. However, there is a limit to the system's ability to decide which names are similar enough. If we look at the entity browser view of our application, we see that there are quite a few entities listed which represent "Barack Obama":
We now want to improve our report so that it also includes documents which contain one of the different entities above. (Please note that by editing the name variant database of the software, we can avoid having these different entities for the same person. How to do this will be covered in a different tutorial.)
In order to do so, we need to do the following:
- Change the data selection script "SelectBarackObama.js" to obtain a list of entities with names starting with "Bara"
- Change the output template "RelatedDocuments_Entity.html" so that it lists the documents of a list of entities rather than of a single one
Select a list of entities representing Barack Obama
The JavaScript file "SelectBarackObama.js" defines which data is available to the report generator for inclusion in the final report. In a first step we modify it to include all entities with names starting with the letters "Bara". (We chose these four letters by looking at the entity browser view.)
We open the "SelectBarackObama.js" script from the configuration project by double-clicking it:
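/**
* JavaScript to select an entity and store it into context of
* report template.
*/
var selectedEntity = project.getEntityByName("Barack Obama");
templateContext.put("selectedEntity", selectedEntity);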
We see that the selectedEntity variable is defined in a first step by calling the function property getEntityByName of the project object. This object is a predefined object which is the entry point into the analysis data of the source project for the report. The source project in turn is selected during report generation from a selection dialog.
Now we want to define a list of entities with names starting with "Bara". We look into the online documentation to find a suitable function of the project object which provides this list:
- From the main menu open Help > Help Contents > OSINT Suite User Guide > Reference > Reporting > Data Objects
- Review the table showing the object properties of the project object
The function property project.getEntitiesByNamePattern(namePattern) seems to be suitable for our purposes. It needs a regular expression pattern as parameter to match against the available entities in the system.
Now, we adapt the javascript code as follows:
var selectedEntities = project.getEntitiesByNamePattern("Bara.*");
Note: in order to match "Bara" as the start of an entity name we provided the pattern "Bara.*" which is a regular expression pattern. Soon, we will provide a tutorial to show you how to write these patterns to match text.
After selection of the entities, we store them in the templateContext which defines a set of data available to the report output template.
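Put together, the adapted script can be tried outside the application with a sketch like the one below. The `project` and `templateContext` objects are stand-ins, and the entity names are invented sample data. Two further assumptions are made here and are not taken from the product documentation: that the context key is named `selectedEntities` (following the variable name), and that the pattern has to match the whole entity name.

```javascript
// Stand-in for the predefined project object (illustrative data, not real analysis results).
const project = {
  entities: [
    { name: "Barack Obama" },
    { name: "Barack H. Obama" },
    { name: "Barack Hussein Obama" },
    { name: "Michelle Obama" },
  ],
  // Assumption: the pattern must match the entire entity name.
  getEntitiesByNamePattern(namePattern) {
    const regex = new RegExp("^(?:" + namePattern + ")$");
    return this.entities.filter((entity) => regex.test(entity.name));
  },
};

// Stand-in for the template context: a simple key/value store.
const templateContext = {
  values: {},
  put(key, value) { this.values[key] = value; },
  get(key) { return this.values[key]; },
};

// The adapted selection script: select by pattern, then store the list in the context.
var selectedEntities = project.getEntitiesByNamePattern("Bara.*");
templateContext.put("selectedEntities", selectedEntities);

// Show which of the sample names were selected.
console.log(selectedEntities.map((entity) => entity.name));
```

In the pattern, "Bara" matches those four letters literally and ".*" matches any remaining characters, so "Michelle Obama" is not selected while all three "Barack ..." variants are.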
Adapt the output template to show documents relating to a list of entities
After changing the data selection script, we need to adapt the output template "RelatedDocuments_Entity.html" to show all documents related to a list of entities and not only to a single entity.
We do the following:
- Copy the RelatedDocuments_Entity.html to a new file:
  - Right click on "RelatedDocuments_Entity.html" in the Workspace Navigator View and select "Copy"
  - Right click on the "Reporting" folder and select "Paste"
  - The system opens a dialog; we change the name of the copied file to "RelatedDocuments_Entities.html"
- Adapt the report template file to use "RelatedDocuments_Entities.html" as output template instead of "RelatedDocuments_Entity.html"
Now, we open the new "RelatedDocuments_Entities.html" file in a text editor (right mouse click > Open With > Text Editor) and edit it.
Since the system internally uses Apache Velocity as its templating engine (see the User Guide), the output template consists mainly of HTML code with variables starting with $ and directives starting with #:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
The current output template has a foreach loop directive which loops over the list of all documents related to the selectedEntity (this is the variable we have defined in the JavaScript file).
Now, in order to show all documents of a list of selectedEntities (see javascript above), we need to add another foreach loop directive to first loop over all entities and then internally to loop over all related documents. The resulting code of the output template looks like this (new directives in bold):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
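The nesting of the two loops can also be sketched in plain JavaScript, independent of the template syntax: an outer loop over the selected entities and an inner loop over each entity's related documents. The property names `name`, `documents` and `title`, the `renderRelatedDocuments` helper and the sample data are hypothetical; the real object properties are listed in the Data Objects reference of the User Guide.

```javascript
// Hypothetical sample data standing in for the selectedEntities list.
const selectedEntities = [
  { name: "Barack Obama", documents: [{ title: "Doc A" }, { title: "Doc B" }] },
  { name: "Barack H. Obama", documents: [{ title: "Doc C" }] },
];

// Outer loop: entities. Inner loop: documents related to the current entity.
function renderRelatedDocuments(entities) {
  const lines = [];
  for (const entity of entities) {
    lines.push("<h2>" + entity.name + "</h2>");
    lines.push("<ul>");
    for (const doc of entity.documents) {
      lines.push("  <li>" + doc.title + "</li>");
    }
    lines.push("</ul>");
  }
  return lines.join("\n");
}

const reportBody = renderRelatedDocuments(selectedEntities);
console.log(reportBody);
```

With this structure, a document that is related to several of the selected entities is listed once under each of them; removing such duplicates would be a separate step.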
After saving the new output template, we can test our new report by right-clicking it in the configuration project and choosing "Generate Report".
Please find all files of this tutorial attached to this page. You can simply unzip them to disk, then select from the main menu File > Import > Documents > File System and import them to your configuration project.