Downloading of Bookmarks

Prerequisites: Creating a Case Project and Performing an Internet Search

See also: Setting HTTP Proxy Information

Bookmark files contain URLs that point to resources on the web. These resources are mostly web pages or files (such as .pdf files). To analyse these resources locally, they must be downloaded to the Case Project in the workspace. Once all the search result links (bookmarks) have been downloaded under the Bookmarks folder, the next step is to download the result pages as text files:

images_download\attachments\2588727\download-bookmarks.png

images_download\attachments\2588727\download-folder.png

After the download has finished, the system automatically starts extracting the text from the resources.

Downloads run in parallel, but in some cases the process still has to wait for URLs on slow servers to respond.
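The behaviour described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names (fetch, download_all), the worker count and the timeout value are all assumptions made for this example. It shows how parallel downloads combined with a timeout keep one slow server from holding up the rest of the batch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10.0):
    """Download one URL. The timeout (an assumed value) limits how long
    each blocking network operation may wait on a slow server."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetcher=fetch, max_workers=8):
    """Fetch all URLs concurrently.

    Returns two dicts keyed by URL: downloaded content and errors.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every URL at once; the pool runs up to max_workers in parallel.
        futures = {pool.submit(fetcher, url): url for url in urls}
        # Handle downloads as they finish, so fast servers are not
        # delayed by slow ones.
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # timeout, DNS failure, HTTP error, ...
                failures[url] = exc
    return results, failures
```

In this sketch a URL that times out ends up in the failures dictionary, while the remaining downloads complete normally.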

The files in the Documents folder are marked with a > to show that entity extraction has not yet been performed. Some files may be marked with a red indicator, which shows that text extraction has failed. In most cases, such a file is either of an unsupported type or does not contain enough relevant text.

The downloaded result pages and result files are given a name based on the result URL they were downloaded from. If a title can be extracted from the file's text, this title is shown as an overlay in the Workspace Navigator view. If text extraction fails to find a title, the original file name is shown instead.
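The naming and fallback rule can be sketched as follows. The exact naming scheme used by the application is not documented here, so the character replacement rule and the function names (file_name_from_url, display_label) are hypothetical. Only the fallback logic (title if available, otherwise file name) follows the description above.

```python
import re
from urllib.parse import urlsplit


def file_name_from_url(url):
    """Build a file-system-safe name from the URL's host and path.

    Assumed scheme: every run of characters other than letters, digits,
    '.', '_' and '-' is replaced by a single underscore.
    """
    parts = urlsplit(url)
    raw = parts.netloc + parts.path
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    return name or "download"


def display_label(url, extracted_title=None):
    """Prefer the extracted title; fall back to the URL-based file name."""
    title = (extracted_title or "").strip()
    return title if title else file_name_from_url(url)
```

With these assumptions, display_label("https://example.com/docs/report.pdf", "Annual Report") returns "Annual Report", and the same call without a title returns "example.com_docs_report.pdf".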

Reviewing result pages and files

Double-click one of the downloaded files (HTML files have a small “HTML” icon in front of them) to open it in the Document editor view, which shows the extracted text. Because entity extraction has not yet run, only the plain extracted text is shown, without any mark-up for entities.

images_download\attachments\2588727\02-download.png