Big Data, Electronic Evidence and Criminal Defence

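RAV seminar “Digitale Beweismittel: Von Handydaten bis Umgebungsintelligenz – Strafverteidigung im Zeichen von Big Data” (“Digital Evidence: From Mobile Phone Data to Ambient Intelligence – Criminal Defence in the Age of Big Data”), 11 Nov. 2017, Berlin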

Big Data electronic evidence increasingly dominates the evidentiary procedure in serious and organised crime cases before European criminal courts. Yet law enforcement and courtroom participants are often still in ‘analogue’ mode and are only beginning to understand the nature of digital evidence, the technologies used to process digital data and, ultimately, its revolutionising impact on their role in criminal procedure. Criminal defence lawyers appear to face a particular challenge in handling Big Data electronic evidence (eEvidence): they are often not sufficiently prepared or equipped to process digital evidentiary data, whereas prosecutors have the technological power of law enforcement agencies at their disposal.

Automated Justice: Algorithms, Big Data and Criminal Justice Systems


Organized by Assoc. Prof. Dr. Aleš Završnik (EURIAS-Fellow)

“From predictive policing to probation risk scores, the potential uses of big data in criminal justice systems pose serious legal and ethical challenges relating to due process, discrimination, and the presumption of innocence.”

Friday, April 20, 2018, 9:00am–5:30pm


Provalis Software

Provalis Software QDA Miner and WordStat are used to analyze unstructured text data. They belong to the category of CAQDAS (Computer Assisted Qualitative Data Analysis Software) and provide strong text mining, content analysis and visualization features. Provalis Software is not built for terabytes of data; one way to analyze and select relevant data from raw digital data on a Big Data scale is therefore to use dtSearch or NUIX (Proof Finder) and to import the pre-processed data into the software. Provalis Software can nevertheless process large data sets built from large numbers of text/data files in many different formats or from very different sources, such as social media (FB, Twitter, RSS feeds), emails (Outlook, Hotmail, Gmail) and many others.

What makes analysis of electronic case data with QDA Miner and WordStat so useful for crime analysts and criminal defence lawyers?

In short: QDA Miner and WordStat support many of the major analytical requirements for the criminal analysis of digital data, such as
·         integration of different sets of data in one database, e.g. from witness statements, documents, reports or digital data from smart phones, computers etc.
·         simple and qualified search functions, e.g. query by example
·         (automated) coding of content e.g. to identify incidents described in the indictment
·         consistency analysis to find agreements and contradictions among witnesses
·         pattern and network analysis to trace relationships between suspects
·         geo-Mapping to identify spatial patterns
·         visualization of findings
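
The pattern and network analysis item in the list above can be illustrated independently of any particular tool. The following is a minimal Python sketch, not a description of how QDA Miner or WordStat work internally; the call records and the function names `contact_counts` and `degree` are hypothetical and invented for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical call records as (caller, callee) pairs, invented for illustration.
CALLS = [
    ("A", "B"), ("B", "A"), ("A", "C"),
    ("C", "D"), ("A", "B"), ("D", "C"),
]


def contact_counts(calls):
    """Count contacts per undirected pair, ignoring who initiated the call."""
    counts = Counter()
    for caller, callee in calls:
        pair = tuple(sorted((caller, callee)))
        counts[pair] += 1
    return counts


def degree(calls):
    """Return the number of distinct contacts for each person."""
    neighbours = defaultdict(set)
    for caller, callee in calls:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
    return {person: len(peers) for person, peers in neighbours.items()}


PAIRS = contact_counts(CALLS)
DEGREES = degree(CALLS)
```

Pairs with high counts and persons with many distinct contacts are natural starting points for a closer reading of the underlying communications; they are indicators, not findings.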

NUIX Proof Finder

Police forces around the world investigate Big Data digital evidence using NUIX. Proof Finder is a basic NUIX software tool released as a philanthropic project for 100 USD per year; it lets users learn the main functions of NUIX with databases of up to 15 GB. Proof Finder can handle data from mobile devices, hard drives, forensic images, file shares, Microsoft Outlook or Lotus Notes, and complex storage systems, e.g. by importing data from XRY or UFED. Proof Finder provides capabilities required for forensic analysis of disk images, including recovering deleted files, carving unallocated space and unidentified data items, fully indexing and navigating Windows Registry files, and a hex viewer to analyse files and file fragments.


MicroStrategy Desktop is a powerful platform for forensic analysis. It lets users explore large data sets, e.g. from smartphone data or from telecommunication or IP surveillance, build visualisations such as networks (telecommunication or social networks) or geo-mapping, and quickly identify relevant patterns (criminal networks, victim-perpetrator communications) and trends. Very different data sources can be integrated for analysis, e.g. Excel, social networks (FB, Twitter), web and cloud services or Dropbox. Important for beginners: MicroStrategy Desktop offers free Jump-Start courses, after which you can expect to be ready to analyse your data.


One close-to-perfect answer is offered by dtSearch, which not only integrates terabytes of different types of data into one holistic database but also makes these data quickly and easily searchable. dtSearch offers special functions for forensic searches, such as different options for searching emails, processing encrypted PDFs, credit card numbers, social network data, etc. Moreover, dtSearch is easy to use at a reliable basic level, yet it is also applied by leading players in the forensic LegalTech field for more sophisticated analysis.
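
To give an idea of what a search for credit card numbers involves, here is a minimal Python sketch. It is an illustration of the general technique (a pattern match followed by the Luhn checksum), not dtSearch's implementation; the pattern `CANDIDATE` and the functions `luhn_valid` and `find_card_numbers` are assumptions made for this example.

```python
import re

# Candidate pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counted from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in the text that look like card numbers and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step filters out most digit sequences that merely resemble card numbers, such as order or reference numbers, but a passing checksum does not prove that a card with that number exists.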

While enabling criminal defence lawyers to search electronic evidence of Big Data magnitude with simple search terms as well as for semantic patterns, dtSearch is also well suited to another crucial analytic issue: the selection of relevant data and data reduction. More sophisticated analysis of eEvidence (e.g. network analysis) requires transforming unstructured and heterogeneous data from different sources into structured databases. With dtSearch, data can be prepared for further analysis with other, more complex analysis tools such as QDA Miner or WordStat (Provalis Software).
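
The idea of selecting relevant data and bringing heterogeneous sources into one structured form can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the export workflow of dtSearch or the import format of QDA Miner; the sample records, the field names and the functions `normalise`, `select_relevant` and `to_csv` are hypothetical.

```python
import csv
import io

# Hypothetical items from two sources with differing field names.
EMAILS = [
    {"from": "a@example.org", "date": "2017-03-01", "body": "Meet at the harbour at nine."},
    {"from": "b@example.org", "date": "2017-03-02", "body": "Invoice attached."},
]
CHATS = [
    {"sender": "user1", "timestamp": "2017-03-01T21:05:00", "message": "Harbour is clear."},
]

FIELDS = ["source", "author", "date", "text"]


def normalise(source, item):
    """Map a source-specific record onto one common schema."""
    if source == "email":
        return {"source": source, "author": item["from"],
                "date": item["date"], "text": item["body"]}
    if source == "chat":
        return {"source": source, "author": item["sender"],
                "date": item["timestamp"][:10], "text": item["message"]}
    raise ValueError(f"unknown source: {source}")


def select_relevant(rows, terms):
    """Keep only rows whose text contains at least one search term."""
    lowered = [t.lower() for t in terms]
    return [r for r in rows if any(t in r["text"].lower() for t in lowered)]


def to_csv(rows):
    """Serialise the rows as CSV text for import into another tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


ROWS = [normalise("email", e) for e in EMAILS] + [normalise("chat", c) for c in CHATS]
SELECTED = select_relevant(ROWS, ["harbour"])
```

Calling `to_csv(SELECTED)` yields a small table with one row per relevant item, which is the kind of reduced, structured data set that more specialised analysis tools can work with.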

