Public Data and Anticorruption Research

Phil Kittock, Cell Lead & Sofia Vargas, Analyst

Public data is essential for anticorruption research. Many countries publish online databases of public records of all varieties, from court filings to corporate registries, from timber exports to plane registries. These are critical sources of credible information for anticorruption researchers, investigative journalists, and civil society groups around the world.  

These datasets range in structure from individual records to bulk data, and every type in between. Collecting raw data takes us only halfway to the kind of impactful analysis we seek to provide. Any individual source of information, from a full corporate registry to a single judicial transcript, can be useful. However, keeping data sources separate keeps insight isolated, which means far more work is needed to discover illicit activity.

By integrating and comparing datasets across formats and jurisdictions, C4ADS’ Organized Crime and Grand Corruption Cell is able to understand the transnational illicit networks we research, and the systems in which they operate, more comprehensively and efficiently. When datasets are joined together, individuals and entities that appear separate can turn out to form networks, while examining activities allows us to identify interesting trends in large datasets.
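
To make the idea of joining datasets concrete, the following is a minimal sketch in Python and not a description of C4ADS' actual tooling. It assumes two small, entirely hypothetical datasets (a corporate registry and a set of property records) and links them on a normalized name. The field names, sample records, and the helper names `normalize_name` and `link_records` are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase a name, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]", " ", without_accents.lower())
    return " ".join(cleaned.split())


def link_records(*datasets):
    """Group records by normalized name; keep names seen in more than one source."""
    index = defaultdict(list)
    for source, records in datasets:
        for record in records:
            index[normalize_name(record["name"])].append((source, record))
    return {
        name: hits
        for name, hits in index.items()
        if len({source for source, _ in hits}) > 1
    }


# Hypothetical sample data: none of these names refer to real entities.
corporate_registry = [
    {"name": "Example Holdings, LLC", "officer": "Person A"},
    {"name": "Unrelated Co.", "officer": "Person Z"},
]
property_records = [
    {"name": "EXAMPLE HOLDINGS LLC", "parcel": "parcel-001"},
]

matches = link_records(
    ("corporate_registry", corporate_registry),
    ("property_records", property_records),
)
for name, hits in matches.items():
    print(name, [source for source, _ in hits])
```

Exact matching on a normalized name is the simplest possible linking rule. Real records usually also need transliteration handling, fuzzy matching, and manual review before two entries are treated as the same entity.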


Mapping out Networks:

In the modern world, criminal networks, including those that facilitate North Korean overseas labor or that specialize in wildlife trafficking worldwide, are able to move fluidly from jurisdiction to jurisdiction, deftly navigating the boundaries between the illicit and licit worlds. Whenever illicit networks touch the licit system, they leave a paper trail that researchers can track, and C4ADS uses public records to identify the networks that operate in the illicit economy. As we investigate each person or company we come across in our research, we uncover dozens of related records. The trick to making these data points useful and actionable is knowing how to tie them all together, and what they mean when combined.

Investigating the people and companies in a network can lead to the discovery of more specific data points, like luxury plane travel, export licenses, and property purchases, which all provide a better picture of how a particular network actually operates. When a Venezuelan businessman was indicted by US prosecutors in November 2018 for embezzling funds from the Venezuelan state-owned oil company PDVSA, we started to investigate his business dealings and the network around him. After structuring the data we uncovered in the course of our research, we found a key business associate who appeared to facilitate the business dealings of this network. This business associate co-owned businesses in the US, and we also discovered that the two were family.

From there, we searched property records and aircraft registries to see if any of the companies or individuals in the network owned luxury planes, homes, or other big-ticket items. Companies in the network owned luxury homes and a plane in South Florida, a key piece of information that led us to activities-based analysis.
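
One way to picture this kind of network mapping is as a graph search. The sketch below is an illustration under stated assumptions, not the method used in the case above: it treats hypothetical ownership and directorship records as links, then walks outward from one person to list the assets held within a few hops. All entity names, relationship labels, and the helper names `build_graph` and `reachable` are invented for the example.

```python
from collections import defaultdict, deque

# Hypothetical (entity, relationship, entity) records; none refer to real parties.
links = [
    ("Person A", "co-owns", "Company X"),
    ("Person B", "co-owns", "Company X"),
    ("Person B", "directs", "Company Y"),
    ("Company Y", "owns", "Aircraft 1"),
    ("Company X", "owns", "Property 1"),
    ("Person C", "owns", "Property 2"),
]


def build_graph(links):
    """Build an undirected adjacency map from relationship records."""
    graph = defaultdict(set)
    for left, _relationship, right in links:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def reachable(graph, seed, max_hops):
    """Breadth-first search: return {entity: hops} for everything within max_hops of seed."""
    distances = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if distances[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


graph = build_graph(links)
network = reachable(graph, "Person A", max_hops=4)
assets = sorted(e for e in network if e.startswith(("Aircraft", "Property")))
print(assets)
```

In this toy data, the aircraft only surfaces because Person B links the two companies, which mirrors how a single associate can connect records that otherwise look unrelated. The hop limit matters in practice, since a large registry will eventually connect almost everyone to almost everyone else.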


Tracking Activities:

To use activities-based analysis to investigate aircraft, we apply the knowledge we already have about illicit activity involving aircraft to build a pattern we can look for in the data. Since the US government publishes the registration numbers of aircraft owned by now-indicted and sanctioned individuals, there are readily available examples of illicit activity to analyze in open-source data. Using just an aircraft registration number published in a press release, we are able to connect an aircraft to its owner in an aviation registry and, at the same time, view its historical flight data on flight-tracking platforms like ADS-B Exchange.


After analyzing historical flight data to identify where aircraft of interest have been in the past, we can find other aircraft that follow a similar pattern. Additional information, like whether or not the aircraft is registered in a secrecy jurisdiction, provides further clues of suspicious and potentially illicit activity.
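
The pattern-matching step described above can be sketched as a similarity score between route profiles. This is a simplified illustration, assuming flight histories have already been reduced to origin and destination pairs. The airport codes, tail identifiers, and the helper names `route_profile` and `cosine_similarity` are hypothetical, and cosine similarity is our choice for the example rather than a documented C4ADS technique.

```python
from collections import Counter
from math import sqrt


def route_profile(flights):
    """Count flights per unordered airport pair, so A to B and B to A count together."""
    return Counter(tuple(sorted(pair)) for pair in flights)


def cosine_similarity(a, b):
    """Cosine similarity between two route-count profiles (0.0 means no shared routes)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical flight histories as (origin, destination) pairs.
reference_flights = [("AAA", "BBB"), ("BBB", "AAA"), ("AAA", "BBB"), ("AAA", "CCC")]
candidate_flights = {
    "TAIL-1": [("AAA", "BBB"), ("BBB", "AAA"), ("BBB", "AAA")],
    "TAIL-2": [("DDD", "EEE"), ("EEE", "DDD")],
}

reference = route_profile(reference_flights)
scores = {
    tail: cosine_similarity(reference, route_profile(flights))
    for tail, flights in candidate_flights.items()
}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for tail, score in ranked:
    print(tail, round(score, 3))
```

A high score only says that two aircraft fly similar routes. It is a lead for further research, to be weighed alongside other clues such as registration in a secrecy jurisdiction, and not evidence of wrongdoing on its own.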

In the previously mentioned money laundering case involving the Venezuelan businessman, a sanctioned aircraft’s transmissions revealed that it flew almost exclusively between South Florida and Caracas in the months before it was sanctioned. A different, unsanctioned plane that was publicly reported to operate in connection with the same network showed flight patterns almost identical to those of the OFAC-sanctioned plane. This raised many interesting questions, and identifying this activity allowed us to further our investigation and uncover new layers of the network’s activity.


The Impact for Anticorruption Research

Identifying patterns or typologies of crime provides a useful methodology for finding illicit activity in open-source data. Using both network- and activities-based analyses allows us to sift through an immense amount of public data to find the cases that are most interesting to the public, most relevant to our research, and most impactful for our partners. Furthermore, although a simple network-based analytical approach provides a useful roadmap for researching corruption, being able to highlight the actual activities of a network makes the results of anticorruption research much more dynamic, enlightening, and actionable.

At the upcoming 2019 OECD Anti-Corruption & Integrity Forum, we will dive into these issues and our methodologies further to provide a glimpse of how C4ADS analysts and partners work to untangle the webs woven by illicit actors.