The Centers for Disease Control and Prevention (CDC) bought access to location data harvested from tens of millions of phones in the U.S. to analyze compliance with curfews, track patterns of people visiting K-12 schools, and specifically monitor the effectiveness of policy in the Navajo Nation.
CDC documents show that the agency planned to use phone location data to monitor schools and churches, and wanted to use the data for many non-COVID-19 purposes. The documents also indicate that although the CDC cited COVID-19 as a reason to buy access to the data more quickly, it planned to use it for more general CDC purposes.
Location data is information on a device’s location sourced from the phone, which can show where a person lives, works, and travels. The type of data the CDC bought was aggregated, meaning it was designed to follow trends that emerge from the movements of groups of people, but researchers have repeatedly raised concerns about how location data can be deanonymized and used to track specific people.
The documents reveal the CDC’s expansive plans last year to use location data from a highly controversial data broker. The CDC paid SafeGraph $420,000 for access to one year of data. The company counts Peter Thiel and the former chief of Saudi intelligence among its investors. Google banned the company from the Play Store in June.
The CDC used the data to monitor curfews. The documents say that SafeGraph’s information “has been critical for ongoing response efforts, like hourly monitoring of activity in curfew zones or thorough counts of visits to participating pharmacies for vaccine monitoring.”
“The CDC seems to have purposefully created an open-ended list of use cases, which included monitoring curfews, neighbor-to-neighbor visits, visits to churches, schools, and pharmacies, and various analyses with this data specifically focused on ‘violence,’” Zach Edwards, a cybersecurity researcher who closely tracks the data marketplace, told Motherboard in an online chat after reviewing the documents.
At the beginning of the pandemic, cell phone location data was seen as a potentially helpful tool. Multiple media organizations, including the New York Times, used location data supplied by companies in the industry to show where people were traveling once lockdowns started to lift, or to highlight that poorer communities were unable to shelter in place as much as wealthier ones.
The COVID-19 pandemic has been a flashpoint in the broader culture war, with conservatives and anti-vaccine groups protesting against government mask and vaccine mandates. They have also voiced a fear that vaccine passports would be used as a tracking or surveillance tool, framing vaccine refusal as a civil liberties issue.
Robert F. Kennedy Jr.’s Children’s Health Defense, an influential and well-funded anti-vaccine group in the U.S., has promoted fears that digital vaccine certificates could be used to surveil citizens. QAnon promoter Dustin Nemos wrote on Telegram that vaccine passports are “a Trojan horse being used to build a completely new type of controlled and surveilled community. The freedom we appreciate today will be a distant memory.”
Against that inflamed backdrop, the use of cell phone location data for such a wide variety of tracking measures, even if effective for becoming better informed about the pandemic’s spread or for shaping policy, is likely to be controversial. It is also likely to give anti-vaccine groups a real-world data point on which to pin their darkest warnings.
The procurement documents say, “This is an URGENT COVID-19 PR [procurement request],” and ask for the purchase to be expedited. But some of the use cases are not explicitly linked to the COVID-19 pandemic. For example, one reads, “Research points of interest for physical activity and chronic disease prevention like visits to parks, gyms, or weight management companies.”
Another section of the document elaborates on the location data’s use for non-COVID-19 related programs.
“CDC also plans to use mobile data and services acquired through this acquisition to support non-COVID-19 programmatic areas and public health priorities across the agency, [including but] not limited to travel to parks and green areas, physical activity and mode of travel, and population migration before, during, and after natural disasters,” it reads. “The mobility data acquired under this contract will be available for CDC agency-wide use and will support numerous CDC priorities.”
The CDC did not respond to multiple emails requesting comment on which use cases it actually deployed SafeGraph data for.
SafeGraph is part of the ballooning location data industry, and has previously shared datasets covering 18 million cell phones from the United States. The documents say this acquisition is for geographically representative data, “i.e., derived from at least 20 million active mobile users per day over the United States.”
Generally, companies in this industry ask or pay app developers to include location data-gathering code in their apps. The location data then funnels up to companies that may resell the raw location data outright or package it into products.
SafeGraph sells both. On the developed product side, SafeGraph has several products. “Places” covers points of interest (POIs), such as where particular stores or buildings are located. “Patterns” is based on mobile phone location data that can show how long people visit a location, “where they came from,” and “where else they go,” according to SafeGraph’s website.
More recently, SafeGraph has started offering aggregated transaction data, showing how much consumers typically spend at specific locations, under its “Spend” product. SafeGraph sells its products to various industries, such as insurance, real estate, and advertising. These products contain aggregated data on movements and spending rather than the location of specific devices.
Motherboard previously bought a set of SafeGraph location data for $200. The data was aggregated, meaning it was not supposed to pinpoint the movements of specific devices and hence individuals. Still, at the time, Edwards said, “In my opinion, the SafeGraph data is way beyond any safe thresholds.” Edwards pointed to a search result in SafeGraph’s data portal that showed data related to a specific doctor’s office, illustrating how finely tuned the company’s data can be. Theoretically, an attacker could use that data to attempt to unmask particular users, something researchers have repeatedly demonstrated is possible.
In January 2019, the Illinois Department of Transportation bought such data from SafeGraph relating to over 5 million phones, the Electronic Frontier Foundation (EFF), an activist organization, previously found.
The CDC documents show that the agency bought access to SafeGraph’s “Weekly Patterns Data,” “U.S. Core Place Data,” and “Neighborhood Patterns Data.” That final product includes information such as home dwelling time and is aggregated by state and census block.
“SafeGraph offers visitor data at the Census Block Group level that permits extremely accurate insights related to age, citizenship status, gender, race, income, and more,” one of the CDC documents says.
Both SafeGraph and the CDC have previously spoken publicly about their partnership, but not in the detail disclosed in the documents. For example, the CDC published an analysis in September 2020 that examined whether people around the country were following stay-at-home orders, which appeared to use SafeGraph data.
SafeGraph wrote in a blog post in April 2020: “To play our part in the fight against the COVID-19 health crisis, and its devastating impact on the global economy, we decided to expand our program further, making our foot traffic data free for non-profit organizations and government agencies at the local, state, and federal level.” Multiple location data companies touted their data as a potential tool for mitigating the pandemic during its peak in the United States and provided data to government and media organizations.
A year later, the CDC purchased access to the data because SafeGraph no longer wanted to provide it for free, according to the documents. The Data Use Agreement for the in-kind provided data was set to expire on March 31, 2021, the documents add. Access to the data remained important as the U.S. opened up, the CDC argued in the documents.
“CDC is interested in continued access to this mobility data as the country opens back up. This data is used by several teams/groups in the response and have been resulting in deeper insights into the pandemic as it pertains to human behavior,” one section reads.
Researchers at the EFF separately obtained documents concerning the CDC’s purchase of similar location data products from a company called Cubeiq, as well as the SafeGraph documents. The EFF shared those documents with Motherboard. They showed that the CDC also asked to speed up the purchase of Cubeiq’s data because of COVID-19 and intended to use it for non-COVID-19 purposes. The documents also listed the same potential use cases for Cubeiq’s data as the SafeGraph documents did.
Google banned SafeGraph from its Google Play Store in June, meaning any app developers using SafeGraph’s code had to remove it from their apps or face having their apps removed from the store. It is not entirely clear how effective this ban has been: SafeGraph has previously said it obtains location data via Veraset, a spin-off company that interfaces with the app developers.