Federal policy for self-driving cars pushes data sharing

Self-driving cars are expected to radically transform transportation as we know it.

But the tech-filled vehicles will become data goldmines for governments, manufacturers, and hackers — and the National Highway Traffic Safety Administration is making sure it gets access to the rich repository of information.

The agency today released its Federal Autonomous Vehicles Policy (PDF), a document that will govern the way self-driving cars are developed, regulated, and policed in the U.S. According to the policy paper, NHTSA hopes to broaden the guidelines’ reach by collaborating with the governments of Canada and Mexico. The federal guidelines will push autonomous vehicle manufacturers into sharing data about their failures with each other and with the government, a move that is already being met with resistance from the tech and automotive industries.

When self-driving cars crash, data from them must be retrieved by NHTSA and the manufacturer for crash reconstruction and analysis. “Vehicles should record, at a minimum, all information relevant to the [crash] and the performance of the system, so the circumstances of the event can be reconstructed,” the policy says. That record will then be shared with federal regulators and other manufacturers, and manufacturers are on the hook for ensuring their customers understand how their crash data will be distributed.

How that data will be shared between the companies racing against each other to build the first self-driving car is yet to be determined, but what’s clear is that tech companies aren’t happy about the idea. David Strickland, who represents Uber, Google, and Lyft through the Self-Driving Coalition for Safer Streets, told reporters today that “the devil is in the details” when it comes to data sharing, and that’s going to be a sticking point for all private companies involved, especially in a space as closely competitive as autonomous vehicles.

Strickland indicated that the industry would likely push back against the data sharing requirements. “There is competitor data, there’s confidential business information, there’s a number of aspects which have to be respected. But on the other hand, safety is a number one priority, and figuring out the right context and space that we can ensure that while protecting the data rights and, frankly, the property of all the innovators and manufacturers should be properly balanced and that’s going to take some time,” he said.

Of course, the NHTSA knows that convincing manufacturers to share their crash data isn’t going to be easy. The agency is exploring data sharing mechanisms that will keep data anonymous and avoid antitrust complaints. “While the specific data elements to be shared will need further refinement, the mechanisms for sharing can be established,” the federal guidelines say.

Car companies are not known for embracing the concept of open source. Exceptions make headlines, as when Tesla in 2014 made its patents available to any competitor who wants to use them “in good faith.” But Tesla still guards the driving data that powers its Autopilot autonomy features, which is gathered from the Tesla vehicle fleet. Tesla did, however, end up complying with both an initial request from the NHTSA for data logs from the fatal May 7 Model S crash, and with a modified follow-up request from the agency. George Hotz’s Autopilot-style highway self-driving startup Comma.ai open-sourced the driving data it used to build its first successful prototype, but it’s also keeping a far greater store of data to itself as a competitive hedge.

“We believe having the government force companies to do things will invite bureaucracy and slow innovation in general,” Hotz told TechCrunch, while also noting Comma would still need to examine the policy in more detail. “Openness should be a result of a desire to advance the state of the art, not forced at gunpoint.”

In recent conversations with TechCrunch, Uber, Lyft and GM have all separately pointed to the vast stores of driving data collected by their respective fleets as key competitive advantages in the race to develop truly effective autonomy. And of all the data used to train these systems, information related to how autonomous vehicles handle challenging conditions or actual impact events might be the most valuable in terms of creating a really robust, adaptable self-driving car. If Uber can handle more varied road conditions without engaging a human driver than Lyft can, for example, that’s going to translate into a lower cost per ride and better margins.

One proposed solution would require companies to upload their data to a third-party aggregator — yet another party that tech companies will likely want to exclude from accessing their data.

However, using a third-party data collection service outside the NHTSA may have the benefit of shielding autonomous vehicle manufacturers’ data from the general public. If NHTSA serves as a clearinghouse, data generated by Uber and others will be subject to public records requests, allowing journalists and members of the public to explore the datasets. Uber has already fought this battle on a state-by-state basis, trying to keep data on its drop-offs and pick-ups from the public. The ride-hailing company was unsuccessful in New York, where the city’s Taxi & Limousine Commission released data detailing 1.1 billion Uber rides. But on Uber’s home turf, the California Public Utilities Commission earlier this year rejected a request from TechCrunch for similar data, saying it was considered “confidential, for the present.”

Ford told TechCrunch in a statement that it “appreciates [U.S. Secretary of Transportation Anthony] Foxx’s leadership and NHTSA’s thoughtful efforts to advance the future of mobility and ensure the United States continues to drive transportation innovation,” but would not comment specifically on the company’s position regarding the prospect of having to share more data with government agencies or outside groups.

Whatever data automakers are forced to share, making the requirements at the federal level means that companies will be able to avoid this state-by-state variance. “I think it’s great that they’re at the federal level. I think that’ll be much simpler and easier to work with than a patchwork of state and local agencies,” Lyft CTO Chris Lambert told students at Northeastern today.

The NHTSA doesn’t want companies to share data with each other only when their cars crash; it also wants them to share information about when they’ve been hacked. “As with safety data, industry sharing on cybersecurity is important. Each industry member should not have to experience the same cyber vulnerabilities in order to learn from them,” the policy says.

However, while NHTSA wants to get its hands on crash data, the agency seems ambivalent about receiving vulnerability data. And that may be a good thing.

Although the government has made significant demands on autonomous vehicle design, safety, and data sharing, its requirements for cybersecurity are vague. The policy says manufacturers must “implement measures to protect data that are commensurate with the harm that would result from loss or unauthorized disclosure of the data,” but it leaves companies to determine the harm. A manufacturer should be held responsible if it introduced security vulnerabilities that allowed a hacker to take over and crash a car. But it’s easy to imagine scenarios where companies might try to shirk responsibility. For instance, what if a domestic abuser tracks his victim using geolocation data hacked from an autonomous vehicle and uses that information to harm her?

When it comes to cybersecurity, NHTSA says more research is necessary before regulatory standards can be finalized. The agency is staying out of the cyber debate — for now — and is instead asking manufacturers to share vulnerability information directly with each other through the industry cybersecurity clearinghouse Auto-ISAC.

Auto-ISAC forbids government agencies from being privy to its vulnerability disclosures, meaning that the federal government could be cut out of the cybersecurity picture. But, given increasing concerns about the hoarding of zero-day vulnerabilities by intelligence agencies and frequent legal challenges to government hacking, it might be ideal to keep information about how to hack self-driving vehicles secret from the government.

The debate over how, when, and with whom to share autonomous vehicle data will continue over the next few months as the NHTSA accepts feedback on its policy from industry stakeholders. But getting industry and consumers to agree with the Department of Transportation on data sharing is a challenge. The White House Frontiers Conference coming to Pittsburgh in October, where President Obama has said autonomous driving will be on the docket, may be the next chance we get to see this challenge played out in public.