A brief rant on converging compliance regimes.

Published on December 22, 2022.

Although I’ve never worked exclusively on compliance, much of my work over the past decade has touched on reconciling product and compliance goals. Over that time I’ve developed something of a pet theory on the evolution of compliance over the next five to ten years: I expect customer-oriented compliance to converge on a unified set of controls.

While today there’s a wide gap between GDPR, CCPA, HITRUST, FedRAMP and SOC 2, I generally expect the differences between these frameworks to narrow significantly over time around the premise that all customer data should be treated as sacred. Consequently, I expect the controls necessary to implement these frameworks to converge, such that the burden on an organization of complying with multiple compliance regimes will shrink. However, this convergence will occur in a haphazard series of jerky, unpredictable steps as various countries, states and regulators push towards stricter controls.

Engineering, product, and legal teams will be able to save their organizations hundreds of years of engineering time by making reasonable, conservative guesses about where these frameworks converge. This will, however, be painful, because these conservative positions will significantly limit what these businesses are able to do, in ways that make them less competitive in the short run against more aggressive peers. Rapidly evolving your company’s approach to simultaneously future-proof against regulatory convergence and maximize short-term windows to optimize growth through advertising and marketing efforts may well be one defining characteristic of successful businesses over this stretch.

It’s not clear to me whether this will actually be beneficial to consumers, as this sort of regulation will certainly reduce third-party advertising and marketing, but seems much less likely to reduce first-party advertising. This shift of momentum from third-party to first-party advertising will be particularly advantageous to companies that can already drive customer-facing network effects in their internal platforms (Meta, Google, ByteDance, etc.), and particularly hard on new upstarts hoping to one day compete with those well-established tech businesses.

It’s certainly possible that antitrust regulators will step in to reduce the first-party advertising advantage here, but it is by no means certain. The EU already seems intent on regulating these mega-platforms, but the US and China appear less interested, which is quite reasonable as their domestic platforms exert extraordinary influence across the world (and, awkwardly, domestically as well).

From the perspective of non-mega-platform companies, I think this is a natural continuation of the evolution from “data as value” to “data as risk”, and these sorts of companies will increasingly want to approach all stored data with an eye towards secure, restricted, and audited access.

If I had to pick just one implementation concept to focus on, I’d argue that field-level encryption should be a starting place for new application development, and in particular field-level encryption with distinct keys for each entity you might contract with (e.g. each business for a business-to-business product, each user for a consumer product).

Using field-level encryption as the foundational building block has some interesting advantages. To mention three: (1) deletion becomes trivial, including from backups, because destroying an entity’s key renders every copy of its encrypted data unreadable, (2) database-level data exfiltration is less damaging, as the data is not usable unless the paired keys are also extracted, and (3) it introduces exactly the right sort of pain for designing a higher-compliance system.
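To make the first two advantages concrete, here is a minimal sketch of per-tenant field-level encryption. Everything in it is an assumption made for illustration: `TenantKeyStore`, `encrypt_field`, and `decrypt_field` are hypothetical names, and the cipher is a toy built from Python’s standard `hmac` and `hashlib` modules so the example runs without dependencies. A real system would use a vetted authenticated cipher and a proper key management service instead.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from the tenant key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # TOY keystream (HMAC-SHA256 in counter mode); a stand-in for a real cipher.
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """Encrypt one field value; returns nonce + tag + ciphertext."""
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(data))
    body = bytes(a ^ b for a, b in zip(data, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def decrypt_field(key: bytes, blob: bytes) -> str:
    """Decrypt one field value; raises ValueError on a wrong key or tampering."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered ciphertext")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


class TenantKeyStore:
    """Hypothetical key store: one key per tenant, held outside the database."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def create(self, tenant_id: str) -> None:
        self._keys[tenant_id] = secrets.token_bytes(32)

    def get(self, tenant_id: str) -> bytes:
        return self._keys[tenant_id]  # KeyError once the key is destroyed

    def destroy(self, tenant_id: str) -> None:
        # "Deleting" a tenant: without the key, every copy of its
        # ciphertext (including copies in backups) is unreadable.
        self._keys.pop(tenant_id, None)


keys = TenantKeyStore()
keys.create("acme")
stored = encrypt_field(keys.get("acme"), "alice@acme.example")
backup_copy = stored  # a backup holds the same ciphertext, not the key
print(decrypt_field(keys.get("acme"), stored))
keys.destroy("acme")  # neither `stored` nor `backup_copy` can be read now
```

The design choice worth noticing is that the database only ever sees the output of `encrypt_field`, so a dump of the table (or of an old backup) is only as useful as the attacker’s access to the separate key store.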

Conversely, this will generally be a pain to work with. Many nominally trivial queries will no longer be possible within the database tier. (You can, of course, build something like a trigger to allow retrieving the keys within the database, but that would significantly undermine the value of this entire enterprise.) It also means that many queries you’d like to perform for data analysis are much harder to perform. This doesn’t have to be permanently true: you can choose to expose metadata about a given field by deliberately shaping the encrypted data (as a contrived example, you could choose to prefix encrypted data from the US with “us_” and encrypted data from Canada with “ca_”), but it certainly is more work.
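The contrived prefix example might look something like the sketch below. The helper names (`seal`, `region_of`, `unseal`) are hypothetical, and random bytes stand in for real ciphertext, since the point is only that the region can be read without any key. It assumes region codes never contain an underscore.

```python
import base64
import secrets
from collections import Counter


def seal(region: str, ciphertext: bytes) -> str:
    # Store the region in the clear as a prefix; standard base64 never
    # contains "_", so the first underscore always ends the prefix.
    return f"{region}_{base64.b64encode(ciphertext).decode('ascii')}"


def region_of(stored: str) -> str:
    # Readable by the database or analytics tier with no key at all.
    return stored.split("_", 1)[0]


def unseal(stored: str) -> bytes:
    # Recover the ciphertext; decrypting it still requires the entity's key.
    return base64.b64decode(stored.split("_", 1)[1])


# Random bytes stand in for encrypted field values.
rows = [
    seal("us", secrets.token_bytes(24)),
    seal("ca", secrets.token_bytes(24)),
    seal("us", secrets.token_bytes(24)),
]

# An aggregate query that works without touching any keys.
rows_per_region = Counter(region_of(row) for row in rows)
print(rows_per_region)
```

Every bit of metadata exposed this way is also exposed to anyone who exfiltrates the database, so each prefix is a deliberate trade between analytical convenience and how much the ciphertext reveals.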

Anyway, I don’t have an important point to make here; it’s just a thread in the evolution of the internet software industry that I find particularly interesting at the moment. I do think that some particularly interesting businesses will be built on this observation over the next few years. This will span things like Vanta, others like MongoDB’s Client Side Encryption, and even more that make it possible to run performant data pipelines in a world where things are encrypted by hundreds of thousands of distinct keys. At certain moments, the internet feels static, immortal, and immutable, but the fun part is that it never actually stops changing.