New article: Making a Datomic system GDPR-compliant

It seems a number of people are interpreting this article as legal advice, so I added a disclaimer at the beginning: “this article is not legal advice; its goal is to give you options, not to tell you what you’re supposed to do.”

Having said that, I think the legal and ethical discussion around these issues is also worth having:

As someone who gets legal counselling about this (which may or may not be good), I’m very skeptical about these interpretations; that’s not how we read the GDPR at all here. The GDPR talks about user consent (the user should proactively consent to any processing of her personal data, and should be able to modify or withdraw that consent) and also talks about erasure, so presumably those are different things. “Not using/displaying data” is nothing more than abiding by consent; it’s not erasure. I do agree that ‘erasing data’ means making it hard to access more than it means ‘wiping out any occurrence of this sequence of bytes from the universe’, but I’m pretty sure it means more than “flagging the data as not to be used”. I know of some companies that were audited by the CNIL in France for GDPR-related issues, and I can tell you their approach was much stricter.

I don’t want to indulge in fear-selling: again, one of the main points of the article is that data erasure with Datomic is not that hard to achieve.

I also think we need to put ourselves in the shoes of our users, and genuinely ask ourselves what it means to protect privacy. Even if you have flagged the data as ‘must not be processed / read’, what guarantees that this flagging metadata won’t be left behind in a future refactoring or data migration? How do you know your successors will be as ethical as you are, and discipline themselves to say no when the manager asks for an export of all emails in the database? I don’t think some metadata is an appropriate level of protection here; an appropriate level of protection might be you having to tell your manager “this data has been erased for privacy-regulation reasons, and we can’t retrieve it with a database query, and if we want to retrieve it we’ll have to go all the way to the datacenter hard drives and unreliably scan them for residual data, and by the way I’ve never done that so it’s likely to take weeks”.
