Jacques Mattheij

Technology, Coding and Business

What is wrong with Microsoft buying GitHub

According to Bloomberg, Microsoft has agreed to buy GitHub. The acquisition of GitHub, which has reportedly been losing money, is a major development because of the company’s central role in the development of many open and closed source projects.

For the uninitiated here is what GitHub does in a nutshell: GitHub allows computer programmers from around the world to conveniently collaborate on projects, share bug reports and fix those bugs, and it allows the administration of some project documentation. The company provides this service for free to entities that provide their code for free to the world; for ‘closed source’ projects there is a fee to be paid. GitHub is in essence a friendly wrapper around Git, an open source version control system written by Linus Torvalds (of Linux fame) and many others. Git already does decentralized repository hosting out of the box but it does not support any kind of discovery method, bug tracking or documentation features. GitHub built a community of programmers around Git and many open source contributors consider GitHub too big to fail.

A company that is too big to fail and that loses money is a dangerous combination. People have warned that GitHub becoming as large as it did is problematic because it concentrates too much of the power to make or break the open source world in a single entity, all the more so because there were valid questions about GitHub’s financial viability. The model that GitHub has - sell their services to closed source companies but provide the service for free for open source groups - is only a good one if the closed source companies bring in enough funds to sustain the model. Some sort of solution should have been found - preferably in collaboration with the community - not an ‘exit’ to one of the biggest sharks in the tank.

So, here is what is wrong with this deal and why anybody active in the open source community should be upset that Microsoft is going to be the steward of this large body of code. For starters, Microsoft has a very long history of abusing its position vis-a-vis open source and other companies. I’m sure you’ll be able to tell I’m a cranky old guy by looking up the dates to some of these references, but ‘new boss, same as the old boss’ applies as far as I’m concerned. Yes, the new boss is a nicer guy but it’s the same corporate entity. Some concrete examples of the things Microsoft have done:

  • Abuse of their de facto monopoly position to squash competition, including abuse of the due-diligence (DD) process to gain insight into a competitor’s software

  • Bankrolling the SCO Lawsuit that ran for many years in order to harm Linux in the marketplace

  • Abuse of their monopoly position to unfairly compete with other browser vendors, including Netscape

  • Subverting open standards with a policy of Embrace, Extend, Extinguish

  • The recent Windows 10 Telemetry abuse

  • The acquisition of Skype, after which all the peer-to-peer traffic was routed through Microsoft, essentially allowing them to snoop on the conversations. To pre-empt the technical counter argument that this was done to improve the service: It only improved the service for some edge cases, for everybody else the service got worse because of the extra round-trip latency. So if that was the real reason then you’d have expected to see the traffic routed to the central servers only if one of those edge cases was detected.

  • Unfair advantage over competitors by using internal APIs for applications unavailable for competing products

  • Tied-sales and bundling

  • Abuse of Patents

The list is endless. So, this is the company that you want to trust with becoming the steward of a very large chunk of the open source world? Not me. And for all you closed source customers of GitHub, do you really want the company that abused a due-diligence process faking an acquisition interest to have the inside scoop on your code?

My own personal preferences would have been a federated model if the hosting costs are so problematic or an ‘infrastructure’ model where you basically pay for your usage of the service in a metered way.

I’ve deleted my GitHub account, I’ll find a way to replace it and if you’re halfway clever so should you. Foxes may change their coats, they don’t change their nature.

So Your Startup Received the Nightmare GDPR Letter

Apologies for the typo in the url…

Some dumb lawyer figured it would be fun to give GDPR trolls a form letter to use to inflict maximum damage on unsuspecting companies. The reason why this is dumb is simple: the GDPR serves a legitimate need but by decreasing the signal:noise ratio, handcrafted requests from users with a legitimate concern can get drowned in these ‘just because we can’ letters. It’s the legal equivalent of an exploit toolkit. In order to limit the damage somewhat I’ve worked up a recipe for an answer to such a form letter. The answer is made up of three parts: parts that you can automate (or should have already automated), parts that you can answer in a form letter of your own or through an update of your privacy policy and parts that you can refuse to answer because the requestor is placing an undue burden on you, the company. The GDPR is not meant as a means to harass companies any more than it is meant as a way to bankrupt them or cause them to spend a disproportionate amount of time on dealing with it, and this letter in part seems to aim to do just that ‘just because they can’.

Here is my example of an answer to this DSAR (Data Subject Access Request), and if at all possible you should probably automate this answer so it becomes a ‘self service’ affair, or at a minimum cuts down the overhead:

Dear Sir/Madam:

I am writing to you in your capacity as data protection officer for your company. 
I am a customer of yours

This would need some proof included with the letter, if that proof isn’t present you can mail back the recipient saying that they should include a customer id, handle or other way to identify them.

, and in light of recent events,

Would be nice if the letter writer cleared up this reference. But the law does not have any requirement for some ‘event’ to be present before a request can be made.

I am making this request for access to personal data pursuant to Article 15 of the 
General Data Protection Regulation. I am concerned that your company’s information 
practices may be putting my personal information at undue risk of exposure or in 
fact has breached its obligation to safeguard my personal information pursuant 
to <latest nasty cybersecurity event or thing in the news>.

Again, it seems as if the letter writer is fully aware that they are trolling: absent evidence there is no reason to believe any of this. So the letter writer is setting themselves up to be labelled ‘troll’ and then tries to mitigate that by handwaving.

I am including a copy of documentation necessary to verify my identity. 

That’s nice of them. If the information on the proof of ID is enough to identify the user in your systems then you can answer the letter. If not then that too is a valid answer (for instance: because you don’t store any data or because you can’t find the person in your systems).

If the person is a user of an online system then you could ask them to use some automated feature to answer their questions (which you now have a nice template for to anticipate such requests with), if not they will have to link their online identity in your systems with their presented ID through some other means (follow up email for instance).

If you require further information, please contact me at my address above.

So in case you can’t make the connection, send them an email and preferably provide them with an online tool that they can use to make a legitimate request from within the system which can then spit out the answer.

I would like you to be aware at the outset, that I anticipate reply to my request 
within one month as required under Article 12, failing which I will be forwarding 
my inquiry with a letter of complaint to the <appropriate data protection authority>.

One month is a reasonable term for a request like this, but it is almost clear from the wording that the claimant hopes you won’t be able to answer in time. Normally you would first expect the counterparty to simply answer your request; you might remind them near the deadline with a reference to the legal term and only when the deadline has passed would you inform them that you were going to contact your regulator. Also note that the GDPR has a provision to extend the deadline by another two months if you are overwhelmed with requests or if the request is overly complex. This one would qualify for ‘overly complex’ so you can mail back the claimant telling them their request will take up to three months to process, giving you some time to automate some of the elements.
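
If you want to track these terms in your ticketing system, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch and not legal advice: the function names are made up for illustration, and it assumes that ‘one month’ means the same day of the following month, falling back to the last day of that month when the day does not exist. Check with your own regulator how the term is counted in your jurisdiction.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def dsar_deadlines(received: date) -> dict:
    """Normal term (one month) and extended term (one plus two months)."""
    return {
        "reply_by": add_months(received, 1),
        "extended_reply_by": add_months(received, 3),
    }
```

For a letter received on 25 May 2018 this gives 25 June 2018 as the normal deadline and 25 August 2018 if you invoke the extension.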

The good news is that it is a form letter so the answers can be identical if you’ve been treating your users in the same way and the processes to recover the data are also identical. Eventually, if you don’t want people to waste your time this task should be automated as far as possible (should have already been probably!), and should be passed off to your support people once you have those.

Please advise as to the following:

1.   Please confirm to me whether or not my personal data is being processed. 
If it is, please provide me with the categories of personal data you have about 
me in your files and databases.

That’s an easy one if you do not process personal data of your end-users directly. For instance: if you’re a sub-processor the request is a fishing expedition and should be directed to the b2c entity that asks you to process the data on their behalf. And of course it goes without saying that you did not copy that data elsewhere. So at that point in time you can decide to cut the reply short with a reference to the DPA that you have with the controller and direct them there, or, alternatively, you could decide to continue to answer the letter in good faith even if you legally most likely would not have to. Realise that just answering the questions in good faith may open you up to potential trouble with the controller, so this might be a good time to decide on a joint strategy on how to deal with these requests directed at you. Note that in that case you don’t have a record of the data subject (the claimant) being a customer of your company (which they claimed in the opening part of the letter).

a.   In particular, please tell me what you know about me in your information 
systems, whether or not contained in databases, and including e-mail, documents 
on your networks, or voice or other media that you may store.

The database part should be easy provided the claimant really is a customer, and the email part I would resist if the communication wasn’t with the customer but internally. You are free to discuss your customers internally without those emails becoming part of a request such as this one because that would expose the privacy of someone else. Voice data is typically only recorded for a brief period of time and you can refer to your internal policy (which you will have to have documented anyway) with respect to the retention time of your switchboard. In your two man start-up I assume you have no such records at all so you can tell them that.

b.   Additionally, please advise me in which countries my personal data is stored, 
or accessible from. 

Again, that should be easy to answer if you know what you are doing.

In case you make use of cloud services to store or process my data, please include 
the countries in which the servers are located where my data are or were (in the 
past 12 months) stored.

That’s another easy one, in fact it is the same question as before. I’d answer only one.

c.   Please provide me with a copy of, or access to, my personal data that you have 
or are processing.

That should be easy too, and if it isn’t then you did not spend enough time on making your systems and policies GDPR compliant. This is one of the core rights that users get under the GDPR (and already had, under the DPD).
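
What ‘self service’ could look like for this question, as a minimal sketch: collect every row tied to the customer and hand it back as JSON. The table names, column names and the `export_personal_data` function are assumptions for the sake of the example. Your own schema will look different, and a real export would also have to cover files, logs and whatever your sub-processors hold.

```python
import json
import sqlite3

# Hypothetical schema: each table that holds personal data, mapped to the
# column that links a row to the customer. Fixed names, never user input.
PERSONAL_DATA_TABLES = {
    "customers": "id",
    "orders": "customer_id",
}


def export_personal_data(conn: sqlite3.Connection, customer_id: int) -> str:
    """Return all rows belonging to one customer as a JSON document."""
    export = {}
    for table, key_column in PERSONAL_DATA_TABLES.items():
        cursor = conn.execute(
            f"SELECT * FROM {table} WHERE {key_column} = ?", (customer_id,)
        )
        # Column names come from the cursor, so new columns are picked up
        # automatically when the schema changes.
        columns = [description[0] for description in cursor.description]
        export[table] = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps(export, indent=2, sort_keys=True)
```

Keeping the list of tables in one place has a side benefit: it doubles as an inventory of where personal data lives.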

2.   Please provide me with a detailed accounting of the specific uses that you have
made, are making, or will be making of my personal data.

Excellent question and deserves a frank answer, for the first two parts. But since you don’t have a crystal ball there is no way to answer the third.

3.   Please provide a list of all third parties with whom you have (or may have) shared 
my personal data.

You should have this, and you should disclose this in your privacy policy.

a.   If you cannot identify with certainty the specific third parties to whom you 
have disclosed my personal data, please provide a list of third parties to whom 
you may have disclosed my personal data.

That makes good sense as a fall back, but you really should have the previous one answered.

b.   Please also identify which jurisdictions that you have identified in 1(b) 
above that these third parties with whom you have or may have shared my personal 
data, from which these third parties have stored or can access my personal data. 

This sentence is bullshit, but the gist is clear. The claimant wishes to know under what applicable law the transfers took place so wants to know which laws govern the transfers that you have made.

Please also provide insight in the legal grounds for transferring my personal 
data to these jurisdictions.

Exactly. They are anticipating that you transferred the data outside of their jurisdiction without their consent.

Where you have done so, or are doing so, on the basis of appropriate safeguards, 
please provide a copy.

This makes sense, but you don’t actually need to prove this or give them copies, in my opinion it would be enough to state unequivocally that you have been a good steward of their data, and that you have appropriate DPAs in place with your sub-processors. If you want you could make a publicly accessible page on your website where you link to the DPAs. You can then refer to that. If you did transfer their data outside their and your jurisdiction without their consent then you might be in trouble. You should not have done that in the first place but now that you’re there you will need to try to control the damage. You could attempt to force the recipient into destroying the data or you might own up to the fact and take your lumps. The results are much the same as a breach, let’s hope you at least still have a record of what was transferred, when it happened and what the recipient intended to do with it and who they were.

c.   Additionally, I would like to know what safeguards have been put in place 
in relation to these third parties that you have identified in relation to the 
transfer of my personal data.

Again, the DPA would be the document to refer to. You do have a DPA in place with all your subprocessors?

4.   Please advise how long you store my personal data, and if retention is based 
upon the category of personal data, please identify how long each category is 
retained.

This should be a standard answer in any GDPR query. You could cut-and-paste your retention policy here or you can update your privacy policy and spell it out. In general, if answers to this letter are already available in your privacy policy then you could also refer the claimant to your privacy policy. If you want to take some risk you could increase the burden on them by first pointing out that they could have known the answer to most or all of the questions that have nothing to do with their personal data and thus conclude that they are not really concerned at all (because then they would have definitely read your privacy policy first), and that they are placing an undue burden on you by refusing to do their own reading first and using a cut-and-paste form letter. I would advise against that route but it is an option.
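
One way to keep this answer honest is to have a single retention table that both drives the clean-up job and produces the text you paste into the reply. A minimal sketch: the categories and periods below are invented for illustration and are not a recommendation, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Invented example values: replace with your own documented retention policy.
RETENTION_PERIODS = {
    "account_profile": timedelta(days=365),
    "support_tickets": timedelta(days=180),
    "web_server_logs": timedelta(days=30),
}


def is_past_retention(category: str, collected_on: date, today: date) -> bool:
    """True when a record of this category has been kept longer than allowed."""
    return today - collected_on > RETENTION_PERIODS[category]


def retention_answer() -> str:
    """Render the policy as plain text for a reply or a privacy policy."""
    lines = [
        f"{category}: kept for {period.days} days"
        for category, period in sorted(RETENTION_PERIODS.items())
    ]
    return "\n".join(lines)
```

Because the reply text is generated from the same table the clean-up job uses, what you tell the claimant cannot drift away from what your systems actually do.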

5.   If you are additionally collecting personal data about me from any source 
other than me, please provide me with all information about their source, as 
referred to in Article 14 of the GDPR.

This should not be happening, but if you’ve been ‘enriching’ profiles with data from others (data brokers) then you will need to disclose this here.

6.   If you are making automated decisions about me, including profiling, 
whether or not on the basis of Article 22 of the GDPR, please provide me with 
information concerning the basis for the logic in making such automated 
decisions, and the significance and consequences of such processing.

Again, the customer relationship would indicate to the claimant whether or not they would have an expectation that such mechanisms are active, there are only very few situations in which that is ambiguous. The claimant is - again - trying to show that they are not a troll but in fact are on a fishing expedition. I’d still answer the question in good faith.

7.   I would like to know whether or not my personal data has been disclosed 
inadvertently by your company in the past, or as a result of a security or 
privacy breach.

a.   If so, please advise as to the following details of each and any such breach:

    i.    a general description of what occurred;

    ii.    the date and time of the breach (or the best possible estimate);

    iii.    the date and time the breach was discovered;

    iv.    the source of the breach (either your own organization, or a 
    third party to whom you have transferred my personal data);

    v.    details of my personal data that was disclosed;

    vi.    your company’s assessment of the risk of harm to myself, as a 
    result of the breach;

Your assessment is irrelevant: it is the user’s responsibility to assess the risk of harm to themselves, but you could give an indication just to satisfy the question, with a note reflecting on that.

    vii.    a description of the measures taken or that will be taken to 
    prevent further unauthorized access to my personal data;

    viii.    contact information so that I can obtain more information and assistance in relation to such a breach, and

    ix.    information and advice on what I can do to protect myself 
    against any harms, including identity theft and fraud.

I hope you have no known but undisclosed breaches. If you do then you probably should make a page where you publicly disclose these details and get in front of the story. If you are not aware that you’ve had any breaches then you can simply answer that.

b.   If you are not able to state with any certainty whether such an exposure 
has taken place, through the use of appropriate technologies, please advise 
what mitigating steps you have taken, such as

    i.    Encryption of my personal data;

    ii.    Data minimization strategies; or,

    iii.    Anonymization or pseudonymization;

    iv.    Any other means

All good questions, that you should have already answered in your privacy policy.

8.   I would like to know your information policies and standards that you 
follow in relation to the safeguarding of my personal data, such as whether 
you adhere to ISO27001 for information security, and more particularly, 
your practices in relation to the following:

If you’ve been ISO27001 certified that answer could take the place of all that follows in this section. But if you’re not certified then you probably should answer the questions in some detail, this may help clarify for yourself if you feel that you are doing a good enough job.

a.   Please inform me whether you have backed up my personal data to tape, 
disk or other media, and where it is stored and how it is secured, including 
what steps you have taken to protect my personal data from loss or theft, 
and whether this includes encryption.

You could answer this in your privacy policy, however I would not answer the ‘where it is stored and how it is secured’ in too much detail. For all you know claimant today is hacker tomorrow, besides, these things can change. Just give a general idea of your backup policy and whether or not the backups are encrypted.

b.   Please also advise whether you have in place any technology which 
allows you with reasonable certainty to know whether or not my personal 
data has been disclosed, including but not limited to the following:

    i.    Intrusion detection systems;

    ii.    Firewall technologies;

    iii.    Access and identity management technologies;

    iv.    Database audit and/or security tools; or,

    v.    Behavioural analysis tools, log analysis tools, or audit tools;

You can answer this in a generic way: “The actual implementation of our ISMS is confidential and we do not give out this information to our end-users, but obviously we take great care to secure your data and where applicable any or all of the above will be deployed.”

9.   In regards to employees and contractors, please advise as to the following:

a.   What technologies or business procedures do you have to ensure that 
individuals within your organization will be monitored to ensure that they 
do not deliberately or inadvertently disclose personal data outside your 
company, through e-mail, web-mail or instant messaging, or otherwise.

That’s a good question. The answer should be something along the lines that you have your employees and contractors sign a ‘data confidentiality’ agreement and that upon end-of-contract you make them sign a ‘non-retention’ agreement.

b.   Have you had any circumstances in which employees or contractors 
have been dismissed, and/or been charged under criminal laws for accessing 
my personal data inappropriately, or if you are unable to determine this, 
of any customers, in the past twelve months.

Again, a good question, it’s a simple yes-no affair so simple to answer.

c.   Please advise as to what training and awareness measures you have taken 
in order to ensure that employees and contractors are accessing and processing 
my personal data in conformity with the General Data Protection Regulation.

Here you could refer to your onboarding and offboarding processes for employees, the annual privacy awareness refresher and oversight procedures.

Yours Sincerely,

    I. Rate

So, there you go, that should take the sting out of answering the ‘nightmare letter’. Even if not all the questions are appropriate (or appropriately worded) you can answer the bulk of them in relatively short order, and with automation you can cut the effort down further. If this is the worst you can expect under the GDPR then that’s not so bad, and the effect might actually be positive:

  • we get to know about a lot of undisclosed breaches

  • it will be clear who has their house in order and who hasn’t

  • if you don’t have your house in order just answering the letter will help you to get there

Note that this form letter makes your life easier in many ways, it’s a form letter so there can be a standardized process to answer it. A handcrafted letter would require a bit more work on your end to ensure it is properly answered.

GDPR Hysteria Part II, Nuts and Bolts, actionable advice

Part I of this is here, if you haven’t read it yet please do so before reading this installment, it will help to establish context.

So, after getting the most repeated fallacies about the GDPR out of the way let’s take a long look at the real life impact of the GDPR and why it is worded the way it is. After that we’ll look at some of the more important ‘nuts and bolts’ that will help you to decide what to do.

What’s important with any law is - besides the letter of the law - what the spirit of the law is, the law’s intent. In this case the intent is to curb some of the worst excesses in terms of privacy violation by corporate entities and to put control over their data back in the hands of the owners of that data: the private individuals that are the subject of the data (hence the term ‘data subjects’). There are countless examples a short search away of such violations, I’m not going to catalogue them here because there simply isn’t enough time in a day to do so but rest assured that the state of affairs is such that regulation could not come fast enough. Regular readers of my blog know that privacy is a subject that is dear to me and as such I welcome the GDPR and hope that the law will have its intended effect. Judging by the number of emails I’m receiving from companies that are almost begging me for consent to continue to spam me this is probably the only law ever that has a measurable positive impact in my life before it has become enforceable. (Ironically, these companies are breaking the law by sending these messages…)

Privacy is an important thing, it is so important that the framers of the Universal Declaration of Human Rights saw fit to include it in their short list of things that everybody should have. At the same time companies try to make money every way they can and if violating someone’s privacy is a quick way to make a buck then it sucks to be you. So it shouldn’t be a surprise that lawmakers have seen fit to put together a law that protects this right to privacy, and it isn’t a surprise that such legislation starts in Europe because we have some very good local examples of what a lack of privacy can do to people’s lives.

Depending on the kind of data, the size of your organization and the amount of data you process as well as your relationship vis-a-vis the owners of the data the impact of the GDPR can be anything from ‘nil’ to ‘huge’ with associated costs. I’m going to try to give a rough guide to how the new EU legislation will most likely impact your company and what your exposure is.

First of all, let’s look at the kind of data that you are processing. I’m going to look at several different scenarios to gauge the impact of the GDPR on each of those.

Kinds of data

Data comes in many different kinds, there is data associated with a person (an individual) and there is data that is not associated with a particular individual. Data that is not associated with a particular individual is not ‘in scope’ when it comes to the GDPR unless that data can be re-associated with that individual. This means that to all intents and purposes you should focus on data associated with a particular individual. In the context of most businesses operating online this data will be stored in a ‘profile’, a record or series of records that contain an identifier that can be used to assign the record to a particular person. Examples of such profile data are your social media postings, your medical history including your x-rays, the data an advertising agency keeps on you and so on. What the GDPR clarifies (and which was already the law anyway, but which companies routinely ignored) is that you are not the owner of that data. You are merely the steward of the data, and holding the data is a liability to your business. In other words: data is only an asset to your business if the value of that data outweighs the costs required to be a proper steward of that data. And being a proper steward of that data requires all kinds of processes to be in place (which you should have had anyway!), including a facility to delete the data at the first request of the user unless you are required to keep the data by law, to allow for corrections of the data and to give the ‘data subjects’ (the individuals) access to the data that you hold on them.

If that sounds like a burden to you then yes, you are right, it is a burden. But then again, data life-cycle management makes good sense, after all if you are hanging on to data that you have no business having or if you refuse to correct wrong data or if you refuse to tell people what you have on them then your company is not acting in the interest of the people whose data it holds. And that is a key item: the EU legislation has been written from the perspective of the citizens of the EU, not from the perspective of those that happen to come in possession of data on those subjects. Their interests are legitimate, but secondary.

The kinds of data that companies hold will have a significant impact on the burden of the GDPR, as a rule, the more critical the data to the individual, the higher the burden. So the burden for data that is already public is relatively small or non-existent. The burden for data that is highly confidential, such as your medical records or your financial dealings is much higher. The good news is that this proportionality was already well known long before the GDPR took effect and that’s why banks tend to work harder to keep your data private than that e-commerce store where you bought a pair of socks last week. Of course not all banks were equally concerned with this and some banks will see a larger amount of work under the GDPR than before. And one hospital may have done a better job in the past than another. This goes for businesses just the same, those businesses that already had their house in order and that have automated procedures in place and that in general have put themselves in the position of being stewards rather than owners of the data they hold are likely in a good position when it comes to dealing with the GDPR.

So it’s clear that the kind of data you hold on your data subjects makes a big difference.

Then there is the kind of data that has special consideration: PII. PII is short for Personally Identifiable Information. It is everything that allows you to find out who the data is about. Examples of obvious PII are full names and social security numbers. Not so obvious: an IP address. Really not obvious: the fact that you have a rare disease coupled with the name of the small town where you live. And many more examples like that. The simplest solution is to deal with all data that you hold on an individual as though it is PII. Better safe than sorry. But if you feel you must treat some of your users’ data in a different way then you need to carefully weigh what data you treat as PII and what data you are more cavalier with.
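
The rare disease plus small town example can be checked mechanically: count how often each combination of seemingly harmless fields occurs, and any combination that occurs only once points at exactly one person. A minimal sketch with made-up records and a hypothetical helper name:

```python
from collections import Counter


def identifying_combinations(records, fields):
    """Return the value combinations of `fields` that match exactly one record."""
    counts = Counter(tuple(record[field] for field in fields) for record in records)
    return [combination for combination, count in counts.items() if count == 1]


# Made-up records without names or addresses.
records = [
    {"town": "Smallville", "condition": "common cold"},
    {"town": "Smallville", "condition": "common cold"},
    {"town": "Smallville", "condition": "rare disease"},
    {"town": "Big City", "condition": "rare disease"},
    {"town": "Big City", "condition": "rare disease"},
]

print(identifying_combinations(records, ["town", "condition"]))
# prints [('Smallville', 'rare disease')]
```

Neither field singles anybody out on its own in this data set, but the combination does, which is why treating everything as PII is the safer default.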

Quantity of data

Companies that have thousands of records of data (for instance: an e-commerce company that holds a record of each sale) have a different exposure to risk than those that hold millions or even billions of records of data on their end users. For that reason such companies (which are most likely larger anyway) will have to expend more effort and will have to face a larger burden to become compliant than the smaller ones. Yes, the aforementioned life-cycle management is going to be the same amount of work for a small company as it is for a large one, but if you could do the work to collect the data there is no excuse for not properly managing it. That’s simply a cost of business. But SaaS companies that handle large amounts of customer data will want to make extra sure that all the i’s are dotted and the t’s are crossed because they are much juicier targets for wannabe hackers and that increases the risks of having that data exposed. Note that there are exceptions: for instance, in order to keep your bookkeeping in order you will have to record a certain amount of data and your local tax office will have its own rules about how long you should keep such data. But that is your bookkeeping, which probably has little to no direct connection to your live web services, and is only concerned with actual sales and refunds.

Size of the organization

The burden of compliance on a small organization will be lower because having a dedicated DPO (data protection officer) or CCO (chief compliance officer) is not going to be required for SMEs unless they deal with very large volumes of data or deal with very sensitive data. And in those cases you probably want to have that dedicated person anyway so the law really doesn’t make much difference.

But very small companies (say a 1 person company) dealing with extremely high risk data would do well to consider at least hiring an outsider for a bit to ensure that they are not exposed to extreme risks. In some cases, for very small entities processing very small amounts of non-critical data a DPO may not be required at all. Even then, since it is free I would advise to assign the role. Better safe than sorry and like that you are also immediately covered when the business grows.

Taking that all together

So, if you’re a small company and you deal with limited amounts of non-critical data then the impact is very well manageable. Somewhere in the middle, say 20 FTE (full time equivalents) and highly critical data the burden will be proportionally much larger and if you run a very large company then you most likely already have people dedicated to compliance anyway. So it is true that the law is not hitting all parties exactly equally but at the same time given the importance of the issue the rights of the data subjects outweigh the commercial interests of the companies and if the data was worth collecting in the first place it is also worth protecting and properly disposing of when you are done with it. And this difference in impact is mostly a relative one: in an absolute sense large companies and companies with lots of data or critical data will expend more to ensure they are compliant than smaller ones, it’s just that for the smaller ones the impact will be larger relative to their total turnover. This is a well known phenomenon and applies to all aspects of running a business, it’s called ‘economy of scale’ and is one of the reasons why software and software based services are so lucrative: the economies of scale are enormous.

Things you can no longer do

  • store all the data you can get your hands on forever

So far the norm for storing data on end users seems to have been ‘until we run out of diskspace’, and when that happened we found out that disks had gotten cheap enough to stop worrying about this altogether and simply append data without the possibility of ever removing anything. Yes, you read that right: up to the GDPR companies had a habit of never deleting anything. Even if you asked them to delete your data - which they grudgingly might have taken action on - the accepted solution was to mark the data as deleted but to leave the data itself in place. That’s going to go away now: if you request a deletion it can no longer be ignored and the data really has to be deleted. The only exception is if you are by law required to keep the data, but in that case I would advise to keep it in archival storage only.

The situation with backups is a little tricky - and this is one more of those areas where there is a lot of fearmongering saying you can’t possibly be compliant - but it is actually very simple: you keep the keys of the deleted records in a separate log which you back up as it is written and if for any reason you want to restore a backup you simply replay the deletion requests starting from the timestamp of the backup. That way backup restoration cannot result in accidental revival of deleted records. If there are concerns with dangling foreign keys or there is the potential for key re-use you could choose to overwrite the relevant data with blanks. Note that this is similar to pseudonymization, which if only applied to PII is probably not enough to ensure compliance if other data that you do retain can be used to reconstruct the data that was erased. Real deletion is by far the most secure way of dealing with the deletion requirement. This all assumes that your backups are properly encrypted and that the keys are not recoverable without some kind of intervention from the people responsible for operations.
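To make the replay idea a bit more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a recipe: the `DeletionEntry` log format, the `replay_deletions` helper and the in-memory dictionary standing in for a restored database are all made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable


@dataclass(frozen=True)
class DeletionEntry:
    """One line of the (hypothetical) deletion log: which key was erased, and when."""
    record_key: str
    deleted_at: datetime


def replay_deletions(restored: Dict[str, dict],
                     deletion_log: Iterable[DeletionEntry],
                     backup_taken_at: datetime) -> Dict[str, dict]:
    """Return the restored records minus everything deleted since the backup was taken."""
    erased = {entry.record_key
              for entry in deletion_log
              if entry.deleted_at >= backup_taken_at}
    return {key: row for key, row in restored.items() if key not in erased}


# A backup from May 1st still contains user-2, who asked to be deleted on May 10th.
backup_time = datetime(2018, 5, 1, tzinfo=timezone.utc)
restored = {"user-1": {"email": "a@example.com"},
            "user-2": {"email": "b@example.com"}}
log = [DeletionEntry("user-0", datetime(2018, 4, 20, tzinfo=timezone.utc)),
       DeletionEntry("user-2", datetime(2018, 5, 10, tzinfo=timezone.utc))]

live = replay_deletions(restored, log, backup_time)
print(sorted(live))  # only user-1 survives the restore
```

The log holds nothing but keys and timestamps, so it can be backed up continuously without itself turning into a second copy of the data you were asked to erase.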

  • ignore requests for deletion, correction or insight from your users

The burden of removal requests, correction or insight into the data you hold on your users (or ‘data subjects’) can be quite significant. This is where automation shines: self service and a one time investment in time and effort pays back many times over in the years that follow. Ignoring such requests will sooner or later cause an upset individual (could be me) to contact their local Data Protection Authority to file a complaint. The first time this happens the DPA might ignore the complaint, because I suspect they will be quite busy in the near future. But they probably will start a file. And if that file gets thicker because more people complain about your company then sooner or later someone will be tasked with setting up a conversation with you. This conversation will be along the lines of ‘we have received multiple complaints that you are purposefully ignoring legitimate data subject requests, please explain yourself’, after which you - brazen as always - explain that you feel the law is too burdensome to be complied with and to hell with all DPAs. The DPA will respond along the lines of ‘we are very sorry this is how you feel, however, you too will comply, or else’, where the ‘or else’ bit will be a warning about what kind of fine you can expect if you continue to ignore the law. Then, some time passes. A new user files a complaint. This gets added to your file. And while it is being added the clerk doing the filing notes that you have been warned already. This is where it gets interesting. This time the DPA will most likely issue a ‘binding warning’, a directive that you ignore at your peril.

A third time would be best avoided. You will be fined. You may choose to ignore the fine because your company is incorporated outside of the EU. But the EU regulators saw that one coming: you have to have something called a designated representative in the EU in order to do business there. Yes, you read it right. That’s probably the most invasive change the EU could have picked, you need to establish yourself through a legal proxy in the EU. If your business already has a presence there then that will be your designated representative. If not you’ll have to get one, there will be - and already are - companies that will offer this service for a fee to the companies that it applies to. So as soon as you get fined the designated representative will be notified of this. They most likely will have a contract with you and will also have a representation in the country where you reside, and part of the contract you have with them is that you indemnify the representative for any and all fines collected through their representation of your legal entity. So then you have the choice of fighting your own representative in court in your home country or coughing up the fine.

  • sweep breaches under the carpet and pretend they did not happen

People are usually shocked when I give them some stats on the number of reported breaches for the year to date. The numbers are simply scary and that’s before you come to the realization that the vast majority of data breaches are never reported so they are not counted in the statistics. There are two reasons for this, the first is that a large number of data breaches are never discovered, the second is that even those that are discovered are not always reported. That’s ‘Game Over’ territory as soon as the GDPR is enforceable, not reporting a breach is one of the worst things you could do. It’s both irresponsible towards the people whose data has been breached and it potentially makes matters worse because you are taking away an important tool from the regulators: their ability to measure how big the problem is so that they can spend their effort there where it has the most effect. Of course it is bad PR. But then, there was the option of spending some more resources on security and to reduce the possible haul of such a breach. And if you did all that then at least you have a good story to tell. Responsible disclosure in case of a breach will go a long way towards establishing good faith on your part. Sweeping the breach under the carpet could work but if you are found out you can fully expect the book (and the lectern) to be thrown at you.

  • pretend the data on your systems is yours rather than the end users’

The data you process on behalf of your users is their data. The GDPR is not ambiguous on this front at all, hence the consent/delete/correct/insight (and data portability) parts of the law. You at best are a steward of that data and if users consent to specific uses of the data then you are allowed to process it. But it isn’t yours and as soon as you start to think of it as yours it is only a matter of time before you will create some use case that is against the law. Don’t do that.

  • treat data security like it is optional

Companies have a rather strange attitude towards security: it is treated entirely as a cost with zero upside and if possible it will be ignored. The GDPR has finally given the people tasked with security at least one stick to wave around in case their arguments are ignored, the GDPR fines are potentially so large they tend to get the undivided attention from management. Management of most companies is risk averse to the point that they would rather get their security in order than to face a possibly very large fine (even if the chances of that fine materializing are slim to none). So even if that isn’t a very large stick it is better than nothing and it seems to have the intended effect, the companies that I look at seem to have woken up to security no longer being an optional item to take care of when they have retired.

  • sell end user data with abandon

Just like you can’t sell a car or a house that isn’t yours you can’t just go and sell the data that users entrust you with (or that you gather from their devices, such as location information and other good stuff). If you wish to sell your users’ data you are going to have to obtain consent. And that consent has to be freely given with the express purpose of the transaction you have in mind, in other words there is a specific purpose. If you feel like riding the line here you could try to argue that ‘selling your data to unspecified partners’ is a one time blanket consent that a user could give but if I were a user I would likely not give you that consent. But if - in the interest of keeping the lights on in your fine institution - you instead ask consent to sell my data to ‘Pathfinders Inc, a company that wishes to analyze my location data to determine where traffic jams occur’ then I might agree. So obtain consent if you intend to sell data and make that consent as explicit as you are comfortable with in order to optimize the number of people that ‘opt-in’ versus the amount of work you have to do to re-obtain consent. And always remember: consent once given can be withdrawn, so you will have to have a provision in place on how you and your data buyer will deal with the withdrawal of that consent.

The easiest way to deal with all this is of course simply not to sell data in the first place.

  • fail to obtain consent

This is one of the areas where the GDPR is extremely pointy and clear: if you wish to process a user’s data for a new purpose you will need to obtain consent. It is almost comical how many companies seem to suddenly realize that they’ve been spamming large numbers of people without their consent and have now - sheepishly and rather belatedly - figured out that maybe they should ask for that consent before the clock runs out because even such an email would likely be seen as a violation of the law! (And rightly so imo.) So resist temptation and do not use data that you already have for entirely new purposes.

Things you will have to do

  • enable data life-cycle management

Data has a life-cycle, just like everything else. Endless appending is not an option, you need to plan on how you will acquire the data, process it, store it, allow the owners of the data to modify and correct it and eventually on how you will delete it. To be fair to the creators of the GDPR: you should have already had this and if the law had not existed at all you still should have already had this. It’s a common sense thing and it shouldn’t take a law to make this happen. Think about it: at some point your data subjects will die. That would indicate that there is at least one valid reason why there should be an end to storing data on a particular individual no matter what.

  • figure out what data you have that is in scope for the GDPR

    This sounds easy, but for large companies with lots of data and bad processes it can be a lot of work. If you have not started on this in good time it is doubtful you will be ready before enforcement starts. Even so, you should still do this and for each data set that you have you will have to do the following:

    • find out what is in the data
    • determine whether or not it is ‘in scope’ for the GDPR
    • if it is in scope, determine whether you want to keep the data
    • securely erase it if you do not want to keep the data (and update any software that relies on it)
    • alternatively, anonymize it in such a way that the data can no longer be used to identify individuals (this is a lot harder than it may seem at first glance)
    • document that you have it and how it gets processed if you want to keep it

  • ensure your systems are secure

Do your level best to ensure that the computer systems that you use, especially those directly facing the web, are secure. Read up on security, find the person in your organization that is most experienced in this field and have them analyze your systems from a security point of view. Implement their advice and make security a part of your organization from the top all the way down to the bottom.

  • disclose all uses of the data you collect in your privacy policy

Try to make your privacy policy short, to the point and complete. If you plan on doing something with the data obtain consent and disclose all uses in the privacy policy so that someone can review it. If you make changes to your privacy policy publish a changelog.

  • enter into DPAs (data processing agreements) with all those that you farm out data processing to

Each and every firm that you farm out data processing to should be under a special contract with your organization called a DPA (confusingly, here this means Data Processing Agreement rather than Data Protection Authority), no exceptions. If a party does not want to sign on the dotted line ditch them. If you don’t feel comfortable with giving them access to your crown jewels ditch them. If you feel that you could do without that particular service ditch them. By far the easiest way to deal with third party risks regarding the data that you administer is to ensure that it never leaves your premises. And if it has to leave, make sure that the DPA is executed properly. And make sure that your counterparty is able to deal with the withdrawal of consent after the data has already been passed on and that they in turn do not under any circumstance pass the data on to others without your explicit consent (in writing!).

  • disclose those companies that you have DPAs with

Tell your end users about which companies process data on your behalf. A good place to disclose this is in your privacy policy.

  • obtain consent from your users for the use of their data

Before you can use data supplied by individuals you have to obtain their consent. This is not just good manners, it is a hard requirement. This goes for the initial use of the data after it has been collected and all subsequent uses of the data.

So if you have a brilliant new idea that will result in a new use of the data that you already hold you will have to update your privacy policy and you will have to re-obtain consent for the data that you already have.

  • plan for the withdrawal of consent

Consent can not only be given, it can also be withdrawn. And that withdrawal also affects your relationship with sub-processors, parties that process the data on your behalf. Withdrawing consent should be no harder than giving it.
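A rough sketch of what ‘planning for withdrawal’ could look like in code, in Python. Everything here is an assumption for the sake of illustration: the `ConsentLedger` class, the purpose names and the idea of keeping a per-purpose list of sub-processors are mine, not something the law prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ConsentLedger:
    """Tracks consent per (user, purpose) and who must hear about a withdrawal."""
    # purpose -> names of the sub-processors that receive data for that purpose
    sub_processors: Dict[str, List[str]] = field(default_factory=dict)
    # (user_id, purpose) pairs for which consent is currently valid
    _active: Set[Tuple[str, str]] = field(default_factory=set)

    def grant(self, user_id: str, purpose: str) -> None:
        self._active.add((user_id, purpose))

    def allows(self, user_id: str, purpose: str) -> bool:
        return (user_id, purpose) in self._active

    def withdraw(self, user_id: str, purpose: str) -> List[str]:
        """Withdraw consent; returns the sub-processors that need to be notified."""
        self._active.discard((user_id, purpose))
        return list(self.sub_processors.get(purpose, []))


ledger = ConsentLedger({"traffic-analysis": ["Pathfinders Inc"]})
ledger.grant("user-42", "traffic-analysis")
to_notify = ledger.withdraw("user-42", "traffic-analysis")
print(ledger.allows("user-42", "traffic-analysis"), to_notify)
```

Note that `withdraw` is exactly as easy to call as `grant`, and that it hands back the list of parties you still have to chase. Consent is tracked per purpose, so a new use of the data never silently inherits an old consent.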

  • report breaches immediately if they cross the reporting threshold

If there is a breach of your system that is above the reporting threshold (which is quite low!) then report it immediately. You have 72 hours. If you establish a breach protocol it is much safer to do this on day one rather than to wait until the last moment, because any hiccup in the reporting process will slow you down and may cause you to miss the deadline. If you deal with very sensitive data always report the breach, even if it is a small one: the damage to your data subjects could be substantial, which increases the risks of them going to their local data protection authority, and if the authority does not have a report from you about the breach you will have a fairly large problem.
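The arithmetic is trivial but worth having in your breach protocol so that nobody has to do it in their head at 3 am. A small Python sketch, assuming the clock starts at the moment you became aware of the breach (my reading, check it with your lawyer); the function names are made up.

```python
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=72)


def reporting_deadline(aware_at: datetime) -> datetime:
    """Latest moment to notify, counted from when the breach was noticed."""
    return aware_at + REPORTING_WINDOW


def hours_left(aware_at: datetime, now: datetime) -> float:
    """Hours remaining until the deadline; negative once it has passed."""
    return (reporting_deadline(aware_at) - now).total_seconds() / 3600


aware_at = datetime(2018, 5, 25, 9, 0, tzinfo=timezone.utc)
now = datetime(2018, 5, 26, 9, 0, tzinfo=timezone.utc)
print(reporting_deadline(aware_at).isoformat())
print(hours_left(aware_at, now))
```

Using timezone-aware timestamps throughout avoids an off-by-a-few-hours surprise when the people discovering the breach and the people reporting it sit in different timezones.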

Things you probably should do

  • store data in off-line systems if you don’t need it on the live systems

If you don’t need data right now and on the live system (for instance, data that is historically important but not part of the current dataset) then you should move that data to off-line systems. That way if there is a breach of your system the damage is limited.

  • act in good faith, try to respect the spirit of the law

I’ve seen lots of examples over the last year of people that were trying to be clever, to try to find a loophole in the GDPR that would allow them to continue with ‘business as usual’. Some of these examples were of the very worst and contrived variety, some were more reasonable but still clearly in violation of the spirit of the law. If you really insist on going to court over this by all means find that line but if your budget is limited and you’d rather not be fined or have an expensive court case to deal with then please act in good faith. That will go a very long way towards convincing regulators that you did not intend to do something bad and were not looking for ways to get away with it.

  • keep your users’ (‘data subjects’, ‘individuals’) interests first

If you put your own interests over the interest of the data subjects and you make use of the data as if it is yours then you are doing it wrong. Consider the data that you hold borrowed and be a good steward of that data. Keeping your users’ interests first has other effects as well, for instance it makes for happy users, which is eventually good for business.

  • delete data that you no longer use

If you have data that is no longer useful to you don’t keep it around ‘just in case’. Get rid of it. This vastly limits the amount of damage a breach can do and reduces the chances of data that you should no longer have being exposed in a breach. After all a breach is already a bad event, no need to make it worse by leaking data that you had no use for in the first place. Newspapers love large numbers, if you lose enough data you just might find yourself to be front page news in the worst possible way. Treat data as though it is nuclear waste: get rid of it as soon as you can.
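One way to stop ‘just in case’ from creeping back in is to make the clean-up a scheduled job rather than a good intention. A minimal Python sketch, assuming each record carries a `last_used` timestamp and that you have picked a retention period yourself; both are assumptions of this example, the law does not hand you a number.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def split_by_retention(records: List[Dict],
                       retention: timedelta,
                       now: datetime) -> Tuple[List[Dict], List[Dict]]:
    """Split records into (keep, purge) based on when they were last used."""
    cutoff = now - retention
    keep = [r for r in records if r["last_used"] >= cutoff]
    purge = [r for r in records if r["last_used"] < cutoff]
    return keep, purge


now = datetime(2018, 5, 25, tzinfo=timezone.utc)
records = [
    {"id": 1, "last_used": datetime(2018, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "last_used": datetime(2015, 1, 1, tzinfo=timezone.utc)},
]
keep, purge = split_by_retention(records, timedelta(days=365), now)
print([r["id"] for r in keep], [r["id"] for r in purge])
```

The purge list is what you then really delete, and if you log the keys you deleted it doubles as the input for replaying deletions after a backup restore.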

  • use the GDPR as an opportunity

If there are 10 parties offering the same service and only one of those 10 has decided to become compliant with the GDPR, and in fact treats all their customers (and their data) with the same level of care and respect then in the short term the other nine may have an advantage. They don’t have the costs associated with compliance and they can concentrate on expending those resources on features and marketing. But in the longer term I suspect the party that took the high road will win the race. Because that party will not be subject to fines, that party will be able to sell their product to hundreds of millions more individuals and that party will be able to use their GDPR compliance as an easy to recognize badge of approval. The chances of their data leaking (which causes substantial brand damage) are lower and the user trust in the brand will be correspondingly higher.

  • do annual pentests / audits

It’s good practice to ask outsiders to try to ‘rattle the doors’ on your systems to ensure that they are secure. This is a relatively costly affair and probably out of reach for most really small companies. But if you have a large enough business this will definitely help you to sleep well at night. After all, if your doors are locked better than your neighbors’ the chances are high you will be passed over. So don’t see this as an absolute affair (there is no such thing as perfect security), treat it as a relative one: you don’t have to be able to outrun the tiger, but you should try to outrun the guy next to you.

  • if your company is large enough and the kind and quantities of data you store warrant it get ISO27001 certified

ISO27001 certification is not the same thing as GDPR certification. But absent GDPR certification (for now) it is a pretty good proxy because even if ISO27001 does not address privacy directly it does require you to have substantial processes in place around the theme of security and this in turn will vastly reduce the chances of exposure to a security breach and the associated reporting duty. On top of that those companies that I’ve looked at that were ISO27001 certified as a rule also had their house in order when it came to compliance with privacy legislation. It forces you to think about data as a liability, rather than an asset and that particular mindset is a good one to have when you are dealing with end user data.

  • read the law at least once, or if you have employees, ask one of them to read the law at least once

Reading the law is actual work. If you’re not a lawyer it will probably cost you at least a day and maybe more. This is useful because the law contains a lot of things that are informative and that will help you to form a coherent picture about what the law really embodies and how it applies to you and your business, as well as to you in your capacity of private individual whose data is processed and stored by other companies. If you can’t afford the time (though I personally believe every business owner should be able to find the time to read a reasonably compact document that has the potential to impact their business) then delegate it or at least read the excellent Wikipedia page on the GDPR until you are familiar with the terms, the intention and the general scope of the GDPR. That will help you in your discussions with others and will help you in determining the impact for your business.

  • cut down on the number of parties that you ship your users’ data to

The more parties that you ship your users’ data to the bigger the chance of a mishap. What goes for not having data that you don’t need goes doubly so for not shipping data to parties that are not essential in the kind of processing that you do. An example of such an instance would be a company that processes medical data for patients enrolled in hospital treatments. You could embed an analytics tag of some third party analytics provider on the pages where you collect the data. But your need to collect statistics in no way outweighs the patients’ need for data confidentiality and so you’d be better off removing that analytics tag from those pages. This may be a minor inconvenience but it is much preferred to leaking this data.

That way you have a minimal version of a GDPR impact study in your head and you will be much more capable of figuring out if, where and how your business should adapt to the new legislative environment. In general this is good practice when a new major body of legislation that might impact you is launched. If your business is serious enough then you might want to engage the services of a professional with the required background (a lawyer specialized in privacy, for instance). Even then it is good to know the basics and you should probably still read the text of the law.

Things you probably should not do

  • pretend you don’t know about the law and hope it goes away

Ignorance of the law is never a good defense. Besides making you look dumb and careless it would be the easiest way out of every law that you do not wish to comply with. Just pretend you don’t know about it and the problem goes away. Toddlers tend to do this when they close their eyes: “I can’t see you, so you can’t see me”. It doesn’t work for toddlers and it won’t work for you either.

  • assume the law doesn’t apply to you without properly researching if it does

There are exceptions to rules and you can read up on them to figure out if the law applies to you (if your situation is ‘in scope’) or if it does not. By researching this properly you may find out that the law does apply to you even though you thought it did not, or the other way around. Either way, knowledge is power, in the one case you will be forewarned, in the other you will know you are in the clear.

  • try to armchair lawyer your way out of having to comply with the law

I’ve seen lots of this: people who have not bothered to read the law or who are looking for that one line that they can use to justify their stance. This is silly and most likely will lead to some serious disappointment down the line. If you read the law and you gain some insight into it you will see that the law is written in a way that creates a lot of leeway for interpretation. This is a good thing, even if you would like to have things spelled out. The reason why it is good is because it allows the regulators to throw the book at those that would like to go for the narrowest interpretation possible whilst at the same time ignoring the spirit of the law.

This strategy works in some countries (notably: the USA), but it really does not work in Europe. This is probably the biggest clash between those two worlds to date and the frontier is one that matters a lot so take this from a European that has had a reasonably large exposure to both US and EU law: the systems are very different in this respect and if you comply with the spirit of the law but break the letter then in almost all cases you will be fine. If you try very hard to comply with the letter while at the same time ignoring the spirit of the law you will hit the wall, and probably quite hard.

  • break the law knowingly

This is an obvious one, but I’m putting it here just to make sure: Do not break the law knowingly. If you do so you are setting yourself up for future failure. Either comply with the law or shut down your service for EU access (assuming you can reliably do so, which as I’ve already noted is hard).

If you have very deep pockets and are willing to go to court, and then eventually to the EU high court to challenge the law or aspects of it then more power to you, you will have to break the law knowingly to invite a challenge from the regulators. But for most ordinary companies this is not a viable path, this is best left to the Googles and the Facebooks of this world.

Things you could do to make your life easier

  • apply GDPR principles globally

Note that all these make good sense even absent the law, it’s just that now that the law is there they can help you to reduce your exposure and will go a long way towards demonstrating that you are working to keep your users’ interests close to heart.

Frequently Asked Questions regarding the GDPR

Once more: what follows is not legal advice even though it suspiciously looks like it. If you are going to implement any of this think of my reading of the law as an 80% or so solution and you will likely need to get to 100% if you wish to minimize your risks and like all advice it is worth what you paid for it, in this case absolutely nothing. Even so I have done my best to not spout nonsense. Buyer beware!

  • Do I need a (dedicated) Data Protection Officer?

That depends on the size of your organization, the kind of data and the amount of data that you have. If processing other people’s data is your bread and butter and the quantity and kind of data indicate that there is substantial risk in case of a breach and your organization is large enough then most likely the answer is ‘yes, you do’. So medical, financial and advertising companies of any size will almost certainly require having a dedicated DPO. If your company processes very little data (say thousands of records) and the data is not super critical (say you are selling clothing online) then you will need a designated DPO, in other words you need to assign the role of Data Protection Officer to an individual that most likely also does other stuff. It’s not ideal and in that situation you will have to take extra care to ensure that the DPO has sufficient independence in their role. If you do not process end user data at all then you will not need a DPO.

  • Do I need an EU designated representative?

    If you are doing business in the EU but you are located outside of the EU then yes, under the GDPR you will need a designated representative. This is definitely a burden and I sincerely hope that the free market will bring the cost of this down to a level that even the smallest businesses will be able to solve this in a cost effective way.

    The tasks of such a representative are:

    • The representative is an authorized agent to receive legal documents
    • can be subject to enforcement proceedings in the event of a company’s non-compliance with the regulation
    • to be a direct contact with the authorities
    • to be a direct contact for data subjects concerning their data processing

    If you feel this is unfair then keep in mind that it is only to level the playing field between foreign companies and EU players. The unfair part is that for EU companies there is no additional burden to have a registered representative, they represent themselves. The alternative is reciprocity, a bit like what happens with traffic violations within the EU. The member states have agreed to enforce each other’s speeding tickets and parking tickets and will collect for their foreign counterparts. But as long as only the EU has legislation at this level reciprocity is not an option. If in time there is similar legislation elsewhere then a reciprocity agreement could be entered into and then the need for a local representative will go away.

  • What is my exposure if I ignore the GDPR and just keep going as I used to?

    That is a very dangerous question. What your exposure is depends very much on what you do. If you are super conscientious in how you deal with user data, already have all or most of the procedures in place but do not otherwise change your ways because of the GDPR you are obviously going to entertain some exposure under the law. The regulators will probably take a dim view of this if they are required to interact with you. If you are a really small entity (say a private bulletin board with a few 100 regular visitors) then you might even get away with it. But I would advise against this strategy. All it takes is a couple of angry users that report you to their respective DPA if you do not comply with legitimate requests and you will have a hard time explaining why you chose to ignore the law. You might spin a good yarn and you could try to bluff your way through this but if there is one thing that I’ve learned over the years by looking at lots of companies there is almost no strategy so dumb as to underestimate the competition.

    The burden of compliance is definitely non-zero for most somewhat serious businesses, but then again, the upside is access to a market of 500 odd million people. That’s got to be worth something on your end, and what better way to signal that you are a trustworthy business partner than to treat the data of your customers with respect.

  • Can I store IP addresses?

Yes you can, but in quite a few instances they are considered PII and even if the norm is for any and all software to log this information if you don’t really need it you are better off without. But if you do need them ensure that you remove them or zero out the last octet once that need has passed, alternatively, rotate your logs fast enough that you can drop the logs before too much time has expired.
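Zeroing out the last octet is a few lines with Python’s standard `ipaddress` module. This is a sketch, not a guarantee of anonymity: the `truncate_ip` name is made up, and the choice to keep only the first 48 bits of an IPv6 address is my own assumption, since ‘last octet’ only really makes sense for IPv4.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Blank the host part of an address: last octet for IPv4, all but /48 for IPv6."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48  # the IPv6 prefix length is an assumption
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.77"))         # 203.0.113.0
print(truncate_ip("2001:db8:abcd:12::1"))  # 2001:db8:abcd::
```

Run this over your logs once the need for the full address has passed, or apply it at write time if you never needed the full address to begin with.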

  • Can I send marketing email?

Yes you can, provided you have obtained proper consent to do so. And no, you can’t mail your users to obtain consent after May 25th, you actually shouldn’t be doing that anyway. Any addresses that you have collected without consent should be considered lost. In general such messages, collectively classified as ‘SPAM’ are off limits when sent to private individuals.

  • Can I have my users check a box that says they opt-out of their rights under the GDPR?

No, you can’t, full stop.

  • Should I hire a privacy law specialist?

If you can afford it, by all means! But make sure you get some references first before you jump into a contractual relationship with anybody.

  • Should I obtain ISO27001 certification?

ISO27001 is not directly related to the GDPR. However, it does take care of a lot of the question marks raised by the authorities in case of a breach: Have you done enough to ensure that it would not happen (even if it did, with hindsight). ISO27001 certification is essentially an outside party verifying that you did indeed do your homework. After that of course it is still possible that you will have to deal with a breach, it could be an inside job or an oversight. But at least you did your best to avoid it.

Another advantage of having ISO27001 certification is that it for the moment seems to have some proxy value, businesses will be more likely to do business with you in the new legal climate if you have obtained this certificate because they will assume you have your house in order.

The downside is that such an audit isn’t cheap (so it is only something that a certain sized business can afford) and that it is a repeating cost (annual audits are the norm).

  • How do I reliably determine if someone is an EU resident?

The short version is that you can’t. The slightly longer version is that you could ask but someone could lie, you could use a geotargeting library but it could be faulty, someone could use a proxy or a satellite connection and so on. This is more than just a little bit annoying but you can not with 100% certainty determine where someone is located.

  • Should I have two tracks in my website, one for EU residents and one for the rest of the world?

You could do this, but it would be a lot of work. And in the end, wouldn’t you want to extend the same courtesy to people from other places than the EU? It would seem like a complete waste of resources to have to maintain two separate tracks for similar data and similar users all so that you can withhold certain fairly normal features from people from outside of the EU. Personally I’d use it to my advantage and advertise loud and clear that people from other places get the same level of protection and the same level of control over their data. That would possibly also give you a leg up on the competition if they do not do this. And conversely, it could work against you if your competitor does decide to go that route.

  • What about my ‘hobby business’?

The GDPR does not make a distinction between hobbies and businesses and it seems right to me that it does not. Whether it is a hobby to you - and hence not commercial in nature - makes no difference whatsoever. As soon as you start the collection of data on private individuals from the EU the regulation kicks in and you will have to find a way to be compliant.

  • What about my open source project?

Open source software is not a service, it is software. As long as you just write software and distribute it without recording anything about those that receive the software your operation is not ‘in scope’ for the GDPR. As soon as you run an online service using software you wrote yourself - even if you open source it - your project is in scope.

  • How do I deal with - insert some wild edge case here?

There are many such wild edge cases, too many to list here but let’s take one that I’ve seen multiple times:

If you are worried at this point in time about how you should deal with the situation where an EU resident is visiting the United States and you are no longer able to accurately determine their location then you are probably worrying about the wrong thing. EU legislators and data protection authorities are not going to jump out of a bush near your house to fine you in such cases. But you should still treat removal, access and update requests from these users (and probably all others) as though they are legitimate, after all why would you want to ignore them?

There is a good chance there will be a part three to this series, I’m still mulling that over, these posts are a lot of work.

GDPR Hysteria

In another week the GDPR, or the General Data Protection Regulation will become enforceable and it appears that unlike any other law to date this particular one has the interesting side effect of causing mass hysteria in the otherwise rational tech sector.

This post is an attempt to calm the nerves of those that feel that the(ir) world is about to come to an end. The important first principle when it comes to dealing with any law, including this one, is Don’t Panic. I’m aiming this post squarely at the owners of SMEs that are active on the world wide web and that feel overwhelmed by this development. A bit of background about myself: I’ve been involved in the M&A scene for about a decade and do technical due diligence for a living (together with a team of 8). This practice, together with my feeling that the battle for privacy on the web is one worth winning (which has led me to study online privacy in some detail), puts me in an excellent position to see the impact of this legislation first hand as well as how companies tend to deal with it.

First some context: Every company and every project or hobby ever has to be compliant with the law. Whether or not that is possible usually depends on what you are doing, your local legislative climate and, obviously, the law. So whether or not you are doing something for profit, as a hobby or making a few bucks on the side all the way up to a company doing billions in turnover with 10’s of thousands of employees does not matter. Compliance with the law is the norm. If you are doing business abroad then this means that you may have to be compliant with the laws of another country, and the web being as connected as it is this means there is a fairly high chance that your little domain will be impacted by the laws from multiple jurisdictions. For people from relatively insignificant (in terms of power in the rest of the world) countries this is not exactly news, they are already impacted by the laws from very powerful countries and so they are probably well adapted to this. For the inhabitants of large countries that so far have been able to ignore the laws of other places this is a new situation which may require some new level of understanding.

The easiest way to gain some of this understanding is to realize that you already have to be compliant with a lot of laws in order to be able to operate anything at all, even a lemonade stand comes with the following legal implications:

  • food safety laws

  • commercial operation laws

  • municipal laws

  • administrative laws

  • employment law

  • and possibly even others

So, nothing is really simple but one more law added to the pile is also not going to be the end of the world. Because this article is not aimed at large enterprises and because I am not a lawyer (yes, that’s one of those annoying disclaimers) this article is not written in legalese, but there will be some terms from the GDPR that I will not be able to get around. These terms will be defined when they are first used, a search with your favorite, GDPR compliant search engine will usually give you more context than I can put in this article.

The first thing you have to realize in coming to terms with the GDPR is that ‘one law fits all’. The GDPR was written as a law to repair the lack of adherence to its predecessor, the DPD, the European Data Protection Directive, which has had the unfortunate shortcoming of being a directive rather than a regulation. The effect of this - and the lack of teeth - was that it was mostly ignored by businesses. This is a recurring theme in our collective history: first there will be room to self regulate, if that does not work there will be a directive and if all that fails then finally there will be a law with penalties in case of non compliance. As the sign on the maps on billboards all over the world says ‘You are here!’. Now - in exactly 7 days - we will have a law come into effect that has some serious teeth and that you will - for a change - not be able to ignore.

So what form does the panic take? I’ve seen a lot of different kinds of it but most of it revolves around a fairly limited number of themes that I will try to address one by one from the perspective of a small business owner in order to reduce the emotional levels to something more manageable. Getting these fallacies out of the way before going into more detail about the kind of impact the GDPR does have is productive because it will allow us to concentrate in more detail on what actually matters.

  • The GDPR is going to expose me to fines of up to 20 million Euros for even the slightest transgression

No, the GDPR has the potential to escalate to those levels but in the spirit of the good natured enforcers at the various data protection agencies in Europe they will first warn you with a notice that you are not in compliance with the law, give you some period of time to become compliant and will - if you ignore them - fine you. That fine will be proportional to the transgression. You can of course ignore the fine and then ‘all bets are off’ but if you pay the fine and become compliant you can consider the matter closed. The typical EU pattern in case of repeated transgressions on the same subject is increasing fines. This can get expensive quickly and most businesses tend to adjust their processes promptly once they have been fined the first time. The reason why I am sure this is the way it will go down is this is exactly how it has been done so far, every interaction with data protection authorities has followed the exact same pattern: warn, fine, increased fines. There are no known cases - though I’m willing to be surprised on this one, but none that I can find - where an entity was presented with a huge fine without first being given a chance to comply with the law.

Note that the 20 million Euros or 4% of global turnover is the maximum fine, the specific language is ‘a fine up to €20 million or up to 4% of the annual worldwide turnover of the preceding financial year in case of an enterprise, whichever is greater’, so that’s the maximum of the fine that’s being set by the 20 million or the 4%, and this bit is there to ensure that even the likes of Facebook and Google will not simply ignore the law and pay the fine to be able to continue as they have so far. This in no way should be read as you, the small business operator will face a fine of 20 million for each and every infraction that could be found.
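
To make the ‘whichever is greater’ part concrete, here is a small sketch in Python that computes the ceiling from the language quoted above. The function name and the sample turnover figures are made up for illustration; this is the upper bound only and says nothing about what a regulator would actually impose.

```python
def max_fine_eur(annual_worldwide_turnover_eur: float) -> float:
    """Upper bound of the fine: the greater of 20 million or 4% of turnover."""
    four_percent = annual_worldwide_turnover_eur * 4 / 100
    return max(20_000_000, four_percent)


if __name__ == "__main__":
    # Hypothetical turnovers: a small business and a very large enterprise.
    for turnover in (1_000_000, 1_000_000_000):
        print(turnover, max_fine_eur(turnover))
```

For any business with less than 500 million in turnover the 20 million figure sets the ceiling; the 4% only starts to matter above that.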

  • The GDPR will enable anybody to be able to sue me, even from abroad

The GDPR does not have this effect, but you may be interested to know that anybody can sue you or your business for whatever reason strikes their fancy. This is a direct consequence of doing business and has nothing to do with a particular law. What the GDPR allows private individuals to do is to contact their regulators and to complain if you decide to ignore their requests. So if John Doe wants to have his data removed from your service and you tell him to get stuffed then John has the right to alert his regulator to the fact that you are probably not in compliance. If the data protection entity of John’s country feels the case has merit they will send you the letter mentioned above. If not you might never hear from them. The data protection authorities will function as a clearing house. If you feel this is selective enforcement then you should be happy about it for a change: by providing this clearing house function the burden of regulation will be substantially lower than it would be without and it will ensure that the public will not be able to use the GDPR to harass businesses, and they will allow the insertion of a bar to be met before action is taken.

  • Fines will land without warning and will be draconian

No, fines will be proportional and will only be levied after a chance to become compliant has been given. This has been the case in all other EU law regarding privacy to date, this one will not be any different. The EU regulators see their job as ensuring compliance, not as creating a source of income.

  • The GDPR will require me to deal with complaints/paperwork in 28 different languages

The text of the GDPR is available in English, a typical regulator will send you a notice in a language that you can understand. This goes for everything in the EU that has to do with the law, from traffic fines to copyright law and everything else. If the EU is good at dealing with something it is dealing with other languages. So the paperwork - if any - that you will receive will be in a language that you can read and if you can’t there will be an English translation available. Case in point: I got a parking ticket in Paris last year because my car was on the wrong side of the road on a particular day. I’d parked there on Monday; apparently on Tuesday you have to park your car on the other side of the road, and me being a stupid tourist I thought I was safe because everybody else parked there too. I received my ticket in the mail a few days later, with a French text, an English text, and - most surprising - a perfectly worded Dutch text complete with instructions on how to have myself represented in court if I wished to contest the fine and instructions for paying the fine if I did not want to contest it.

  • The GDPR will require me to hire people and my entity is too small to be able to afford this

No, the GDPR will require you to assign certain roles to ensure that someone is in charge of privacy related stuff.

  • Faceless bureaucrats will use the selective enforcement of the GDPR to stuff the coffers of the EU at the expense of foreign companies

The EU tends to use fines as a means of forcing a company into compliance. Companies that are large and that have large European holdings or that use the EU to avoid paying taxes rightly worry about this particular aspect, especially if they have constructed their business around massive databases of profiles on EU citizens. If this isn’t you then you can probably ignore this aspect of EU legislation. If you’re Mark Zuckerberg however I would definitely advise not to ignore this, however the chances of Mark reading this blog post are nil.

  • The EU is over-reaching here, as a foreigner I should be free to just comply with my local laws and ignore the rest

As soon as you do business abroad you will have to comply with the laws of those countries. That’s maybe not what you were hoping for but this has always been the case. For physical products there are all kinds of entities that ensure compliance with the laws of other countries including rules for manufacturing, transportation, storage, ingredients - all the way back to the source - and so on depending on the context and nature of your business. For online businesses this has never been any different for instance you have to comply with copyright law, laws on online gambling, the DMCA and lots of other laws that are essentially local in nature (though copyright laws were harmonized long ago to make this easier).

  • Processing all these end-user requests will be a huge burden

Then automate it. If you could automate the collection of the data in the first place then you definitely can automate the rest of the life cycle. There is no technical hurdle companies won’t jump through if it gets them juicy bits of data but as soon as the data needs to be removed we’re suddenly back in the stone age and some artisan with a chisel and hammer will have to jump into action to delete the records and this will take decades for even a small website. Such arguments are not made in good faith and in general make the person making them look pretty silly; after all, nobody ever complained about collecting data. In fact there are whole armies of programmers working hard to scrape data from public websites, which is a lot more work than properly dealing with the life cycle of that data after it has been collected. So yes, it is a burden, no, the burden isn’t huge unless you expressly make it so but that’s your problem.

  • This law was sprung on us, there is absolutely no way I’m going to be prepared a week from now

The law has been in effect for over two years at this point, and the DPD, the European Data Protection Directive has been in effect for over two decades. So no, this law was not sprung on anybody, though it is very well possible that you only became aware of it a few weeks or months (or days?) ago. If that’s the case do not panic, you too will most likely be fine.

  • It is impossible to be compliant with this law

Well, this website is fully compliant with the law, so at least in this particular case it seems to work. Why? Because I don’t store any information about you. That’s a conscious choice on my part which I made long before the GDPR was even talked about in public. But if your situation is more complex then you too can be compliant, or at least - and this is key - you could try to be compliant. For instance, one oft heard argument is that no webserver (or even any internet service) is going to be able to be compliant because all web servers log IP addresses, and IP addresses are PII. But that argument does not hold water. There are several reasons for that, the major ones being: webservers only log IP addresses if you configure them to do so. Almost all webservers have a formatting option that determines what exactly is logged and you could configure your webserver to not log the whole address but just the network portion. You also have the option to log the address and to disclose that you do so in your privacy policy, but then you will have to allow for the removal of that data on request, which you may find burdensome (or not, that depends on the volume of such requests). Finally, you may have a legitimate reason to log the IP address, provided you delete it after you are done with whatever use you collected it for in the first place. There is enough room in the GDPR to hold on to the address for 30 days with a possible extension of another 60 days after which an automated reply to the user can tell them their IP address was purged and you’d be in compliance. That’s one of the reasons why I think the GDPR is a surprisingly good law. Most of the time when legislation is written that impacts technology the end result is absolutely unworkable, in this case most scenarios seem to work well for all parties involved.
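
Deleting the address ‘after you are done with it’ can be as simple as a periodic purge. Below is a minimal sketch in Python, assuming log entries are available as (timestamp, line) pairs. The name purge_expired, the data layout and the 30 day default are all assumptions for the sake of the example, not something the GDPR or any particular webserver dictates.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window for this sketch; pick whatever your purpose justifies.
RETENTION = timedelta(days=30)


def purge_expired(entries, now, retention=RETENTION):
    """Keep only the (timestamp, line) entries younger than the retention window."""
    cutoff = now - retention
    return [(ts, line) for ts, line in entries if ts >= cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    log = [
        (now - timedelta(days=45), "203.0.113.77 GET /index.html"),
        (now - timedelta(days=5), "198.51.100.12 GET /about.html"),
    ]
    for ts, line in purge_expired(log, now):
        print(ts.isoformat(), line)
```

In practice most people would let their log rotation tool do this, the point is only that the rule is easy to express and easy to automate.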

  • Becoming compliant with this law will cause my business to go under

I’m terribly sorry to hear that. But consider this: this law was written with the express purpose to rein in some of the worst violations of the privacy of EU citizens during their online activities. If becoming compliant with the law will cause your business to go under that is more or less the same as saying that your business is built on gross privacy violations. So if that’s your business model then good riddance to you and your company. However if that is not your business model then most likely you will be just fine.

  • It’s not fair, I have no representation in the EU because I’m not from there, why should my company comply?

Because you wish to do business in the EU. For what it’s worth, there are plenty of laws that project across the borders of countries and harmonization of laws between countries means that people are not always aware of the fact that this is happening. The DMCA is a nice example. Besides that, privacy is a fairly hot topic and there is hope in privacy advocacy circles that the EU is lighting the way here and that other countries will likely follow its example.

The fact that you or your company do not have representation in the EU does not mean you get to ignore the law, if you could then that would mean an automatic disadvantage for others that do play by the rules. You ignore the law at your peril.

  • I don’t want to end up being arrested for GDPR violations when I go on a holiday in Europe (yes, I really saw that one)

This is so far fetched it is comical. The EU does not operate that way, and besides, why would you wilfully break the law and continue to do so after you have been made aware of this? I’ve yet to hear about a single individual that was lifted from their bed in a French bed and breakfast during their well deserved holiday, but maybe you’ll be the first. If it happens let me know and I’ll come visit you in jail, I might even throw some bucks towards your defense fund. (Apologies for the flippant tone in this section but it really irks me, the only case like this that I’m aware of was the USA arresting one David Carruthers of betonsports.com.)

  • My business can not be compliant with this draconian and burdensome law

In that case please shut down or do not serve EU customers. But be aware that (1) you are leaving a nice opening for a competitor and (2) you are probably doing something you should not be doing in the first place, in which case I would say the law is working as intended.

  • The law is so complicated, there is no way I could ever make sense of it

As laws go I was actually surprised by how easy it is to read. It’s not particularly large, it uses mostly clear language and it usually (but critically, not always and this is a justified complaint) defines its terms. The biggest area where the lack of definition is annoying (but understandable) is when it comes to determining at what size company you need to take certain measures. I understand the complainers’ and the lawmakers’ positions and this probably could have been handled in a more robust manner. But there are good reasons for doing it this way, as I hope to illustrate later.

  • I can’t afford the risks associated with this law so I am shutting down/I will lock Europeans out

Ok. Bye. But make sure you really understand those risks and please understand as well that it may not be possible for you to lock Europeans out reliably enough to not have any exposure under the law and realize that there are lots of other laws that you are also exposed to that could cause you to be wiped out. This law is really no different than any others in that respect. The price of using the web as a world stage is that you effectively are interacting with the legal domains of every country that you do business with.

  • I should be able to engage in a contract with my users that lets them opt out from this law so I can ignore it

For once the lawmakers saw what was coming and they actually repaired this before it became an issue. I suspect that the ‘cookie law’ debacle made them realise that companies have absolutely no scruples when it comes to things like this and will happily blackmail their users into consenting to something that they’d rather not consent to just to be able to participate in what is more and more unavoidable: online interaction.

  • For large companies the burden is manageable, for small companies it is too high

From what I’ve seen in my practice over the last couple of years the burden is roughly proportional to three things:

    - the amount of data you hold

    - the number of employees in your company

    - the kind of data you hold

In effect the burden of a large company holding vast amounts of sensitive data will likely be very large. The burden on a small company holding small amounts of non-sensitive data will be very low or even none.

  • Nobody knows what the GDPR really means

The text is readily available, it is true that there are no meaningful certification programmes as yet but in time these will be available. In some ways this is a pity because it would be nice to be able to say ‘We’re compliant because we have a stamp of approval from such and such a certification authority’ but at the same time the lack of certification requirements actually goes a long way towards reducing the burden on small companies.

Anyway, you get the gist by now. Each of these misconceptions is like dry tinder in the hands of those that wish to have a good old GDPR bonfire inciting others to panic as well and in general does not really contribute to the discussion. As a rule the statements are either made by well meaning people who have not really done their homework or they are done by people whose businesses depend on being able to violate other people’s privacy and they are hoping that by stoking this fire they will be able to turn the sentiment against the GDPR, to play politics. And as we all know we are in a fact-free environment when it comes to politics nowadays so anything goes. With that out of the way let’s look at some of the actual impact of the GDPR, at what level your exposure most likely is and how - according to me - the future will play out.

… to be continued, hopefully on Monday …


I love cycling. Sometimes I think I love it just a little bit too much. Such as the day two years ago that I took my Zephyr Lowracer for a spin and ended up in the hospital with my right leg broken in way too many pieces. It was a pretty harsh experience, one second you’re fine, the next you are flying through the air (speedbumps work on bicycles too…) and before I’d landed I already realized this was going to be a bad one. No disappointment on that front. Basically my foot was wrenched from my leg, the right leg dropped from the pedal, slowing it down and so then the bike with me on it went over that. Not a good sound, to hear your bones snapping. Before the ambulance arrived I’d set my foot myself (don’t try this at home). The ambulance guys were a bit surprised that they were called in for a broken leg because it all looked pretty good to them.

After a very gingerly ride to the hospital with an air splint around the whole affair I ended up being operated on for a really long time. Thanks to the wonders of modern medicine, the super nice nurses, caretakers, doctors and surgeons of the hospital in Hilversum I was out of the hospital in two days with about half a Home-depot hardware department in my leg. Steel plates and lots of screws. Now to heal. That took a long time, in fact it is work in progress. After a few weeks of doing nothing except hopping around on one foot and using crutches I took to cycling again. Very very carefully. This worked but it didn’t take me long to realize that I was scared out of my wits of falling again. One of the problems of being a business owner is that you really can’t afford downtime. We didn’t miss a single job because of my little accident but that was mostly because of fantastic timing, it happened in the middle of the holiday season. By the time the first customers returned I was on crutches. Another week later I was walking - slowly. But I did notice that recovery went rather slower than I wanted. So more cycling, but no more room for accidents.

After a few weeks of this I realized that it was going well as long as I could keep it up, every day at least an hour. But if I slacked off for work, weather or other reasons it was back to square #1 with everything stiff and painful.

A couple of weeks ago I finally had enough of this and decided to do something about it rather than to be frustrated and I started to research indoor options. Many of them are way too expensive and seem to be mostly designed to look good to justify their price. I didn’t care much for that. Some more searching on the various forums and I realized I needed a hybrid between a bike and a hometrainer. For a bit I messed around with duct-tape-and-glue solutions and then the other day an associate at one of the companies I work for (thanks Reinder!) mentioned the Tacx brand again, I’d already run into them before. So I got one of those, the simple ones are pretty cheap, and serve my needs well.

One of the things I noticed right away after getting it all to work was that it was extremely boring. No road to watch, no goal to strive for, no weather, just a blank wall. Listening to music helps, but it is only so much of a motivator. So what I ended up doing is this: I ‘trainified’ VLC.

Here is how it works: You want to watch a movie? Fine, but you’ll have to work for it. A minimum speed is set, say 30 km/h, and only when you are above that speed does the movie run normally. There is a movie called ‘Speed’ around that theme, but there things blow up if you drop below the minimum speed. That wouldn’t go down well with the neighbors so instead what happens is that as soon as you drop below the minimum speed the movie slows down, and the sound with it. So the only way to really watch the movie is to keep going, otherwise you start missing bits. For people with perfect pitch: watch music videos, you will be extremely motivated ;)

VLC has some excellent features for this, for one you can switch off audio stretching, which means the audio changes in pitch as soon as you slow down, for another there is a remote control feature that allows you to specify a ‘playback’ rate as a floating point value. A little transmitter on the bicycle counts wheel revolutions and a python script then converts that data to the rate which is sent on to VLC. That’s all, and it works pretty well. A two hour movie is a pretty good exercise this way.
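
To give an idea of what that conversion amounts to, here is a rough sketch in Python. It is not the script I actually use. The wheel circumference, the proportional slow-down, the floor of 0.25 and all the function names are assumptions made for this example, and I am assuming the remote control interface accepts a line of the form ‘rate’ followed by a value.

```python
WHEEL_CIRCUMFERENCE_M = 2.1  # assumed value for a road bike wheel
MIN_SPEED_KPH = 30.0         # the minimum speed mentioned above
RATE_FLOOR = 0.25            # assumed lower bound so playback never stalls


def speed_kph(revolutions: float, seconds: float) -> float:
    """Convert a wheel revolution count over a time window to km/h."""
    return revolutions * WHEEL_CIRCUMFERENCE_M / seconds * 3.6


def playback_rate(kph: float) -> float:
    """Normal speed at or above the minimum, proportionally slower below it."""
    if kph >= MIN_SPEED_KPH:
        return 1.0
    return max(RATE_FLOOR, kph / MIN_SPEED_KPH)


def rc_command(kph: float) -> str:
    """Format the line to send to the remote control interface."""
    return f"rate {playback_rate(kph):.2f}"


if __name__ == "__main__":
    # Hypothetical sensor readings: (revolutions, seconds).
    for revs, secs in ((8, 2.0), (4, 2.0), (0, 2.0)):
        print(rc_command(speed_kph(revs, secs)))
```

Each printed line is what ends up being piped to VLC, so the script only has to write one such line to standard output every time a new sensor reading comes in.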

Here are some pictures, this is an older Koga-Miyata racing bike, bought from the local bicycle shop for a song. The rear wheel has a special tire on it that is more wear resistant against this kind of use than a normal wheel would be.

This is the speed sensor in the rear wheel:

And this is the ANT+ usb radio that connects to a USB extension cable to get it out of the zone where the computer interferes with reception:

The python code for the conversion is here, it is a slight modification of one of the demos in this repository.

The order of starting is a bit tricky, in three different shells you start VLC, the speed monitor and a telnet session to connect the two:

vlc yourmovie.mp4 --intf rc --rc-host localhost:12345

python 07-rawmessage3.py | tee out.log

tail -f out.log | telnet localhost 12345

There are very fancy programs and online options to make something like this work but they are all centered around simulating cycle trips and races and I can’t get myself to pretend I’m cycling outdoors when I’m indoors and it seems like a giant waste of time. Not that movies are much better but it does seem a bit more justifiable and the penalty for going slow is much more realistic than the penalty for cycling slow in a pretend trip or race.

So, instead of gamifying the real world trainifying entertainment. The same trick applies to all kinds of other fitness hardware, such as running machines and rowing machines. As long as you can find a part of the machine that you can attach one of those sensors to you’re good to go. Maybe a next version of this setup will include a generator on the rear wheel coupled to a bunch of fans to simulate a headwind.

Happy training. And stay away from Low Racers. If anybody is interested in a Zephyr…