Response to Geoff Mulgan’s “Will open data be a damp squib?”

Geoff’s piece, “Will open data be a damp squib?”, prompted me to comment. At length. And wander around a bit. So for what it’s worth…

An alternate view: the ‘value’ of open data is a function of its impact in people’s lives.

So transport and geographical data – ‘getting from A to B’ or ‘finding C’ – is unsurprisingly useful, and straightforward to monetise. Which possibly explains why government / Ordnance Survey / Royal Mail are so reluctant to relinquish their monopolies on some datasets.

Open data about the operation of government and/or public services may be useful – even invaluable – in terms of transparency and accountability, but its ‘entrepreneurial value’ is quite low. And institutions will tend to resist revealing the truly shameful, corrupt or embarrassing stuff, preferring – where they cannot avoid publication – to bury it in a blizzard of other data; the classic bureaucrat’s tactic. This, ungenerously, might also go some way towards explaining the ‘oversupply’ issue.

(Also, who says what is ‘open’? We don’t only want to know what the government is willing to tell us. We want to know what we want to know, which is why ‘open data’ should NEVER be allowed to replace, substitute for or weaken Freedom of Information.)

The most valuable data is data about people. People who can buy stuff, e.g. advertising; people who need stuff, e.g. service provision; people who can give you stuff, e.g. votes ≡ power. It is as it ever was; people as exploitable resource.

Everyone wants it. Companies will ‘give’ you loads of cool stuff for it – repackaged relationships (social networks), software, pizza tokens…

And the public services may hoover it up, form after form. Governments may even mandate it – which is what makes the database state especially dangerous. But just because personal data has been gathered in or by the public sector, doesn’t make it ‘public data’ any more than my name, address and date of birth ‘belongs’ to my bank.

Bottom line: personal data ≠ open data. There are laws about that.

And “anonymised” doesn’t get you off the hook, much as many in government and business would quite like it to. The shameful attempts to present “anonymisation” – in practice more often pseudonymisation or de-identification, as genuinely anonymised data tends not to be very useful – as an alternative to proper notification and informed consent are coming from a similar sort of self-serving, self-justifying, shallow-thinking place as the one that reckons ‘big data’ (i.e. pattern-driven prediction) is hard science, when it’s more like something between stats and artifact-discovery.

In reality, the ‘bigger’ data all gets – i.e. the more cross-referenceable datasets there are out there – the less anonymiseable it all is. And there’s maths about that (cf. Differential Privacy).
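
To make the cross-referencing point concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the records, the field names and the choice of postcode, birth year and sex as 'quasi-identifiers' are all assumptions, and it shows simple record linkage rather than Differential Privacy itself. A 'pseudonymised' extract is joined to a second dataset that carries names, and anyone whose combination of attributes is unique in both is re-identified.

```python
from collections import Counter

# Hypothetical "pseudonymised" health extract: names removed, attributes kept.
health = [
    {"pid": "a91", "postcode": "AB1", "birth_year": 1970, "sex": "F", "diagnosis": "asthma"},
    {"pid": "b22", "postcode": "AB1", "birth_year": 1970, "sex": "M", "diagnosis": "diabetes"},
    {"pid": "c37", "postcode": "CD2", "birth_year": 1985, "sex": "F", "diagnosis": "depression"},
    {"pid": "d48", "postcode": "CD2", "birth_year": 1985, "sex": "F", "diagnosis": "eczema"},
]

# Hypothetical second dataset, published separately, that does carry names.
register = [
    {"name": "Alice", "postcode": "AB1", "birth_year": 1970, "sex": "F"},
    {"name": "Bob", "postcode": "AB1", "birth_year": 1970, "sex": "M"},
    {"name": "Carol", "postcode": "CD2", "birth_year": 1985, "sex": "F"},
]

# Assumed quasi-identifiers: fields present in both datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(record):
    """Project a record onto its quasi-identifier values."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(pseudonymised, named):
    """Return {name: diagnosis} for people whose key is unique in both datasets."""
    counts_p = Counter(key(r) for r in pseudonymised)
    counts_n = Counter(key(r) for r in named)
    by_key = {key(r): r for r in pseudonymised}
    matches = {}
    for person in named:
        k = key(person)
        # Only a one-to-one match pins a diagnosis on a named individual.
        if counts_p[k] == 1 and counts_n[k] == 1:
            matches[person["name"]] = by_key[k]["diagnosis"]
    return matches


matches = reidentify(health, register)
print(matches)
```

In this toy data two of the three named people are matched to a diagnosis; the third escapes only because another record happens to share her attributes. Each extra shared field or extra dataset makes that sort of accidental cover rarer.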

Taking or using something “just because it’s there” – or, to quote the Second Data Protection Principle, that has been obtained for a specified and lawful purpose – isn’t ‘openness’. It’s theft.

I repeat: personal data ≠ open data. For, in an information society, things done to my data affect me in my life as surely as if you walked up to me and punched me in my face. You might not intend to do those things – but if you suck up or process my data and you or others make decisions based on it, I’m the one who must suffer the consequences. So I get to choose.

Personal data is my data. Not anyone else’s to exploit without my consent. It’s not ‘public’, unless I freely choose it to be – and it’s definitely not ‘open’!

Returning to my original point about functional value; health data – deeply personal and virtually impossible to “anonymise” and keep useful – is amongst the highest value data of all. (The potential for fear marketing alone must be worth billions, probably trillions if you add in sequenced DNA data.) Hence the multiple ongoing attempts right now to suck up, pass around and sell or ‘give away’ – in “anonymised” form, of course – our health data.

I agree with Geoff’s point about vested interests – that where open data has succeeded it has done so because it didn’t threaten vested interests – and also with his observation that there’s not the political will to tackle the “top down systems”, i.e. the bureaucracies, which – cognisant of information as power – institutionally tend to use information technologies to embed and extend their empires.

The rhetoric is agile and citizen-centred, the reality is an all-too-familiar attempt to redefine personal data as ‘public’ or ‘open’; to “overcome the barriers to sharing”. And government transformation – or “transformational government”, if you can remember back a few short years – isn’t about government changing itself at all. It’s about changing us.

(N.B. You will note that in this, the interests of the corporations and bureaucracies are quite closely aligned. Which makes ‘data envy’ on the part of governments all the more pernicious.)

So, if open data – i.e. information about systems and their operations – works, where’s the disintermediation in the public sector and the bureaucracies that we’ve seen in commercial supply chains? New political and bureaucratic initiatives add in yet more layers of complexity, exposing the citizen to yet more “computer says no” or “your problem doesn’t fit our solution”, paid for by paring back yet more front line staff while the back office and managerial layers metastasise and the systems integrators are laughing all the way to the (failed) bank, not even having paid their taxes…

For an example, look no further than the reengineering of the NHS: the new Commissioning Board introduces a brand new mega-bureaucracy, minimises accountability, replaces hundreds of administrative bodies with hundreds more, leaves an entire Department effectively redundant but still in place. And its first move? Abolish system-wide information governance oversight, re-write the Constitution and go for the data…

Until government proves it can properly reengineer itself, delivering genuinely citizen-oriented services without destroying the all-important human interface, it simply shouldn’t be trusted with any more of our data. Especially if it’s going to redefine what’s ours as ‘public’ or ‘open’ in the hope of a quick buck. Sorry, “stimulating economic activity”.

I understand the urgency. I’m a huge fan of entrepreneurship; I’ve been operating in that mode for the last 20 years or so. But the danger isn’t that open data is a damp squib, it’s that open data is subverted or suborned to drive the further commodification and bureaucratisation of personal data, to limit choice, control and consent*, and to make citizens less free.

*For if liberal democracy is to work we must be autonomous agents, not coerced ‘consumers’ of government or, far worse, the ‘product’ – as in “If you’re not paying for…”

Posted in choice and consent, database state, medical confidentiality, open data

The ID scheme rides again… *sigh*

A copy of the comment I left on the slides to Cabinet Office / Government Digital Service’s recent ‘SPRINT 13’ conference, Workshop 2 on “Electoral Registration Transformation”:

Please provide a human-readable transcript!

The following is just gobbledegook, e.g. ‘5. Electoral Registration after 2014 Each person Choice of digital Electoral Names Citizens registers and non digital Registration Officer added to exercise individually and routes during must verify name, electoral right to voteprovide identifying transition DOB, National register, held information to (barriers to Insurance Number locally.enable verification digital channels with DWP using IER of entitlement to removed). Digital Service. register. Adoption of ID Assurance when market developed’

With regard to Slide 5: I note the (convenient?) omission of the Query engine that will effectively federate the locally-held electoral registers – conveniently cross-matched with the NINO – that makes this *whole scheme* a direct analogue of the Home Office’s ID scheme, and Treasury’s ‘Citizen Information Project’ before that.

To call this a mere ‘electoral registration transformation’ misses the point. (Deliberate myopia or paranoid political PR?) Anyone smart enough to engineer a system like this should know that – or they shouldn’t be building population-scale systems at all. And you people aren’t stupid.

The Coalition may have scrapped the Home Office ID scheme; with this programme, Cabinet Office is bringing it back.

And in the process it is perverting some of the very principles of our democratic ‘contract’. Compelling or coercing people to vote is one thing; coercing people to *register* to vote is about building a register, not about widening participation or preventing fraud. (Fraud which in large part was exacerbated by ‘innovation’ with postal votes.)

(Slideshare’s comment system doesn’t appear to respect line breaks, so I thought I’d put a more legible copy here.)

Posted in database state, GDS, ID cards

Permanent Secretaries’ objectives, plus links

Published on 20th December 2012, the list of Permanent Secretaries’ objectives unfortunately omits to name the Department for which each person is Permanent Secretary. (Indicative of a Whitehall mindset where such knowledge is assumed, or of the extraordinary level of turnover of ‘Permanent’ Secretaries since 2010?)

The following edited list provides this and other information:

Defining operational and commercial experience for Permanent Secretaries.

Posted in transparency

‘Opening’ the National Pupil Database: a parent’s response

The Department for Education proposes to ‘widen access’ to the information held in the National Pupil Database, changing the ‘purposes’ for which it can be shared. Or in other words, it intends to share your kids’ personal data, which it sucks up every term from school systems whether you know or like it or not, with commercial organisations, the media and a whole host of others…

DfE is running a 6-week consultation on this, which ends tomorrow. Others, including the Open Data Institute and Open Rights Group (added 18/12/12) have already published their responses.

Here is mine:

1) Do you agree with the proposal to widen the purposes for which data from the National Pupil Database can be shared? Please explain the reasons for your answer.

I disagree with the proposal for a number of reasons – not least that the purposes as redrafted offer few limits to the types and number of people and organisations with which children’s (and adults’) personal data held in the National Pupil Database could in practice be shared.

I am particularly concerned by the indicative list in section 7.1, which includes “the media” and the catch-all “commercial or non-profit organisations”; a definition that covers any corporate entity, but which clearly does not exclude purely profit-driven enterprises. There appears to be no reason why that list might not also include ‘market researchers’ or ‘direct marketers’ – or for that matter ‘political parties’.

While I trust that the Department’s Data and Statistics Division and Data Management Advisory Panel do their best to protect the personal data of the millions of people for which they are custodians, broadening the purposes in the way proposed would make this extraordinarily important job much harder; maybe even impossible.

Setting aside for a moment the stated aims or intended consequences of these changes, it is always worth examining the unintended but predictable consequences.

Were the Regulations altered to stipulate that data in the National Pupil Database could be shared, some may insist on the letter of the law that they must be shared. In such circumstances, and especially if the motivations were financial gain rather than research, the protections in place could prove to be altogether too weak. A contract or set of Terms and Conditions with threat of audit may have been sufficient to manage the release of data to the academy. To pretend that these and the sanctions offered by the Data Protection Act are sufficient to disincentivise bad behaviour in a commercial context or with the media is to fly in the face of evidence.

The second reason I disagree with the proposal is because of what it fails to say. The consultation document provides a few anecdotal suggestions in section 5.1 but no evidence or detail of how any of the “potential uses” would deliver benefit or to whom. Of course, providing superficially accurate ‘estimates’ would be meaningless, but providing some sort of outline cost/benefit and risk analyses would at least allow people to gauge the Department’s argument and thinking. As it stands the ‘argument’ is little more than a set of assertions and the proposal appears to boil down to, “We’ll suck it and see”.

I was struck by the complete absence from the entire document of the description ‘personal data’. This is a significant and damning omission. The data in the National Pupil Database is variously referred to as “rich data”, “government data”, “pupil data” and just once as “sensitive data”. At individual level, the data in NPD are indisputably personal data. Though interspersed with administrative and other data, this is information about my children, their circumstances and characteristics and my home. That this is not consistently referred to as personal data gives me cause for concern.

Which leads me to my third point; informed consent. I am fully aware that much of the data in the National Pupil Database is harvested via statutory gateway, which in law provides an exemption for processing data without consent. But what the Department chooses to present as “minor amendments” to the Regulations in fact represent a hugely significant change in what may be done with the personal data – some highly sensitive – which have been collected in this way.

To claim: “The Department makes it clear to children and their parents what information is held about pupils and how it is processed, through a statement on its website. Schools also inform parents and pupils of how the data is used through privacy notices” is, quite frankly, a joke.

(I would be very interested to see the annual traffic logs / unique visits for the web page(s) on which the Department’s “statement” resides. As far as I can tell, the page ‘National pupil database: How is the data used?’ is merely a re-ordering of the four bullet-points that appear in section 3.1 of the consultation document. Three of the six pages explaining NPD are concerned with accessing the data.)

I am probably quite unusual in having read the “privacy notices” sent from both my children’s primary and secondary schools, generally amongst a swathe of other papers and forms at the beginning of a school year or induction. These notices may have provided some information, but they certainly did not properly inform me about the National Pupil Database and what is done with the data on it. And I doubt anyone who wasn’t specifically looking for it would notice any change to the boilerplate from one year to the next.

According to the Second Data Protection Principle: “Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.”

The proposed changes are not just a minor ‘loosening’ of the limits to sharing. The amendment quite clearly changes the specified purpose for which at least some of the personal data is collected – e.g. via the School Census – at the same time vastly increasing the type and number of bodies and organisations which may access the data and the ways in which they may process them.

That the reach, scope and scale of the National Pupil Database have been extended time and again over the years is bad enough. To try to pass off these currently proposed changes as “minor” is as cavalier as it is deceptive.

In summary:

1) The proposed change offers few practical limits to the type or number of organisations which may apply for access; the protections in place are insufficient for the marketplace the Secretary of State appears to wish to create.

2) The Department provides no evidence, cost/benefit or risk analysis; its argument is mere assertion and it appears to have adopted a “suck it and see” approach.

3) The amendment as it stands is a major change to the specified purpose for which (some) personal data is being gathered; to use the “lawful” exception yet again is a perversion of principle.

So what do I suggest? Stop! Go back to the beginning. Think again.

I urge you not to go down this route at all. But if you must, for something that would clearly affect every child in state education, many who have gone through it and their families now and for generations to come, the only reasonable way forward is to conduct a full and proper consultation (I say more on this below) and to allow Parliament the opportunity to debate a far more coherent, robust and properly-evidenced proposal, in primary legislation.


2) How could you or your organisation potentially use the data?

I or my children might like to see a copy of the information that is held about them, but I assume we already have the right to do this through a Subject Access Request.

I would very much like to have a complete list of every body or agency that has accessed my children’s data. I do not know why the Department does not publish this information on its website – it is one of the most obvious things about NPD data sharing that children and parents are likely to want to know.


3) What do you see as the benefits of widening the purposes for which data can be shared?

I see significant risks and no attempt to model them, let alone the cost/benefit. I do not propose to feed the Department’s fantasies.


4) Do you have any other comments you would like to make about the proposals in this consultation document?

No. I hope I have made myself clear.


5) Please let us have your views on responding to this consultation (e.g. the number and type of questions, whether it was easy to find, understand, complete etc.).

My other comments relate to the consultation process itself.

As you say below, Cabinet Office’s current Consultation Principles state: “Departments will need to give more thought to how they engage with and consult with those who are affected.”

A six-week online consultation in the busy run-up to Christmas is neither an appropriate nor sufficient way in which to engage with and consult those who would be affected by these proposals. As much as the Department appears to wish to downplay it, the proposed amendment represents a major change in what is intended to be done with the personal information – some highly sensitive – of millions of children, some now adults, and their families. It appears any thought given to this consultation process was on the basis of how quickly it could be hustled through.

Based on people I have spoken with, I strongly suspect that many if not most children and parents are still quite unaware of the National Pupil Database. But, again only anecdotally, I believe many parents and children would be very concerned at the notion that their personal information – gathered without their consent, simply by virtue of them attending school – might be shared with, say, commercial companies and the media.

At the very least, details of this consultation should have been sent to every school with instructions to notify parents that a significant change to arrangements regarding their children’s personal data was being proposed. (Neither head at either of my children’s schools had heard anything about this.) And, for any change that would affect so many, a period of at least half a term should be the minimum allowed for people to respond. Preferably not just before a holiday!

As is often the case, the questions and language in this consultation are skewed towards the positive with no mention of risks or potential disbenefits. I hardly expect this to change, but will continue to comment in the hope of improvement.

Posted in National Pupil Database

The other Comms Data report

Quick note if you’ve come here from Twitter: I joined Twitter a couple of months ago and am still learning the ropes. @frabcus pointed out that my timeline is hard to read, so I’ve now Storified my relevant Tweets from Tuesday night – along with those of a few others who were also commenting as they read the Joint Committee’s report on the draft Communications Data Bill.

While the Joint Committee has been scrutinising the government’s draft Communications Data Bill, the Intelligence and Security Committee (ISC) has been conducting a parallel inquiry into the use of communications data by the intelligence and security Agencies. [@smithsam points out I should clarify that the ISC’s inquiry was also into the draft Bill.]

On Tuesday it published the conclusions of its investigation. The ISC takes pains to point out that it has a different frame of reference from the Joint Committee, and states:

We have taken detailed evidence, much of which is highly classified as it relates to the current capabilities – and lack of capabilities – of our intelligence Agencies. We have sent a classified report on our findings to the Prime Minister. However we are conscious that the question of access to communications data is one which is generating significant public debate – and rightly so, since any intrusion into an individual’s personal life should not be done lightly. We are, therefore, intending to publish in due course as much of the content of that report as possible.

The ISC is a cross-party body of peers and MPs. It has had an opportunity to look at evidence the Joint Committee has not. Though the language of its summary is quite naturally circumspect, there are some striking parallels in its conclusions:

5. Turning to the draft Bill, we strongly recommend that more thought is given to the level of detail that is included in the Bill, in particular in relation to the Order-making power. Whilst the Bill does need to be future-proofed to a certain extent, and we accept that it must not reveal operational capability, serious consideration must be given as to whether there is any room for manoeuvre on this point: Parliament and the public will require more information if they are to be convinced.

i.e. Clause 1 is much too broadly drafted (cf. paras 287-297, JC summary of recommendations). The report continues:

6. We have similar concerns regarding the background information accompanying the Bill. Whilst we recognise the need to take action quickly, the current proposals require further work. In particular, there seems to have been insufficient consultation with the Communications Service Providers on practical implementation, as well as a lack of coherent communication about the way in which communications data is used and the safeguards that will be in place. These points must be addressed in advance of the Bill being introduced.

i.e. there has been a failure to consult (cf. paras 284-286, JC Report) and the Home Office has failed to provide a proper explanation of how communications data will be handled in practice and how oversight and other ‘safeguards’ will actually work (cf. paras 300-317, JC Report). The message is perfectly clear: ‘not good enough, try again’.

The ISC also notes:

We do not believe that there is any benefit in providing superficially precise estimates of the size of this ‘capability gap’: unless there is a demonstrable basis for such figures they can be misleading.

Ah, superficial precision without evidence – a speciality of Home Office figures! Of course, this was also noted by the Joint Committee (cf. paras 34-39, JC Report). And there’s more on ‘the gap’:

We therefore welcome the decision by the Home Office to make public information on the three core elements of the gap: subscriber details showing who is using an Internet Protocol address; data identifying which internet services or websites are being accessed; and data from overseas Communications Service Providers who provide services such as webmail and social networking to users in the UK. This is a positive step. However, we recommend that more thought is given as to whether this can be reflected on the face of the Bill.

i.e. the core purposes of the scheme aren’t reflected in the wording. When the raisons d’être for a piece of legislation fail to appear on the face of the Bill, it is a bad Bill. Full stop. We live under the rule of law, which means the law must be clear and explicit; law by insinuation should not stand.

Unsurprisingly and sort-of-understandably, the ISC is of the opinion that judicial oversight for the security and intelligence Agencies is unnecessary. Sticking to its frame of reference, it remains mute as regards police or other bodies’ access to detailed dossiers on the communications behaviour of every person in the UK:

Any move to introduce judicial oversight of the authorisation process could have a significant impact on the intelligence Agencies’ operational work. It would also carry a financial cost. We are not convinced that such a move is justified in relation to the Agencies, and believe that retrospective review by the Interception of Communications Commissioner, who provides quasi-judicial oversight, is a sufficient safeguard.

Though the ISC may be strictly correct in describing the IoCC as ‘quasi-judicial oversight’, you don’t have to read too far between the lines to see that the Joint Committee clearly didn’t think very much of him (paras 187-199, JC Report).

Moving on past the rather chilling sentence, “While legislation is not a perfect solution, we believe it is the best available option”, the ISC again points out that the Home Office can’t expect communications companies to snoop on its behalf without actually stating in law that they will be required to do so, and on what basis:

Whilst we recognise the UK Communications Service Providers’ concerns, we believe they would be willing to co-operate in deploying Deep Packet Inspection technology to obtain third-party data. We are however sympathetic to their argument that the Home Office should have to demonstrate due diligence before resorting to the use of Deep Packet Inspection to collect communications data from overseas Communications Service Providers, and we recommend that this should be reflected on the face of the Bill.

It may be worth noting that the Communications Service Providers are backing the Deputy Prime Minister’s call for the Home Office to go “back to the drawing board”. Though some, including Jimmy Wales, are more outspoken against the scheme than others, they are all going to need a ‘get out of jail free’ card if this scheme is to proceed. The retributive risks of being known snoopers on ‘third party data’ from repressive regimes may not have escaped some. Nor the reputational and other risks of some possible domestic effects.

In discussing the ‘Request Filter’ – the search engine intended to mine what would be, in practice, a distributed database of detailed information about everyone in the UK (para 113, JC Report) – the ISC suggests that in the hands of the Agencies it may mitigate “collateral intrusion”. This reading is more optimistic than the Joint Committee’s, largely by virtue of what it omits: the Joint Committee deploys the same euphemism but sees this “Government owned and operated data mining device” in the hands of other bodies as providing the temptation to go on “fishing expeditions” (para 126, JC Report).

The ISC highlights some of the complexities of implementation, both bureaucratic and technical, ending its delicately-worded advice with an assessment we know all too well to be true:

The technology seems to exist to provide this. It will be a significant challenge to integrate the numerous data sets from different Communications Service Providers to make the filter work, as well as manage the expectations of the various Departmental and Agency stakeholders. The record of government in managing such complex IT projects is mixed at best.

For a Parliamentary body which has had access to highly confidential material, dealing with actual national security issues, to arrive at such similar conclusions regarding the draft Bill as the Joint Committee is remarkable.

Or maybe it isn’t.

The draft Communications Data Bill is a monstrosity, the scheme behind it even worse. Anyone who can see beyond their own nose (or self-interest) should be able to discern that. As I’ve said elsewhere, it is over-reaching, poorly drafted, ill-defined, not based on evidence or proper consultation and misleadingly costed. But the answer is not a re-write, as many seem to be suggesting. What is required is a fundamental re-think of surveillance law. And I’m not the only one who thinks so*.

The machinery is broken. We can’t fix it by slapping on another kludge. And we certainly shouldn’t let the outfit that has bodged the job so badly time and again anywhere near it. (Yes, Home Office – I mean you!)

The Joint Committee itself said, “The language of RIPA is out of date and should not be used as the basis of new legislation” (para 167, JC Report). The current legislation is not fit for purpose; the government must sort that out properly before it even considers any more.

*In the interest of full disclosure, I have been doing some work with the Open Rights Group these past months on this issue. Even if I hadn’t, I would agree with Pete’s article 100%.


Posted in communications data