Terri’s and my talk on the National Pupil Database at the Open Data Institute

Here, on Scribd, is the PowerPoint presentation for the lunchtime lecture we gave at ODI, and here is the audio on SoundCloud. The sound is quite faint, so I made an amplified version which you can download here (27MB MP3 file).

Posted in choice and consent, database state, National Pupil Database, open data

Response to Geoff Mulgan’s “Will open data be a damp squib?”

Geoff’s piece, “Will open data be a damp squib?”, prompted me to comment. At length. And wander around a bit. So for what it’s worth…

An alternate view: the ‘value’ of open data is a function of its impact in people’s lives.

So transport and geographical data – ‘getting from A to B’ or ‘finding C’ – is unsurprisingly useful, and straightforward to monetise. Which possibly explains why government / Ordnance Survey / Royal Mail are so reluctant to relinquish their monopolies on some datasets.

Open data about the operation of government and/or public services may be useful – even invaluable – in terms of transparency and accountability, but its ‘entrepreneurial value’ is quite low. And institutions will tend to resist revealing the truly shameful, corrupt or embarrassing stuff, preferring – where they cannot avoid publication – to bury it in a blizzard of other data; the classic bureaucrat’s tactic. This, ungenerously, might also go some way towards explaining the ‘oversupply’ issue.

(Also, who says what is ‘open’? We don’t only want to know what the government is willing to tell us. We want to know what we want to know, which is why ‘open data’ should NEVER be allowed to replace, substitute for or weaken Freedom of Information.)

The most valuable data is data about people. People who can buy stuff, e.g. advertising; people who need stuff, e.g. service provision; people who can give you stuff, e.g. votes ≡ power. It is as it ever was; people as exploitable resource.

Everyone wants it. Companies will ‘give’ you loads of cool stuff for it – repackaged relationships (social networks), software, pizza tokens…

And the public services may hoover it up, form after form. Governments may even mandate it – which is what makes the database state especially dangerous. But just because personal data has been gathered in or by the public sector, doesn’t make it ‘public data’ any more than my name, address and date of birth ‘belong’ to my bank.

Bottom line: personal data ≠ open data. There are laws about that.

And “anonymised” doesn’t get you off the hook, much as many in government and business would quite like it to. The shameful attempts to present “anonymisation” – in practice more often pseudonymisation or de-identification, as genuinely anonymised data tends not to be very useful – as an alternative to proper notification and informed consent are coming from a similar sort of self-serving, self-justifying, shallow-thinking place as the one that reckons ‘big data’ (i.e. pattern-driven prediction) is hard science, when it’s more like something between stats and artifact-discovery.

In reality, the ‘bigger’ data all gets – i.e. the more cross-referenceable datasets there are out there – the less anonymiseable it all is. And there’s maths about that (cf. Differential Privacy).
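To make that concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the records, the field names and the helper functions are made up, and nothing comes from any real dataset. It shows the linkage problem (re-identification by cross-referencing), not Differential Privacy itself, which is one mathematical response to that problem.

```python
from collections import Counter

# Hypothetical "de-identified" release: names removed, other fields kept.
released = [
    {"postcode": "AB1", "birth_year": 2001, "sex": "F", "result": "A"},
    {"postcode": "AB1", "birth_year": 2001, "sex": "M", "result": "C"},
    {"postcode": "AB1", "birth_year": 2002, "sex": "F", "result": "B"},
    {"postcode": "AB1", "birth_year": 2002, "sex": "M", "result": "D"},
]

def smallest_group(records, keys):
    """Size of the smallest group of records sharing the same values for `keys`.

    A result of 1 means at least one record is unique on those fields.
    """
    groups = Counter(tuple(r[k] for k in keys) for r in records)
    return min(groups.values())

def link(records, known, keys):
    """Return the released records that match a known person on `keys`."""
    return [r for r in records if all(r[k] == known[k] for k in keys)]

# Each extra cross-referenceable field shrinks the crowd a record can hide in.
crowd_sizes = [
    smallest_group(released, ["postcode"]),
    smallest_group(released, ["postcode", "birth_year"]),
    smallest_group(released, ["postcode", "birth_year", "sex"]),
]

# A second, hypothetical dataset (say, a membership list) that does carry a name.
known_person = {"name": "Alice", "postcode": "AB1", "birth_year": 2002, "sex": "F"}
matches = link(released, known_person, ["postcode", "birth_year", "sex"])
```

In this made-up example the crowd shrinks from four to two to one as fields are added, and the single match ties the "anonymous" result to a named person.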

Taking or using something “just because it’s there” – or, to quote the Second Data Protection Principle, that has been obtained for a specified and lawful purpose – isn’t ‘openness’. It’s theft.

I repeat: personal data ≠ open data. For, in an information society, things done to my data affect me in my life as surely as if you walked up to me and punched me in my face. You might not intend to do those things – but if you suck up or process my data and you or others make decisions based on it, I’m the one who must suffer the consequences. So I get to choose.

Personal data is my data. Not anyone else’s to exploit without my consent. It’s not ‘public’, unless I freely choose it to be – and it’s definitely not ‘open’!

Returning to my original point about functional value; health data – deeply personal and virtually impossible to “anonymise” and keep useful – is amongst the highest value data of all. (The potential for fear marketing alone must be worth billions, probably trillions if you add in sequenced DNA data.) Hence the multiple ongoing attempts right now to suck up, pass around and sell or ‘give away’ – in “anonymised” form, of course – our health data.

I agree with Geoff’s point about vested interests – that where open data has succeeded it has done so because it didn’t threaten vested interests – and also with his observation that there’s not the political will to tackle the “top down systems”, i.e. the bureaucracies, which – cognisant of information as power – institutionally tend to use information technologies to embed and extend their empires.

The rhetoric is agile and citizen-centred; the reality is an all-too-familiar attempt to redefine personal data as ‘public’ or ‘open’; to “overcome the barriers to sharing”. And government transformation – or “transformational government”, if you can remember back a few short years – isn’t about government changing itself at all. It’s about changing us.

(N.B. You will note that in this, the interests of the corporations and bureaucracies are quite closely aligned. Which makes ‘data envy’ on the part of governments all the more pernicious.)

So, if open data – i.e. information about systems and their operations – works, where’s the disintermediation in the public sector and the bureaucracies that we’ve seen in commercial supply chains? New political and bureaucratic initiatives add in yet more layers of complexity, exposing the citizen to yet more “computer says no” or “your problem doesn’t fit our solution”, paid for by paring back yet more front line staff while the back office and managerial layers metastasise and the systems integrators are laughing all the way to the (failed) bank, not even having paid their taxes…

For an example, look no further than the reengineering of the NHS: the new Commissioning Board introduces a brand new mega-bureaucracy, minimises accountability, replaces hundreds of administrative bodies with hundreds more, leaves an entire Department effectively redundant but still in place. And its first move? Abolish system-wide information governance oversight, re-write the Constitution and go for the data…

Until government proves it can properly reengineer itself, delivering genuinely citizen-oriented services without destroying the all-important human interface, it simply shouldn’t be trusted with any more of our data. Especially if it’s going to redefine what’s ours as ‘public’ or ‘open’ in the hope of a quick buck. Sorry, “stimulating economic activity”.

I understand the urgency. I’m a huge fan of entrepreneurship; I’ve been operating in that mode for the last 20 years or so. But the danger isn’t that open data is a damp squib, it’s that open data is subverted or suborned to drive the further commodification and bureaucratisation of personal data, to limit choice, control and consent*, and to make citizens less free.

*For if liberal democracy is to work we must be autonomous agents, not coerced ‘consumers’ of government or, far worse, the ‘product’ – as in “If you’re not paying for…”

Posted in choice and consent, database state, medical confidentiality, open data

The ID scheme rides again… *sigh*

A copy of the comment I left on the slides to Cabinet Office / Government Digital Service’s recent ‘SPRINT 13’ conference, Workshop 2 on “Electoral Registration Transformation”:

Please provide a human-readable transcript!

The following is just gobbledegook, e.g. ‘5. Electoral Registration after 2014 Each person Choice of digital Electoral Names Citizens registers and non digital Registration Officer added to exercise individually and routes during must verify name, electoral right to voteprovide identifying transition DOB, National register, held information to (barriers to Insurance Number locally.enable verification digital channels with DWP using IER of entitlement to removed). Digital Service. register. Adoption of ID Assurance when market developed’

With regard to Slide 5: I note the (convenient?) omission of the Query engine that will effectively federate the locally-held electoral registers – conveniently cross-matched with the NINO – that makes this *whole scheme* a direct analogue of the Home Office’s ID scheme, and Treasury’s ‘Citizen Information Project’ before that.

To call this a mere ‘electoral registration transformation’ misses the point. (Deliberate myopia or paranoid political PR?) Anyone smart enough to engineer a system like this should know that – or they shouldn’t be building population-scale systems at all. And you people aren’t stupid.

The Coalition may have scrapped the Home Office ID scheme; with this programme, Cabinet Office is bringing it back.

And in the process it is perverting some of the very principles of our democratic ‘contract’. Compelling or coercing people to vote is one thing; coercing people to *register* to vote is about building a register, not about widening participation or preventing fraud. (Fraud which in large part was exacerbated by ‘innovation’ with postal votes.)

(Slideshare’s comment system doesn’t appear to respect line breaks, so I thought I’d put a more legible copy here.)

Posted in database state, GDS, ID cards

Permanent Secretaries’ objectives, plus links

Published on 20th December 2012, the list of Permanent Secretaries’ objectives unfortunately omits to name the Department for which each person is Permanent Secretary. (Indicative of a Whitehall mindset where such knowledge is assumed, or of the extraordinary level of turnover of ‘Permanent’ Secretaries since 2010?)

The following edited list provides this and other information:

Defining operational and commercial experience for Permanent Secretaries.

Posted in Transparency

‘Opening’ the National Pupil Database: a parent’s response

The Department for Education proposes to ‘widen access’ to the information held in the National Pupil Database, changing the ‘purposes’ for which it can be shared. Or in other words, it intends to share your kids’ personal data, which it sucks up every term from school systems whether you know or like it or not, with commercial organisations, the media and a whole host of others…

DfE is running a 6-week consultation on this, which ends tomorrow. Others, including the Open Data Institute and Open Rights Group (added 18/12/12) have already published their responses.

Here is mine:

1) Do you agree with the proposal to widen the purposes for which data from the National Pupil Database can be shared? Please explain the reasons for your answer.

I disagree with the proposal for a number of reasons – not least that the purposes as redrafted offer few limits to the types and number of people and organisations with which children’s (and adults’) personal data held in the National Pupil Database could in practice be shared.

I am particularly concerned by the indicative list in section 7.1, which includes “the media” and the catch-all “commercial or non-profit organisations”; a definition that covers any corporate entity, but which clearly does not exclude purely profit-driven enterprises. There appears to be no reason why that list might not also include ‘market researchers’ or ‘direct marketers’ – or for that matter ‘political parties’.

While I trust that the Department’s Data and Statistics Division and Data Management Advisory Panel do their best to protect the personal data of the millions of people for whom they are custodians, broadening the purposes in the way proposed would make this extraordinarily important job much harder; maybe even impossible.

Setting aside for a moment the stated aims or intended consequences of these changes, it is always worth examining the unintended but predictable consequences.

Were the Regulations altered to stipulate that data in the National Pupil Database could be shared, some might insist, on the letter of the law, that they must be shared. In such circumstances, and especially if the motivation were financial gain rather than research, the protections in place could prove altogether too weak. A contract or set of Terms and Conditions with the threat of audit may have been sufficient to manage the release of data to academia. To pretend that these and the sanctions offered by the Data Protection Act are sufficient to disincentivise bad behaviour in a commercial context or with the media is to fly in the face of the evidence.

The second reason I disagree with the proposal is what it fails to say. The consultation document provides a few anecdotal suggestions in section 5.1 but no evidence or detail of how any of the “potential uses” would deliver benefit or to whom. Of course, providing superficially accurate ‘estimates’ would be meaningless, but providing some sort of outline cost/benefit and risk analyses would at least allow people to gauge the Department’s argument and thinking. As it stands the ‘argument’ is little more than a set of assertions and the proposal appears to boil down to, “We’ll suck it and see”.

I was struck by the complete absence from the entire document of the description ‘personal data’. This is a significant and damning omission. The data in the National Pupil Database is variously referred to as “rich data”, “government data”, “pupil data” and just once as “sensitive data”. At individual level, the data in NPD are indisputably personal data. Though interspersed with administrative and other data, this is information about my children, their circumstances and characteristics and my home. That this is not consistently referred to as personal data gives me cause for concern.

Which leads me to my third point; informed consent. I am fully aware that much of the data in the National Pupil Database is harvested via statutory gateway, which in law provides an exemption for processing data without consent. But what the Department chooses to present as “minor amendments” to the Regulations in fact represent a hugely significant change in what may be done with the personal data – some highly sensitive – which have been collected in this way.

To claim: “The Department makes it clear to children and their parents what information is held about pupils and how it is processed, through a statement on its website. Schools also inform parents and pupils of how the data is used through privacy notices” is, quite frankly, a joke.

(I would be very interested to see the annual traffic logs / unique visits for the web page(s) on which the Department’s “statement” resides. As far as I can tell, the page ‘National pupil database: How is the data used?’ is merely a re-ordering of the four bullet-points that appear in section 3.1 of the consultation document. Three of the six pages explaining NPD are concerned with accessing the data.)

I am probably quite unusual in having read the “privacy notices” sent from both my children’s primary and secondary schools, generally amongst a swathe of other papers and forms at the beginning of a school year or induction. These notices may have provided some information, but they certainly did not properly inform me about the National Pupil Database and what is done with the data on it. And I doubt anyone who wasn’t specifically looking for it would notice any change to the boilerplate from one year to the next.

According to the Second Data Protection Principle: “Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.”

The proposed changes are not just a minor ‘loosening’ of the limits to sharing. The amendment quite clearly changes the specified purpose for which at least some of the personal data is collected – e.g. via the School Census – at the same time vastly increasing the type and number of bodies and organisations which may access the data and the ways in which they may process them.

That the reach, scope and scale of the National Pupil Database have been extended time and again over the years is bad enough. To try to pass off these currently proposed changes as “minor” is as cavalier as it is deceptive.

In summary:

1) The proposed change offers few practical limits to the type or number of organisations which may apply for access; the protections in place are insufficient for the marketplace the Secretary of State appears to wish to create.

2) The Department provides no evidence, cost/benefit or risk analysis; its argument is mere assertion and it appears to have adopted a “suck it and see” approach.

3) The amendment as it stands is a major change to the specified purpose for which (some) personal data is being gathered; to use the “lawful” exception yet again is a perversion of principle.

So what do I suggest? Stop! Go back to the beginning. Think again.

I urge you not to go down this route at all. But if you must, for something that would clearly affect every child in state education, many who have gone through it, and their families, now and for generations to come, the only reasonable way forward is to conduct a full and proper consultation (I say more on this below) and to allow Parliament the opportunity to debate a far more coherent, robust and properly-evidenced proposal, in primary legislation.

2) How could you or your organisation potentially use the data?

I or my children might like to see a copy of the information that is held about them, but I assume we already have the right to do this through a Subject Access Request.

I would very much like to have a complete list of every body or agency that has accessed my children’s data. I do not know why the Department does not publish this information on its website – it is one of the most obvious things about NPD data sharing that children and parents are likely to want to know.

3) What do you see as the benefits of widening the purposes for which data can be shared?

I see significant risks and no attempt to model them, let alone the cost/benefit. I do not propose to feed the Department’s fantasies.

4) Do you have any other comments you would like to make about the proposals in this consultation document?

No. I hope I have made myself clear.

5) Please let us have your views on responding to this consultation (e.g. the number and type of questions, whether it was easy to find, understand, complete etc.).

My other comments relate to the consultation process itself.

As you say below, Cabinet Office’s current Consultation Principles state: “Departments will need to give more thought to how they engage with and consult with those who are affected.”

A six-week online consultation in the busy run-up to Christmas is neither an appropriate nor a sufficient way to engage with and consult those who would be affected by these proposals. As much as the Department appears to wish to downplay it, the proposed amendment represents a major change in what is intended to be done with the personal information – some highly sensitive – of millions of children, some now adults, and their families. It appears that any thought given to this consultation process was on the basis of how quickly it could be hustled through.

Based on people I have spoken with, I strongly suspect that many if not most children and parents are still quite unaware of the National Pupil Database. But, again only anecdotally, I believe many parents and children would be very concerned at the notion that their personal information – gathered without their consent, simply by virtue of them attending school – might be shared with, say, commercial companies and the media.

At the very least, details of this consultation should have been sent to every school with instructions to notify parents that a significant change to arrangements regarding their children’s personal data was being proposed. (Neither of the heads at my children’s schools had heard anything about this.) And, for any change that would affect so many, a period of at least half a term should be the minimum allowed for people to respond. Preferably not just before a holiday!

As is often the case, the questions and language in this consultation are skewed towards the positive with no mention of risks or potential disbenefits. I hardly expect this to change, but will continue to comment in the hope of improvement.

Posted in National Pupil Database