The National Pupil Database (NPD) is a database controlled by the Department for Education in England, based on multiple data collections from individuals aged 2-21 in state-funded education and higher education. Data are matched using pupil names, dates of birth and other personal and school characteristics, including special educational needs, disability, and indicators for free school meals, children in care, and families in the armed forces. Personal details are linked to pupils' attainment and exam results over a lifetime of school attendance.
In October 2018 the database contained over 21 million individual named pupil records. It is deemed by the Department to be “one of the richest education datasets in the world”.[1] This is just one of the distributed datasets that the Department for Education controls, and is separate from the further Individualised Learner Record (ILR) in the Learning Records Service, for example.
Schools use Management Information Systems (MIS) to collect and analyse pupil level information at local level. Data from these systems are used to complete the school census returns, which are provided to Local Authorities (regional) or directly to the Department for Education (national) three times a year, once each term. Over time, the National Pupil Database has expanded both in the range of items collected and in the age range of the children covered. Data, once stored in the National Pupil Database, are never deleted.
The Higher Education Statistics Agency passes students' personal confidential data collected from universities to the Department for Education, where it is linked to individuals' school records in the National Pupil Database, expanding the lifetime record for millions of people that the Department retains indefinitely.
The National Pupil Database covers only pupils in state (or partially state-funded) schools in England. However, similar systems operate across the rest of the United Kingdom.
The following data sources make up the linked set of data that forms the National Pupil Database, shown with the ages of the children covered by each source.[5]
Data Source | Ages
School Census/PLASC | 2-18
PRU Census [historic to 2013] | 2-18
Early Years Census | 2-4
Alternative Provision Census | 2-18
Early Years Foundation Stage Profile | 2-4
Reception Baseline Assessment | 4-5
Year 1 Phonics Test | 5
Key Stage 1 SATs | 6-7
Multiplication Times Tables Test | 8-9
Key Stage 2 SATs | 10-11
Key Stage 4 Awarding Body data | 14-21
Key Stage 4 Achievement & Attainment Tables data | 15-16
Key Stage 5 Awarding Body data | 14-21
Key Stage 5 Achievement & Attainment Tables data | 16-18
Individual Learner Records (ILR) | 14-21
HESA data | 17-21
Children Looked After | 0-18
Children In Need | unborn-18
PLAMS Post-16 Learning Aims | 16-18
NCCIS National Client Caseload Information | 15-24
Independent Specialist Providers (ISP) |
The pupil level data are personal confidential data which include sensitive personal data as defined by the Data Protection Act 1998.[6] The National Pupil Database contains:
There are about 400 possible variables to collect on individual pupils. The full national code sets of all the items of data that can be collected on individual children can be downloaded from the Department for Education, listed in the common basic data set (CBDS), including health and SEND (special educational needs and disability).
For uses of Key Stage attainment datasets and School Census dataset see also England: School Census. Raw pupil level personal data are held in the Department for Education National Pupil Database (NPD). The linked datasets contain data which are identifying, and too sensitive or disclosive to be published, although these data are given out to third parties in raw form.
David Cameron announced in 2011 that the government would be “opening up access to anonymised data from the National Pupil Database […].” This was an expansion of access to other third parties, since these data had already been used extensively for many years by academic public interest researchers.
Since 2012, the Secretary of State has had powers to share raw data from the National Pupil Database under terms and conditions with named bodies and third parties who for the "purpose of promoting the education or well-being of children in England are conducting research or analysis, producing statistics, or providing information, advice or guidance", and who meet the Approved Persons criteria of the 2009 Prescribed Persons Act, updated in 2012/13.
The data, when released, are however not anonymised, but are sensitive and identifying.[8] "According to centrally held records at the time of writing, from August 2012 to 20 December 2017, 919 data shares containing sensitive, personal or confidential data at pupil level have been approved for release from the National Pupil Database. For the purpose of this answer, we have assumed the term sensitive, personal or confidential uses of information to be data shares classified as either Tier 1 or Tier 2 as set out in the National Pupil Database area on GOV.UK. [In addition] There were 95 data shares approved between March 2012 and this classification system being introduced."
In a presentation to the NPD User group in September 2016,[9] the Director of the DfE Data Modernisation group acknowledged the release of sensitive data: "People are accessing sensitive data, but only to then aggregate. The access to sensitive data is a means to an end to produce the higher level findings."
The data items for release are classed into four tiers by the Department for Education, as described in the NPD User Guide.[10] Following the change of legislation, releases of the data since 2012 from the Department for Education to third parties have not been anonymous, but have been of identifiable and highly sensitive (Tier 1), identifiable and sensitive (Tier 2), aggregated but may be identifying due to small numbers (Tier 3) and identifying non-sensitive items (Tier 4). Raw, closed data are released on a regular basis to third parties, and the majority of releases are of Tier 1 and 2 data.
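The four tiers described above can be sketched as a small data structure. This is an illustrative sketch only, not the Department's own scheme: the names Tier, is_sensitive_share and release_tier are hypothetical, and the rule that an extract takes the tier of its most sensitive item is an assumption made for the example. The treatment of Tier 1 and Tier 2 as the "sensitive, personal or confidential" shares follows the definition quoted earlier in this article.

```python
from enum import IntEnum
from typing import Iterable


class Tier(IntEnum):
    """Hypothetical labels for the four release tiers (lower number = more sensitive)."""
    HIGHLY_SENSITIVE = 1  # identifiable and highly sensitive
    SENSITIVE = 2         # identifiable and sensitive
    AGGREGATED = 3        # aggregated, but may be identifying due to small numbers
    NON_SENSITIVE = 4     # identifying, non-sensitive items


def is_sensitive_share(tier: Tier) -> bool:
    """True for Tier 1 or Tier 2, the shares counted as sensitive, personal or confidential."""
    return tier <= Tier.SENSITIVE


def release_tier(item_tiers: Iterable[Tier]) -> Tier:
    """Assumption for illustration: an extract is classed by its most sensitive item.

    Raises ValueError if no items are given.
    """
    return min(item_tiers)


# Example: an extract mixing a Tier 4 item with a Tier 2 item.
extract = [Tier.NON_SENSITIVE, Tier.SENSITIVE]
overall = release_tier(extract)
print(overall.name, is_sensitive_share(overall))
```

Ordering the tiers as integers keeps the comparison simple: under the stated assumption, a single Tier 1 or Tier 2 item is enough for the whole extract to count as a sensitive share.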
A list of completed National Pupil Database Third Party Requests, and of those in the pipeline, is published quarterly on a retrospective basis.
Government uses of the data are based on a model of data sharing, passing raw data from one location to another, which is viewed by some as 'obsolete'. Inter-departmental transfers of data include transfers to the Cabinet Office for preparation of Electoral Registration Transformation work in 2013, transfers to match participant data in the National Citizen Service, and transfers for use in the Troubled Families programme, as well as transfers to arm's-length bodies such as NHS Digital for the "What About Youth" survey, mailed home to 300,000 15-year-olds in 2014. Not all government uses of the data, such as internal use, are recorded in the Third Party Release Register. The volume of police and Home Office use, first made public through Freedom of Information requests in 2016, was first officially published by the Department in the Third Party Release Register in December 2017, under "External Organisation Data Shares".[11] Police requests were documented only as far back as July 2015. This omits police access to records before this date, as noted in a ministerial correction (HCWS272)[12] made by Nick Gibb, Minister of State for School Standards, on the numbers of pupils' data released to the Home Office and police: “Information supplied by the Data Modernisation Division of the DfE has been identified as containing incorrect facts in the response provided to Parliamentary Questions concerning the volume of children's records passed onto the police and the Home Office (PQ48634, PQ48635 and PQ52645) and in figures quoted during a House of Lords Debate on the 31 of October 2016 on the Education (Pupil Information) (England) (Miscellaneous Amendments) Regulations 2016.”
Of the 887 documented requests for identifiable data that went through the DMAP request process between March 2012 and December 2016, only 29 were for aggregated data, according to analysis by the NGO defenddigitalme. There were 15 rejected applications between March 2012 and September 2016, including a request "by mistake"[13] from the Ministry of Defence to target its messaging for recruitment marketing. Approved uses include identifying and sensitive data released to Fleet Street papers, “to pick interesting cases/groups of students,” and about 60% of applications approved (as distinct from volume of data used) for identifying and sensitive pupil level data were from think tanks, charities, and commercial companies.[14]
The Telegraph newspaper was granted identifying and sensitive data in 2013, for all pupils in the KS2, KS4 and KS5 cohorts for the years 2008-2012.[15]
Academic uses of school census data make up about 40% of the requests for identifying, pupil level data, processed through and approved by the DMAP process. The raw data are sent to the requestor's own location. There is no charge made for fulfilling requests. "DfE does not charge for data (and has not since the NPD process began), nor does DfE charge for the processing and delivery of extracts to customers."
There is, however, also no transparency about how many children's data have been given away in approved uses, because “the Department does not maintain records of the number of children included in historic data extracts.” (PQ109065)[16]
Public interest research use of pupil level data through other routes of access includes projects linking individual data together with other education and employment data from citizens' interactions with other government departments and public services. For example, the LEO dataset is made up of information from the National Pupil Database (NPD), the Individualised Learner Record (ILR), the Higher Education Statistics Agency (HESA), Her Majesty's Revenue and Customs data (HMRC), the National Benefit Database, the Labour Market System and Juvos, the unemployment research database. Further work by DfE compares self-reported salaries from the 2008/09 DLHE survey with earnings data from the LEO dataset coming directly from HMRC tax records.
In June 2018, the UK Parliament gave powers to the Office for Students through the Higher Education and Research Act 2017 (Cooperation and Information Sharing) Regulations 2018 (No.607)[17] to distribute personal data to thirteen third party organisations. In 2019 the Higher Education and Research Act 2017 (Further Implementation etc.) Regulations 2019[18] will expand the data that may be distributed, and will include the entirety of the National Pupil Database and Alternative Provision data. In debate,[19] Shadow Secretary for Higher Education Gordon Marsden MP asked the government whether, "it the intention of the new regulations that through the new data powers they give OfS to receive data in regulations 28 and 32 they can also enable the distribution by OfS of population-wide personal data?" The data in question, "includes the personal, confidential data of every pupil from state education since 1996, past, present and future and in perpetuity—over 25 million people, and growing every year—distribution to its own third-party prescribed persons, including potentially Pearson Education Ltd, among other commercial parties, for such wide-ranging company purposes, through the powers of last year's regulations, which set out who the OfS could give data to, and for purposes defined only by that company's memorandum and articles of association."
Since legislation changed over time to permit new uses of, and access to, personal data by new third parties, over 15 million people whose data were already in the National Pupil Database and who had left school before 2012 have not been informed how their personal data may be used, for what purposes, and by whom, as such new Regulations demonstrate.
In July 2015, the Department for Education and Home Office Border Removals Team agreed a Memorandum of Understanding[20] to share pupil data including names, date of birth, gender, home address and school address for up to 1,500 children a month, from the last 5 years of their records, for various purposes of direct interventions.
This policy became public knowledge through the expansion of the school census in October 2016 which added country of birth and nationality to the collection.
In October 2017, the Department for Education confirmed in an interview with Sky News that information obtained from the National Pupil Database was used to contact families to "regularise their stay or remove them",[21] and confirmed in January 2019 that this policy continues.[22]
An expansion of the Alternative Provision census starting in January 2018, added further sensitive data to the National Pupil Database including pregnancy, physical and mental health, and a code for young offender, as reason for transfer out of mainstream education.[23] The AP Guidance 2017-18 indicates that the age group has been lowered. "Within the AP census, pupils should be aged between 2 (as at 31 December 2017) and 18 (at 31 August 2017) - those pupils born between 01/09/1998 and 31/12/2015."
Campaigners and charities warned that the changes would lead to sensitive details being collected without the knowledge of parents and pupils, in breach of data protection law, and raised concerns that "there are not enough safeguards to ensure that sensitive data does not end up being passed on to third parties and damaging the privacy of those it covers."[24]
In October 2020, the Information Commissioner's Office published an executive summary of a compulsory audit it had carried out of the Department for Education in spring 2020. The audit found that data protection was not being prioritised and that this had severely impacted the DfE's ability to comply with the UK's data protection laws. A total of 139 recommendations for improvement were made, with over 60% classified as urgent or high priority.[25]
The sharing of identifying pupils’ personal data with third parties was put on hold in May 2018 for three months. The Department for Education halted the distribution of personal information about school children in England, to restart it aligned with a Five Safes model, according to the Office for Statistics Regulation's recommendations. Although this was intended as an improvement towards safer pupil data, in spring 2019 data distribution continued, more than six months after the safer model was introduced.
The new infrastructure was part of a set of recommendations[26] made by the UK Statistics Authority in 2018, which included that the Department carry out a Data Protection Impact Assessment. A summary was published in May 2019.[27] It included recognition of the risk that people “may not be aware that their personal data may be shared with other organisations.”
In May 2019, the Department for Education released the first summary data protection impact assessment. It revealed that sexual orientation and religion are added to pupil records, for students from Higher Education.
The Information Commissioner's Office confirmed in interim investigation findings in autumn 2019 that, “This investigation has demonstrated that many parents and pupils are either entirely unaware of the school census and the inclusion of that information in the NPD, or are not aware of the nuances within the data collection, such as which data is compulsory and which is optional. This has raised concerns about the adequacy DfE's privacy notices and their accountability for the provision of such information to individuals regarding the processing of personal data for which they are ultimately data controllers.”
Access is granted through an applications process to the Department for Education's Education Division and internal Data Management Advisory Panel (DMAP), and is subject to requesters complying with terms and conditions imposed under contractual licence arrangements. The DMAP Terms of Reference were first published in July 2016 by the Department for Education, but became obsolete after a 2018 panel reconfiguration.
The Department for Education application procedures for handling requests for data from the National Pupil Database, from March 2012, enabled interested parties to request extracts of data from the National Pupil Database (NPD) using forms available on the Department for Education website. Data supply agreements, agreement schedules and individual declarations for researchers and third-party organisations who have received DfE approval for applications for data extracts are completed before users are sent the password protected data.
The sensitive and identifying items that require DMAP approval include name, date of birth, postcode, candidate numbers, Pupil Matching Reference (Non Anonymised), detailed types of disability, indicators of adoption from care, and reasons for exclusions (theft, violence, alcohol, etc.).
There is no ethics committee review for the release of identifying or sensitive data directly from the National Pupil Database by the Data Management Advisory Panel or Education Division.
There was no privacy impact assessment of the National Pupil Database for over twenty years, until 2019.[28]
Some of the history behind its collection, use and changes to legislation is outlined in a presentation given in 2013 at an Open Data Institute ODI Friday lunchtime talk: Getting to grips with the National Pupil Database. (Soundcloud licensed under a Creative Commons License.)
The legislation permitting pupil level release of individuals’ identifiable data to third parties from the National Pupil Database was updated by changes made in 2013. Section 114 of the Education Act 2005, and section 537A of the Education Act 1996, together with the 2009 Prescribed Persons Act, were amended in 2010 and 2013, to allow the release of individual children's data to third parties. Which data items are involved is based on the 2006 Act around the register data a school must hold, which has subsequently had many amendments.
The Data Protection Act 1998, in particular Principle 1, sets out a fairness obligation which cannot be set aside merely because of the presence of a legal basis such as a statutory duty. On 1 October 2015 this latter point was again made explicit for public bodies in the judgment of the Court of Justice of the European Union in the Bara case (C‑201/14), in which it ruled that “[the Directive] must be interpreted as precluding national measures…which allow a public administrative body of a Member State to transfer personal data to another public administrative body and their subsequent processing, without the data subjects having been informed of that transfer or processing,” i.e. individuals must be informed when public bodies share personal data and why.
For sensitive data (Tier 1 and Tier 2 of the National Pupil Database include all the data items classified as ‘sensitive’) an additional condition from Schedule 3 of the Data Protection Act 1998 must also be met to justify a legal basis for disclosure. These conditions set a high bar: for example, that the processing is in the interests of justice.
The Data Protection Act 1998 (s33) gives exemptions for statistical and historical research purposes, most significantly relating to the principles of indefinite retention and data minimisation, as well as Subject Access rights, for as long as data are processed for the legitimate interests of the Data Controller. To qualify for the research exemption,[29] the research must be able to comply with the following ‘relevant conditions’:
(a) that the data are not processed to support measures or decisions with respect to particular individuals, and
(b) that the data are not processed in such a way that substantial damage or substantial distress is, or is likely to be, caused to any data subject.
Campaigners from the children's privacy NGO defenddigitalme have questioned whether this legal basis is met for some releases between 2012 and 2017 from the National Pupil Database, and whether new uses put the research status of the National Pupil Database at risk.[30]
As observed in 2014 by independent experts, "the central concern is that parents and pupils themselves are not sufficiently aware of the way the data is being shared with third parties."[31] "There appears to have been no concerted effort to bring the consultation or the NPD initiative to the attention of parents or pupils."[32]
In November 2019, the ICO found: "the issues that the DfE experienced with the collection of the nationality data in terms of parent and pupil awareness of the optional nature of the collection of that data has highlighted concerns regarding compliance with articles 12, 13 and 14 of the GDPR. Our view is that the DfE is failing to comply fully with the GDPR in respect of these articles. The investigation has demonstrated that many parents and pupils are either entirely unaware of the school census and the inclusion of that information in the NPD, or are not aware of the nuances within the data collection, such as which data is compulsory and which is optional. This has raised concerns about the adequacy DfE's privacy notices and their accountability for the provision of such information to individuals regarding the processing of personal data for which they are ultimately data controllers."[33]
In November 2022 the ICO issued a reprimand to the DfE following "the prolonged misuse of the personal information of up to 28 million children."[34] John Edwards, UK Information Commissioner, stated that:
“No-one needs persuading that a database of pupils’ learning records being used to help gambling companies is unacceptable. Our investigation found that the processes put in place by the Department for Education were woeful. Data was being misused, and the Department was unaware there was even a problem until a national newspaper informed them.
“We all have an absolute right to expect that our central government departments treat the data they hold on us with the utmost respect and security. Even more so when it comes to the information of 28 million children.
“This was a serious breach of the law, and one that would have warranted a £10 million fine in this specific case. I have taken the decision not to issue that fine, as any money paid in fines is returned to government, and so the impact would have been minimal. But that should not detract from how serious the errors we have highlighted were, nor how urgently they needed addressing by the Department for Education.”