Mapping the people and citations in UK policy
Introduction
Where in the UK is academic engagement with policy coming from?
Marc Geddes' excellent paper "Committee Hearings of the UK Parliament: Who gives evidence and does this matter?" (OA copy here) looks at who gives evidence to House of Commons committees and where those witnesses are based, using a database of witnesses from the 2013-2014 parliamentary session.
It shows a clear preference for Russell Group universities (accounting for 75% of witnesses) and for universities based close to parliament, with 37% of academic witnesses coming from London and a further 22% from the south of England.
Might this hold true more broadly? In this interactive lab we'll use the citation and people mention data from Overton to check.
Use the tool below to select different sources of UK policy and see where the academics being cited or mentioned by that source are located.
Please note that this kind of data can only ever be a rough indicator - the data recipe below lists some of its limitations and potential biases. You can cite this resource as:
Choose a policy source
Results from 65,282 matching mentions and 151 universities
Most frequently seen regions (NUTS 1)
NUTS 1 | Region | As % |
---|---|---|
UKI | London | 21.3% |
UKJ | South East (England) | 15.0% |
UKM | Scotland | 12.7% |
UKE | Yorkshire and The Humber | 12.0% |
UKD | North West (England) | 7.7% |
UKH | East of England | 7.7% |
UKG | West Midlands (England) | 6.0% |
UKK | South West (England) | 4.9% |
UKF | East Midlands (England) | 4.1% |
UKL | Wales | 3.5% |
UKC | North East (England) | 3.1% |
UKN | Northern Ireland | 1.9% |
Most frequently seen regions (NUTS 2)
NUTS 2 | Region | As % |
---|---|---|
UKI3 | Inner London - West | 21.3% |
UKJ1 | Berkshire, Buckinghamshire and Oxfordshire | 10.2% |
UKM7 | Eastern Scotland | 7.2% |
UKH1 | East Anglia | 6.3% |
UKE3 | South Yorkshire | 5.0% |
UKD3 | Greater Manchester | 4.4% |
UKM8 | West Central Scotland | 3.7% |
UKE2 | North Yorkshire | 3.7% |
UKK1 | Gloucestershire, Wiltshire and Bath/Bristol area | 3.4% |
UKE4 | West Yorkshire | 2.9% |
UKG3 | West Midlands | 2.8% |
UKG1 | Herefordshire, Worcestershire and Warwickshire | 2.5% |
UKF1 | Derbyshire and Nottinghamshire | 2.3% |
UKJ2 | Surrey, East and West Sussex | 2.1% |
UKL2 | East Wales | 2.1% |
UKD7 | Merseyside | 2.1% |
UKJ3 | Hampshire and Isle of Wight | 2.0% |
UKN0 | Northern Ireland | 1.9% |
UKC2 | Northumberland and Tyne and Wear | 1.7% |
UKF2 | Leicestershire, Rutland and Northamptonshire | 1.6% |
UKM5 | North Eastern Scotland | 1.6% |
UKC1 | Tees Valley and Durham | 1.5% |
UKL1 | West Wales | 1.4% |
UKK4 | Devon | 1.3% |
UKD4 | Lancashire | 1.3% |
University groupings
Grouping | As % |
---|---|
Russell Group | 70.0% |
Other | 30.0% |
Gender
Gender determination by first name (please see notes) | As % |
---|---|
Female | 32.7% |
Male | 52.9% |
Could not be determined | 14.4% |
Most frequently seen universities
University | Count | As % |
---|---|---|
University of Oxford | 5,887 | 9.0% |
University College London | 4,258 | 6.5% |
University of Cambridge | 3,275 | 5.0% |
London School of Economics and Political Science | 3,056 | 4.7% |
University of Sheffield | 2,603 | 4.0% |
University of Manchester | 2,458 | 3.8% |
University of York | 2,415 | 3.7% |
University of Edinburgh | 2,381 | 3.6% |
Imperial College London | 2,362 | 3.6% |
University of Birmingham | 1,691 | 2.6% |
University of Leeds | 1,509 | 2.3% |
University of Glasgow | 1,463 | 2.2% |
University of Bristol | 1,421 | 2.2% |
University of Warwick | 1,356 | 2.1% |
Cardiff University | 1,290 | 2.0% |
University of Nottingham | 1,263 | 1.9% |
University of Liverpool | 1,189 | 1.8% |
University of Southampton | 1,089 | 1.7% |
University of Aberdeen | 956 | 1.5% |
University of Stirling | 940 | 1.4% |
Durham University | 898 | 1.4% |
Queen's University Belfast | 892 | 1.4% |
Newcastle University | 867 | 1.3% |
University of Exeter | 803 | 1.2% |
University of East Anglia | 803 | 1.2% |
The data - recipe, limitations & biases
Overton is a large database and citation index of policy documents collected from government, IGOs and think tanks worldwide.
We're a small start-up supported entirely by customers and collaborators. We're not externally funded, which gives us the freedom to experiment with a mix of commercial and non-profit data models: please get in touch if you'd like to use this or similar data in your own research, academic or otherwise.
Step | Limitations and biases |
---|---|
We queried the Overton database for relevant documents and retrieved approximately 202k matches from UK government sources, ranging from documents on GOV.UK to committee reports from parliament, Hansard, and clinical guidelines from NICE (sketched in code below the table) | There's no time period constraint: the Geddes study linked at the top of this page looked specifically at 2013-2014, while we're looking at all of the policy documents available to us in Overton, which are typically but not always from 2015 onwards. It's only public documents: Overton only knows about publicly available documents, so it can't see internal Civil Service documents or interactions that haven't been recorded publicly. Some local policy sources aren't tracked: Overton tracks policy documents at the UK and devolved nations level, and from large city councils - London, Greater Manchester, Edinburgh, Leeds, Liverpool etc. - but not from smaller cities, so some local interactions will be missed |
Citations of academic books and papers in each document were fetched using the Overton API (sketched below) | The humanities and some social sciences will be underrepresented: Overton works best for citations with DOIs, and many books and older papers, especially in HSS, don't have these, so they may not be picked up and won't be counted. We're looking at citations of scholarly work: Overton tracks research from think tanks and NGOs too, but in this dataset we're only looking at work in the scholarly record. That means we're ignoring any engagement academics might have through e.g. publishing a report via a think tank or foundation |
The affiliations of any UK authors of those books and papers were mapped to GRiD, a standard identifier for research-producing institutions (see the GRiD mapping sketch below) | We don't have good affiliation data for every academic: affiliation data comes from Microsoft Academic, and while coverage is good it is not complete. Some academics won't be counted, and there's no obvious pattern to this that we can compensate for. Only educational institutions are covered: we're only including data from UK-based institutions in GRiD classed as 'Education'. GRiD treats healthcare facilities separately, so experts from e.g. university-affiliated teaching hospitals are not counted |
Mentions of UK academics in government sources were fetched using the Overton API | Academics who have been mentioned but whose work hasn't been cited won't be counted: Overton only knows about academics who have been cited at least once somewhere in the policy literature, so the set is biased towards people whose work has already made it into policy. It's not just witnesses: Overton can't robustly classify mention types, so it can't tell whether somebody is mentioned because they gave evidence, because they are being quoted, because they were commissioned to write a report or because they just attended a workshop or discussion |
The affiliations of these academics were also mapped to GRiD | GRiD matching is imperfect: some policy documents use informal names or acronyms in affiliations, e.g. Edinburgh University instead of University of Edinburgh. GRiD includes many name variants but not all, so some affiliation strings may not be mapped correctly, especially for smaller institutions |
The first names of mentioned and cited academics were used to guess their gender using the genderize.io web service (sketched below) | Determining gender from first names poses ethical and technical challenges: see Stacy Konkiel's post on the Bibliomagician blog for a fuller overview of why. We're counting people mentioned or cited at least once in policy, rather than gendering each appearance: an academic may be mentioned in three different documents but her name would only be counted once. First names that were just initials - common in Overton's citation metadata - or fewer than three characters long were automatically marked "could not be determined", and we used 85% as the cutoff probability score for the guessed gender from genderize.io, following the methodology in Elsevier's Gender in the Global Research Landscape report |
GRiD includes NUTS Level 3 codes (a set of standard identifiers for different geographical regions within Europe) and these were then used to total up counts by region and to draw the map (sketched below) | GRiD has mapped some universities to incorrect (but geographically adjacent) NUTS 3 codes: thanks to @carlbaker on Twitter for pointing this out. An example is the University of East Anglia, which is placed in South Norfolk by GRiD but should actually be in Norwich and East Norfolk. We found out about this after doing the data analysis, so it hasn't yet been fixed. We had to manually tweak some regions: NUTS has been updated recently, and a few regions have been added, removed or changed, mostly in Scotland and Northern Ireland. We've tried to map old NUTS codes to the new ones as best we can. Our map uses the December 2019 boundary data file from the Office for National Statistics |
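To make the recipe a little more concrete, here is a minimal sketch of the first step: paging through matching UK policy documents via the Overton API. The endpoint path, parameter names (`policy_source`, `format`, `page`, `api_key`) and the `results` key in the response are illustrative assumptions rather than Overton's documented interface, so check the API documentation and substitute the real query syntax and your own key.

```python
# A minimal sketch of step 1: pulling UK government policy documents from the
# Overton API. The endpoint, parameter names and response shape below are
# illustrative assumptions, not Overton's documented interface.
import requests

API_KEY = "YOUR-OVERTON-API-KEY"                    # placeholder
BASE_URL = "https://app.overton.io/documents.php"   # assumed endpoint

def fetch_uk_policy_documents(source="govuk", max_pages=5):
    """Page through matching policy documents for one UK policy source."""
    documents = []
    for page in range(1, max_pages + 1):
        resp = requests.get(
            BASE_URL,
            params={
                "format": "json",
                "policy_source": source,   # assumed filter name
                "page": page,
                "api_key": API_KEY,
            },
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("results", [])      # assumed response key
        if not batch:
            break
        documents.extend(batch)
    return documents

if __name__ == "__main__":
    docs = fetch_uk_policy_documents()
    print(f"Fetched {len(docs)} documents")
```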
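The same pattern covers the citation and mention fetching steps: for each document returned above, pull out the scholarly works it cites and the academics it mentions. All field names here (`citations`, `mentioned_people`, `doi`, `authors`, `affiliation`) are assumptions about the record shape, not Overton's actual schema.

```python
# A sketch of the citation and mention steps: collect cited works and
# mentioned academics from a list of policy document records. Field names
# are illustrative assumptions about the record shape.
def extract_citations_and_mentions(documents):
    """Return (cited_works, mentioned_people) gathered across all documents."""
    cited_works, mentioned_people = [], []
    for doc in documents:
        for citation in doc.get("citations", []):
            cited_works.append({
                "doi": citation.get("doi"),
                "authors": citation.get("authors", []),
            })
        for person in doc.get("mentioned_people", []):
            mentioned_people.append({
                "name": person.get("name"),
                "affiliation": person.get("affiliation"),
            })
    return cited_works, mentioned_people
```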
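Mapping free-text affiliations onto GRiD can then be sketched as a lookup against GRiD's official names and name variants, keeping only institutions classed as 'Education'. The single record below is illustrative; in practice you would load the full GRiD data dump, which includes aliases, institution types and NUTS codes for UK institutions.

```python
# A sketch of the GRiD mapping steps: match free-text affiliation strings to
# GRiD records, keeping only institutions classed as 'Education'. The single
# record below is illustrative, not a substitute for the full GRiD dump.
GRID_RECORDS = {
    "grid.4305.2": {                       # illustrative GRiD id
        "name": "University of Edinburgh",
        "types": ["Education"],
        "nuts3": "UKM75",                  # City of Edinburgh (illustrative)
        "aliases": {"university of edinburgh", "edinburgh university"},
    },
    # ...the real dump contains tens of thousands of institutions
}

def match_affiliation(affiliation):
    """Return the GRiD id for an affiliation string, or None if unmatched."""
    needle = affiliation.strip().lower()
    for grid_id, record in GRID_RECORDS.items():
        if "Education" not in record["types"]:
            continue                        # skip hospitals etc., as in the recipe
        if needle == record["name"].lower() or needle in record["aliases"]:
            return grid_id
    return None                             # unmatched strings are simply dropped

print(match_affiliation("Edinburgh University"))   # -> grid.4305.2
print(match_affiliation("Edinburgh Napier Uni"))   # -> None (no variant match)
```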
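The gender-guessing step follows three concrete rules from the recipe: names that are bare initials or shorter than three characters are marked "could not be determined" up front, everything else is sent to genderize.io, and a guess is only kept when the returned probability is at least 0.85 (85%). genderize.io does return `gender` and `probability` fields; the request handling below is deliberately minimal, with no batching, caching or rate-limit handling.

```python
# A sketch of the gender step: skip initials and very short names, otherwise
# query genderize.io and only accept guesses with probability >= 0.85.
import re
import requests

def guess_gender(first_name, cutoff=0.85):
    name = first_name.strip().rstrip(".")
    # Bare initials ("J", "J.E.") or very short names can't be guessed reliably
    if len(name) < 3 or re.fullmatch(r"[A-Za-z](\.[A-Za-z])*\.?", first_name.strip()):
        return "could not be determined"
    resp = requests.get("https://api.genderize.io", params={"name": name}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    if data.get("gender") and data.get("probability", 0) >= cutoff:
        return data["gender"]               # "male" or "female"
    return "could not be determined"
```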
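Finally, because NUTS codes are hierarchical, the NUTS 1 and NUTS 2 totals above can be derived from GRiD's NUTS 3 codes simply by truncating to the first three or four characters and counting. The input codes in the example are real NUTS 2016 codes but the list itself is made up; drawing the map would then mean joining these totals onto the ONS December 2019 boundary file (e.g. with geopandas), which isn't shown here.

```python
# A sketch of the aggregation step: roll NUTS 3 codes up to NUTS 1 or NUTS 2
# by truncation and turn the counts into percentages.
from collections import Counter

def regional_shares(nuts3_codes, level=1):
    """Share of mentions per region at NUTS level 1 (3 chars) or 2 (4 chars)."""
    prefix_len = {1: 3, 2: 4}[level]
    counts = Counter(code[:prefix_len] for code in nuts3_codes)
    total = sum(counts.values())
    return {region: 100 * n / total for region, n in counts.most_common()}

# Example with a made-up list of codes:
print(regional_shares(["UKI32", "UKI31", "UKJ14", "UKM75"], level=1))
# -> {'UKI': 50.0, 'UKJ': 25.0, 'UKM': 25.0}
```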