As published on The Startup, Medium’s largest active publication, and curated by Medium for their AI, World and Technology homepages.
Issues with the A-Level results algorithm have already been detailed across traditional and social media. But what does this mean for the technology sector? And what can we learn from the response?
On the 13th August, A-Level students received their 2020 results. This year would be like no other. In the midst of a global pandemic, exams were cancelled, leaving a huge question: ‘How should A-Level grades be calculated?’
This is, unsurprisingly, a complex question to answer. Predicted grades have historically exacerbated discrimination in education: studies show that BAME and working-class students are disproportionately impacted by predicted grades from teachers whose views often reflect societal biases. On the other hand, the cancellation of exams put unprecedented weight on students’ prior work, which raised two problems. Firstly, major changes to the curriculum over the last few years mean that students’ final grades are largely weighted towards their final exams, in contrast with the prior modular approach. Secondly, it’s hard to get a sense of how students might perform in final exams based on previous assessments; different learning styles mean that students’ exam grades may not align with their assessment grades, and different teachers or schools will vary in the support they provide in formative assessments.
So, on the surface of things, an algorithm to help calculate A-Level grades may seem like a smart idea. But a key challenge, besides the development of the algorithm itself, is how students understand their grading. Traditionally, students have spent hours looking at past papers with their teachers, reviewing assessment criteria, and learning to apply their skills in exams accordingly. Students know how they will be marked, even before they set foot in the exam hall. With an algorithm, all of this changes. For starters, the term ‘algorithm’ can be used to mean anything from a simple 1+1=2 to a programme that tells you a tumour is cancerous based on a photo. An algorithm is simply a set of rules or calculations, but the fluidity of the term adds a layer of opacity to decision-making processes. Furthermore, terms like ‘algorithm’ and ‘statistical analysis’ can feel intimidating to students who are rarely taught what they mean. Even for those with a good grasp of computer science or statistics, neither Ofqual nor the Government was forthcoming with details of how the algorithm works, nor did they articulate this in simple language. This fails a key tenet of the EU’s Guidelines for Trustworthy AI, which emphasise the importance of transparency in ethical data-driven decision-making. Although the A-Level algorithm doesn’t use AI, the EU principles can apply to any system that automates decision-making, and they provide clear guidance to ensure these systems are lawful, ethical and robust.
The outrage directed at the grading algorithm from traditional and social media shows the public’s awareness of these issues. Pressure has mounted enough to persuade the Government to change its policy. But what does this mean for the technology sector? And what can we learn from the response?
First, the bias. Headlines about algorithmic bias are nothing new: the Apple Card scandal and the discriminatory Amazon recruitment algorithm are just two of the stories covered by major news outlets. Ben Park, Director of AI at Sopra Steria, comments: “As an advocate for the use of artificial intelligence, I recognise the inherent benefits that the technology can bring mankind. Used properly it will transform many lives for the better. But the ripple effect created by the use of poorly designed algorithms is potentially huge.”
The scale and impact of the bias in the A-Level results algorithm point towards the increasing presence of algorithmic decision-making in our everyday lives, as well as the public’s increasing awareness of the challenges it raises. The A-Level algorithm was unfair in a multitude of ways, as has been detailed by countless articles. To name a few, the algorithm disproportionately impacted students from state schools, students in larger cohorts, and students who chose specific A-Level subjects. Considered in the context of the EU guidelines, the algorithm fails three key principles:
- Transparency: “AI systems and their decisions should be explained in a manner adapted to the stakeholder concerned”
- Diversity, non-discrimination and fairness: “Unfair bias must be avoided”
- Societal wellbeing: “AI systems should benefit all human beings”
Let’s take an example to illustrate this point. A student takes A-Level History at her local comprehensive, where grades fluctuate year-on-year as pupils and their abilities change. On average, over the last three years, 3 students in the class of 30 have received a U grade. This might represent 2 students in 2017, 1 student in 2018 and 6 students in 2019. This year, the student is in a high-performing class, where her grade is the lowest: a C. Since the school’s results must reflect its historical attainment, the 3 students in the class with the lowest grades must be given a U. So, the student’s result can be dropped 3 grades to give her a U. How does this conflict with the principles for trustworthy AI?
- Transparency: The student has no visibility of how the grading algorithm works until the Ofqual report is published on the 13th August, when she has to wade through pages of detail to understand the system.
- Diversity, non-discrimination and fairness: The student, although predicted a C grade, is penalised for the past performance of her peers in a low-performing school. She is a victim of a bias in the algorithm, as are two of her classmates. Had these same students attended a different school, perhaps in a more affluent area where historical grades had been higher, their grades wouldn’t have been downgraded at all.
- Societal wellbeing: The student’s prospects are severely impacted. Despite having maintained a C grade in her previous assessments, she is no longer able to attend the university she planned to go to, as the grade provided by the algorithm does not fulfil the requirements.
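The worked example above can be sketched in a few lines of Python. This is a deliberately simplified illustration of the History class scenario only, not Ofqual's actual model. The function names, the student identifiers and the make-up of the class (27 students predicted a B, 3 predicted a C) are assumptions introduced for illustration.

```python
from typing import Dict, List

# A-Level grades ordered from best to worst.
GRADES: List[str] = ["A*", "A", "B", "C", "D", "E", "U"]


def impose_historical_u_quota(predicted: Dict[str, str], u_quota: int) -> Dict[str, str]:
    """Hypothetical rule: force the `u_quota` lowest-ranked students down to a U."""
    # Rank students from weakest to strongest predicted grade.
    ranked = sorted(predicted, key=lambda name: GRADES.index(predicted[name]), reverse=True)
    awarded = dict(predicted)
    for name in ranked[:u_quota]:
        awarded[name] = "U"
    return awarded


def grades_dropped(before: str, after: str) -> int:
    """Number of grade boundaries between the predicted and the awarded grade."""
    return GRADES.index(after) - GRADES.index(before)


# U grades awarded in 2017, 2018 and 2019, as in the example.
historical_u_counts = [2, 1, 6]
u_quota = round(sum(historical_u_counts) / len(historical_u_counts))

# Assumed class of 30: nobody is predicted lower than a C.
predicted = {f"student_{i}": "B" for i in range(27)}
predicted.update({"student_27": "C", "student_28": "C", "student_29": "C"})

awarded = impose_historical_u_quota(predicted, u_quota)
print(u_quota)                                                           # 3
print(awarded["student_27"])                                             # U
print(grades_dropped(predicted["student_27"], awarded["student_27"]))    # 3
```

Nothing in this rule looks at the individual student's own work beyond her rank in the class. The quota comes entirely from previous cohorts, which is why the same student at a school with no historical U grades would have kept her C.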
As we know, the Government’s change in stance in response to the public outcry means that the student is now able to attend the university she expected to go to. But it raises an important point: when considering the impact of algorithmic decision-making, it’s not enough to only consider the accuracy of the system. Perhaps, in this student’s classroom, the average grade was close to what it might have been in a pre-pandemic world. However, the impact of under-prediction is significantly more severe than the impact of over-prediction, notwithstanding the fact that educational assessments in themselves have often been shown to display bias.
A student who is downgraded from a C to a D may be prevented from going to university, whereas a student who is upgraded from a C to a B is unlikely to face significant challenges in a university course where the requirements are to achieve a
When designing an algorithm, organisations must not only consider ‘how accurate is this algorithm?’, but also ‘what is the impact of inaccuracy in this algorithm?’ and ‘how might I compensate for these impacts within
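One way to make the difference between ‘how accurate?’ and ‘what is the impact of inaccuracy?’ concrete is to score the same errors twice: once symmetrically, and once with under-prediction weighted more heavily. The sketch below is illustrative only. The function names and the 3:1 weighting are assumptions, and the ‘deserved’ grade is a stand-in for a result that, in 2020, nobody could actually observe.

```python
from typing import List, Tuple

# Grade points from worst to best, so a negative error means a downgrade.
GRADE_POINTS = {"U": 0, "E": 1, "D": 2, "C": 3, "B": 4, "A": 5, "A*": 6}

GradePair = Tuple[str, str]  # (deserved grade, awarded grade)


def signed_error(deserved: str, awarded: str) -> int:
    """Positive when the student is over-predicted, negative when under-predicted."""
    return GRADE_POINTS[awarded] - GRADE_POINTS[deserved]


def mean_absolute_error(pairs: List[GradePair]) -> float:
    """Symmetric accuracy measure: every grade of error counts the same."""
    return sum(abs(signed_error(d, a)) for d, a in pairs) / len(pairs)


def asymmetric_cost(pairs: List[GradePair],
                    under_weight: float = 3.0,
                    over_weight: float = 1.0) -> float:
    """Impact measure: downgrades cost more than upgrades (weights are assumed)."""
    total = 0.0
    for deserved, awarded in pairs:
        error = signed_error(deserved, awarded)
        if error < 0:
            total += under_weight * abs(error)
        else:
            total += over_weight * error
    return total / len(pairs)


downgraded = [("C", "D")]  # may lose a university place
upgraded = [("C", "B")]    # unlikely to cause serious harm

print(mean_absolute_error(downgraded), mean_absolute_error(upgraded))  # 1.0 1.0
print(asymmetric_cost(downgraded), asymmetric_cost(upgraded))          # 3.0 1.0
```

Both cases are off by exactly one grade, so a plain accuracy metric cannot tell them apart. Only the weighted measure reflects that the downgrade does more damage, and choosing those weights is a human, ethical decision rather than a statistical one.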
Secondly, the response to the scandal has highlighted the need for better public engagement in the technology sector. On the 13th August, Twitter was rife with students saying that the results were the fault of the algorithm, and Boris Johnson blamed the fiasco on a ‘mutant algorithm’. This personification of code, written by humans and based on criteria developed by politicians, illuminates the need for wider education on the practices that go into developing technology, as well as the importance of engaging with users of automated decision-making systems.
This point was perfectly put by @MelanieStefan in this series of Tweets:
and algorithms don’t “get it wrong”, they do exactly what they are told to do
[…] Is the algorithm massively unfair? Yes. But this is not just something that
happened. It was built that way. It’s the people who designed the algorithm who
got it wrong.
The opacity of the sector (exacerbated by the overuse of jargon, the exclusivity of expensive training programmes, and poor public sector education around technology) is damaging to all. Responses to the A-Level results perfectly illustrate the barriers that the sector has put up to understanding these technologies, and it’s our sector’s responsibility to take them down. Ben Park reflects: “Solutions based on algorithms have the potential to change the world for the better, but they can also amplify the flaws of the humans that develop them. Thus, the process for development must be strictly managed with a clear set of ethical principles and a repeatable design framework that enforces them.”
Organisations must consider, ‘who am I creating this for, and how is it benefitting them?’, ‘how do my users understand the decisions I make about them, and what opportunities do they have to challenge them?’, and ‘who is accountable for the impact of the algorithm we develop?’
The questions raised here may not provide all the answers, but they offer a useful starting point for technology companies to understand their role in the A-Level results scandal, and what they can do to start addressing the issues it’s rooted in. As technology and algorithmic decision-making become more widespread across both public and private institutions, public pressure to answer these questions will only increase.
At Sopra Steria, we have a team of consultants who are dedicated to helping organisations across all sectors solve some of the most complex challenges around Digital Ethics and AI. We have created a systematic way of defining the ethical application of technology with organisations, which maximises the opportunities and minimises the risks associated with products and solutions, with a focus on the human and environmental impact. We combine cutting-edge academic research with leading standards, tools and methodologies to understand a business’s unique context.
For more information and to meet our lead Digital Ethics Consultants please visit our dedicated web page here