Report 2170

For such a short string of algebraic symbols, there is a lot we can learn from Ofqual’s grading algorithm (though really it is an equation) – and a lot we can learn about what went wrong.

First and most obviously, the size of the algorithm is an issue. With just four distinct terms – Ckj, qkj, pkj and rj – it shows the sparseness of the inputs. This is not a “big data” solution, gathering every possible piece of information about a student in an attempt to gain a full view of their capability. In fact, it is the opposite: using the smallest possible amount of information, so it can be rapidly gathered and easily standardised.

So what are those terms? The first three are distributions of grades, k, at schools, j. Ckj is simple enough: it is the historical grade distribution at the school over the last three years, 2017-19.

That tells us already that the history of the school is very important to Ofqual. The grades other pupils got in previous years are a huge determinant of the grades this year’s pupils were given in 2020. The regulator argues this is a plausible assumption but for many students it is also an intrinsically unfair one: the grades they are given are decided by the ability of pupils they may have never met.

qkj is where the pupils’ own ability comes in. That is the predicted grade distribution based on the class’s prior attainment at GCSEs. A class with mostly 9s (the top grade) at GCSE will get a lot of predicted A*s; a class with mostly 1s at GCSEs will get a lot of predicted Us.

pkj is the predicted grade distribution for the previous years’ classes, based on their GCSEs. You need to know that because, if previous years were predicted to do poorly and did well, then this year might do the same; and vice versa.

The final term, rj, is different: it is not about grades at all, hence the absence of the k. Instead, it is the proportion of pupils in the class whose prior GCSE results are actually available. If you can track down every pupil’s GCSE results, then it is 1; if you cannot track down any, it is 0.
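As a rough illustration of that last term, here is a minimal Python sketch that treats rj as the share of pupils in a class for whom a GCSE record can be found. The function name `match_rate` and the use of `None` for a missing record are assumptions made for this example, not part of Ofqual's published method.

```python
from typing import Optional, Sequence


def match_rate(gcse_records: Sequence[Optional[float]]) -> float:
    """Share of pupils in a class with a GCSE record available.

    Each entry is a pupil's prior-attainment score, or None when no
    record could be found (a hypothetical encoding for this sketch).
    """
    if not gcse_records:
        raise ValueError("class must contain at least one pupil")
    matched = sum(1 for record in gcse_records if record is not None)
    return matched / len(gcse_records)


# Three of four pupils have a record, so the term is 0.75.
print(match_rate([7.0, None, 5.5, 9.0]))
```

A class where every record is found gives 1, and a class where none is found gives 0, matching the two extremes described above.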

Finally, we can put the terms together. First, the equation is in two halves, one multiplied by that rj term, and one multiplied by one minus rj, meaning the higher rj is, the lower 1-rj will be. What that says is: “If we don’t know about this group’s GCSE grades, ignore the right half of this equation, and just base everything on previous years’ A-levels; to the extent that we do know about their GCSE grades, use that information as well.”

The left half, which carries all the weight only if we do not know the GCSE data, is that simple: “Just use the historical A-level results.” And then the right half says: “Use the historical A-level results, but add to them the predictions from this year’s pupils’ GCSE results, then adjust them based on how good the last lot of predictions were.” That means a school that regularly gets good A-level results despite having bad GCSEs will get a boost.
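Written out in symbols, and assuming the two halves combine exactly as described here (this is reconstructed from the description in this article, not quoted from Ofqual's documentation), the equation would read as follows, with Pkj standing for the predicted grade distribution for the school:

```latex
P_{kj} = (1 - r_j)\,C_{kj} + r_j\,\bigl(C_{kj} + q_{kj} - p_{kj}\bigr)
```

Setting rj to 0 leaves only the historical results, Ckj. Setting it to 1 gives Ckj + qkj - pkj. In between, the expression simplifies to Ckj + rj(qkj - pkj): the school's history, shifted by however much this year's class is predicted to differ from its predecessors, scaled by how much GCSE data is available.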

Aggregating all those terms together gives us Pkj, the predicted grades for the school.
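To make the arithmetic concrete, here is a minimal Python sketch of the combination as this article describes it. It assumes each distribution is a dictionary mapping a grade to a share of pupils; the function name `predicted_distribution` and all the numbers are invented for illustration and do not come from Ofqual.

```python
from typing import Dict

Distribution = Dict[str, float]  # grade label -> share of pupils


def predicted_distribution(
    c: Distribution,  # historical A-level distribution (Ckj)
    q: Distribution,  # prediction from this year's pupils' GCSEs (qkj)
    p: Distribution,  # prediction from previous years' GCSEs (pkj)
    r: float,         # share of pupils with GCSE data available (rj)
) -> Distribution:
    """Blend the two halves of the equation, grade by grade."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    result = {}
    for grade in c:
        left = c[grade]                          # history only
        right = c[grade] + q[grade] - p[grade]   # history, adjusted
        result[grade] = (1 - r) * left + r * right
    return result


# Invented example: a school whose current class has stronger GCSEs
# than its predecessors did.
history = {"A": 0.2, "B": 0.5, "C": 0.3}
this_year = {"A": 0.3, "B": 0.5, "C": 0.2}
previous = {"A": 0.1, "B": 0.5, "C": 0.4}

for r in (0.0, 0.5, 1.0):
    blended = predicted_distribution(history, this_year, previous, r)
    print(r, {grade: round(share, 2) for grade, share in blended.items()})
```

With no GCSE data (r = 0) the output is simply the school's history. With full data (r = 1) the share of A grades in this invented example rises from 0.2 to about 0.4, because this year's class is predicted to outperform earlier ones. The sketch does nothing to keep shares between 0 and 1, which a real implementation would have to address.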

Even in this short equation, we can see the seeds of a fiasco: prior attainment based exclusively on GCSE results; historical grades stretching back just three years; and a refusal to allow the actual success of the pupil to overrule the situation.

In a better system, perhaps the rest of the process could have ironed out these flaws, but in reality it made them worse.

The decision to give small classes the ability to receive their teachers’ recommended grades is not in the algorithm but led to a boost for elite private schools.

The choice to take the results of the algorithm and further tweak the grade boundaries to prevent overall grade inflation is not in the algorithm but further depressed grades in the larger classes in favour of the smaller.

And the choice to focus, not on determining individual grades, but on determining a distribution for a class which students were then matched to on the basis of their rank in the class, is not an error in the algorithm but a fundamental misunderstanding of what the goal was.
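The rank-matching step can be sketched in a few lines of Python. This is an illustration only: the function name `assign_by_rank`, the rounding rule and the pupils are all assumptions made for the example, and Ofqual's actual allocation procedure is not described in this article.

```python
from typing import Dict, List, Sequence, Tuple


def assign_by_rank(
    ranked_pupils: Sequence[str],             # best-ranked pupil first
    distribution: List[Tuple[str, float]],    # (grade, share), best grade first
) -> Dict[str, str]:
    """Hand out grades down the rank order until each share is used up."""
    n = len(ranked_pupils)
    grades: Dict[str, str] = {}
    cumulative = 0.0
    start = 0
    for grade, share in distribution:
        cumulative += share
        # Assumed rounding rule: round the cumulative head count.
        end = min(n, round(cumulative * n))
        for pupil in ranked_pupils[start:end]:
            grades[pupil] = grade
        start = max(start, end)
    # Any pupils left over by rounding receive the lowest grade.
    for pupil in ranked_pupils[start:]:
        grades[pupil] = distribution[-1][0]
    return grades


ranking = ["Asha", "Ben", "Cara", "Dev"]
class_distribution = [("A", 0.25), ("B", 0.5), ("C", 0.25)]
print(assign_by_rank(ranking, class_distribution))
# {'Asha': 'A', 'Ben': 'B', 'Cara': 'B', 'Dev': 'C'}
```

Nothing in the sketch looks at an individual pupil's work: once the distribution is fixed, a pupil's grade depends only on where they sit in the ranking.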

This article was amended on 24 August 2020. The terms Ckj, qkj, pkj and rj are distinct, but not unique as described in an earlier version; a reference to pkj was corrected to Pkj; and the description of 1-rj in relation to rj was clarified.

Associated Incidents

Incident 3748 Report
UK Ofqual's Algorithm Disproportionately Provided Lower Grades Than Teachers' Assessments

Ofqual's A-level algorithm: why did it fail to make the grade?
