Congratulations to all students and their teachers who have recently received the first results for the reformed, big, fat, 9-1 GCSE in maths.

I am sure that all over the country, maths teachers are busy analysing and reflecting on their results. I have been doing exactly the same, looking at the cohort that sat AQA’s examinations and trying to answer the questions that I know teachers will be asking.

The purpose of this blog is to look at the overall outcomes rather than focus on particular questions, topics and assessment objectives. That will come later. So, let’s start with the headline figures as published. The overall outcomes for all GCSE candidates who sat the AQA GCSE can be seen in table 1, below.

Table 1: Overall outcomes (all candidates) GCSE 8300 June 17

Grade           9     8     7     6     5     4     3     2     1
Cum %         3.4   9.2  18.5  29.7  48.4  67.8  81.8  91.4  97.8
% at grade    3.4   5.8   9.3  11.2  18.7  19.4  14.0   9.6   6.4

Makeup of the student population and the effect of age on overall performance

Because of the changes to the way performance tables work and the existence of a final opportunity for older students to sit the outgoing (4365) specification, I expected the overall population to be overwhelmingly made up of 16-year-olds, and indeed 91% of the whole cohort were 16-year-old students. However, the remaining 9% have made a difference.

The 13,400 older students who sat the new GCSE tended to choose the Foundation tier and had a very different results profile to the majority. I am sure there is another blog to be written about this group, comparing them with the 41,000+ post-16 learners who sat the ‘old’ GCSE, but that is for another time.

For now, I will take them out of the results and look at the results of the majority group – the 16-year-olds – see table 2, below.

Table 2: Overall Outcomes (16yo) GCSE 8300 June 17

Grade           9     8     7     6     5     4     3     2     1
Cum %         3.5   9.7  19.7  31.7  51.0  70.4  83.0  91.8  98.0
% at grade    3.5   6.2  10.0  12.0  19.3  19.4  12.6   8.8   6.2

The differences are small but important, particularly at grades 4 and 5. If you compare table 1 with table 2, you’ll see the 16-year-olds’ cumulative results are about 2.6 percentage points higher than the overall results at these two grades.
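For anyone who wants to check the gap for themselves, here is a minimal Python sketch. The figures are copied from the cumulative rows of tables 1 and 2; the variable names are my own and purely illustrative.

```python
# Cumulative percentages copied from tables 1 and 2 (grades 9 down to 1).
grades = [9, 8, 7, 6, 5, 4, 3, 2, 1]
cum_all = [3.4, 9.2, 18.5, 29.7, 48.4, 67.8, 81.8, 91.4, 97.8]
cum_16yo = [3.5, 9.7, 19.7, 31.7, 51.0, 70.4, 83.0, 91.8, 98.0]

# Gap in percentage points at each grade, rounded to one decimal place.
gap = {g: round(b - a, 1) for g, a, b in zip(grades, cum_all, cum_16yo)}

print(gap[5], gap[4])  # 2.6 2.6
```

The gap is only 0.1 to 0.4 points at grades 9, 2 and 1, which is why grades 4 and 5 stand out.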

One concern that I had before the summer was around the proportions at these key grades. We all knew that about 70% would get grade 4 or better to match up with the historical C+ figure. We did not know how many of this 70% would achieve grade 5 or better.

I think it is a positive indicator that very similar percentages of 16-year-olds achieved grade 4 and grade 5 (19.4% and 19.3% respectively) and that there was no sign of large proportions ‘just’ meeting grade 4 and bunching in that grade.

Did teachers make the right decisions about tiers?

On results day, it was no surprise to see the media focussing on the Higher tier grade 4 boundary, criticising it for being low. It was equally unsurprising to see maths teachers on social media pointing out that a low boundary at grade 4 was inevitable as it is the bottom grade at that tier.

However, the setting of this boundary and others has led to teachers questioning whether they made the right decisions about tier choice, so it is worth looking more closely at the tiered outcomes and how they are obtained.

Table 3: Boundaries and outcomes by tier (16yo) for GCSE 8300: Foundation, June 2017 and GCSE 4365: Foundation, June 2016

8300 (2017)
Grade 5 4 3 2 1
Boundary/240 156 124 91 59 27
Boundary (%) 65.0 51.7 37.9 24.6 11.3
Cum % of tier 14.2 40.2 64.9 83.4 96.3
4365 (2016)
 Grade C D E F G
Boundary (%)  66.3 24.6
Cum % of tier 34.2 53.0 67.2 79.2 89.7

Table 4: Boundaries and outcomes by tier (16yo) for GCSE 8300: Higher, June 2017 and GCSE 4365: Higher, June 2016

8300 (2017)
Grade 9 8 7 6 5 4 3
Boundary/240 189 157 125 98 72 46 33
Boundary (%) 78.8 65.4 52.1 40.8 30 19.2 13.8
Cum % of tier 6.8 18.6 37.7 60.5 84.5 97.8 99.5
4365 (2016)
 Grade  A* A B C D
Boundary (%)  71.4  35.4
Cum % of tier 10.8 29.1 60.7 90.9 99.3

So, let’s start with the grade 4 boundaries. Those of you who have followed the Ofqual blogs on standard setting will know that we have to meet the expected overall percentage of 4 and above, about 70%. You will also know that the boundaries at each tier are balanced to ensure they represent the same level of performance on common questions.

These two very sensible and fair constraints lead to the boundary marks of 124 on Foundation and 46 on Higher. It is worth emphasising that there is no expectation of a particular proportion of grades coming from one tier or the other.

However, as I have discussed with teachers many times over the last year or so, there must be sufficient evidence of performance at the Higher grade 4 boundary to be able to say the students at that mark are worth the grade.

Level of demand on Higher papers and setting common grades

By design, there are just over 90 marks across the three Higher tier papers that we judge to be of medium demand – marks that are broadly targeted at grades 4 and 5 and could be common to both tiers. In contrast, there are 120 marks targeted at grades 7, 8 and 9.

So, one way for a student to just get a grade 4 is to achieve about half the medium demand marks. Put like that, the boundary does not seem unreasonable. Of course, the reality is that students pick up a few marks from later in the papers but their average performance on the common questions matches that of the Foundation students at the boundary and that seems like the only fair approach.
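The arithmetic behind "about half" can be sketched in a few lines of Python. The boundary and paper total come from table 4; the figure of 90 medium demand marks is an approximation of "just over 90", so treat the second ratio as indicative only.

```python
TOTAL_MARKS = 240          # three papers, from the "Boundary/240" rows
HIGHER_GRADE_4 = 46        # Higher tier grade 4 boundary (table 4)
MEDIUM_DEMAND_MARKS = 90   # assumption: "just over 90" rounded down to 90

share_of_paper = HIGHER_GRADE_4 / TOTAL_MARKS
share_of_medium = HIGHER_GRADE_4 / MEDIUM_DEMAND_MARKS

print(f"{share_of_paper:.1%} of all marks")              # 19.2% of all marks
print(f"{share_of_medium:.0%} of medium demand marks")   # 51% of medium demand marks
```

Seen against the whole paper the boundary looks low; seen against the marks actually aimed at grades 4 and 5, it is roughly one mark in two.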

It is also worth mentioning that Higher tier grade C boundaries of around 20% were common when we had a three-tier structure and C was the lowest of the four grades on the Higher tier. I recall that the media made a fuss about it then and the solution was to focus more marks on the lower grades, leading inevitably to a significant rise in boundary marks.

I am not suggesting this is a change we should make for this new GCSE, simply pointing out that the boundaries we get are a result of the way we are asked to structure the papers.

I mentioned my concerns around grade 5 earlier and it is worth looking at how this grade was set at each tier. Once the 4 and 7 have been set, the Higher grade 5 is set arithmetically so the boundaries from 4 to 7 are evenly spaced (see table 4). Then tier balancing is used again to find the Foundation mark that represents the same performance on common questions as the Higher tier boundary.

So, the grade 5 on tier F could have been very different and it is either luck or good assessment design (no prizes for guessing my preference) that the grade 5 is in pretty much the same place as it would have been if we had set it by simple arithmetic.
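A quick way to see the spacing is to measure the gaps between neighbouring boundaries. The Python sketch below uses only the published 2017 boundaries from tables 3 and 4; the helper name `gaps` is my own. Note that the 79 marks between Higher grades 4 and 7 do not divide exactly by three, and this post does not cover how the odd mark is allocated, so the sketch simply reports the published gaps.

```python
# Published 2017 boundaries out of 240 (tables 3 and 4).
higher = {7: 125, 6: 98, 5: 72, 4: 46}
foundation = {5: 156, 4: 124, 3: 91, 2: 59, 1: 27}

def gaps(boundaries):
    """Mark difference between each grade and the grade below it."""
    ordered = sorted(boundaries)  # grades in ascending order
    return {hi: boundaries[hi] - boundaries[lo]
            for lo, hi in zip(ordered, ordered[1:])}

print(gaps(higher))      # {5: 26, 6: 26, 7: 27}
print(gaps(foundation))  # {2: 32, 3: 32, 4: 33, 5: 32}
```

The Foundation gap from grade 4 to grade 5 is 32 marks, in line with the 32 or 33 marks between the lower Foundation grades, even though that boundary came from tier balancing rather than arithmetic.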

How this year compared to last year

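Comparing A and C outcomes in 2016 with 7 and 4 outcomes in 2017 (tables 3 and 4) suggests a much better performance this year. On Higher, 37.7% of the tier reached grade 7 or better against 29.1% reaching A or better; on Foundation, 40.2% reached grade 4 or better against 34.2% reaching C.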

The reasons for this are, I guess, two-fold. The AQA entry is much bigger this year and it seems that the growth has led to our cohort being stronger overall by about 2% at 7 and 4. The rest of the difference lies in the shift from Higher to Foundation. This has the effect of making both tiers stronger as students expecting grades 4 and 5 move from the ‘bottom end’ of one tier into the ‘top end’ of the other.
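The tier shift effect can feel counter-intuitive, so here is a toy Python illustration. The scores are invented purely for this example and are not AQA data; they stand in for some common measure of attainment across both tiers.

```python
from statistics import mean

# Hypothetical scores on a common scale, invented for illustration only.
foundation = [20, 30, 40, 50]
higher = [60, 70, 100, 120, 140]

# Move the two weakest Higher students (60 and 70) to Foundation.
movers = [s for s in higher if s <= 70]
new_foundation = foundation + movers
new_higher = [s for s in higher if s > 70]

print(mean(foundation), mean(new_foundation))  # Foundation average rises
print(mean(higher), mean(new_higher))          # Higher average rises too
print(mean(foundation + higher), mean(new_foundation + new_higher))  # unchanged
```

The movers sit below the Higher average but above the Foundation average, so both tier averages go up while the overall average stays exactly where it was.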

If, in future years, this were to reverse, then the profile of each tier would change but overall outcomes would remain the same. My personal view is that it would be a shame if this happened. One laudable aim of the GCSE reforms was for all students to be confident and competent with the mathematics they have learned. This is achieved by great teaching, not by an exam. However, an assessment where students get more right answers than wrong feels like a better servant to good teaching.

Does this mean that we should have artificially raised the bar for grade 4 in the Higher tier? No, because our first responsibility is to be fair to all students.

Should schools rethink their choice of tier?

About 1% of students sitting the Foundation tier gained above 190 marks and they may have achieved grade 6 if they had been prepared for and sat Higher. About 0.5% of students sitting the Higher tier averaged single figures on each paper and may have benefited from sitting the Foundation tier. So, those figures suggest most schools got it about right.

I have heard a lot of teachers talking about a baseline of 20 marks per paper in mocks for Higher tier entry. Hindsight suggests that is not a bad rule of thumb, but, as always, you know your students best and the national picture suggests your judgement has not disadvantaged your students.
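To put that rule of thumb next to this summer's boundaries, here is a small Python sketch. It assumes, purely for comparison, that a mock total can be read against the live 2017 Higher boundaries from table 4; in practice mock papers differ and this is not how a mock would be graded.

```python
PAPERS = 3
BASELINE_PER_PAPER = 20  # the mock-exam rule of thumb quoted above
higher_boundaries = {9: 189, 8: 157, 7: 125, 6: 98, 5: 72, 4: 46, 3: 33}

total = PAPERS * BASELINE_PER_PAPER  # 60 marks out of 240

# Highest grade whose boundary the total reaches, or None if below them all.
grade = max((g for g, b in higher_boundaries.items() if total >= b),
            default=None)

print(total, grade)  # 60 4
```

On that reading, 20 marks per paper sits 14 marks above the grade 4 boundary and 12 below grade 5.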

Next steps

My next blog will look more closely at performance on the three papers and the different question styles within them.

If you are interested in really getting ‘under the bonnet’ of GCSE assessment, I will be hosting a webinar on 20 September and running a session at Mathsconf 13, looking at what we aimed to do with these papers and how well we did. Do sign up for one of those sessions.

I hope this has been useful. Do let me have your thoughts via Twitter @AQAMaths