Ronseal Assessment Part 1

Assessment that ‘does what it says on the tin’

In November 2014, Alice Phillips of the Girls' Schools Association criticised the DfE here for creating “an assessment structure akin to rigour on speed... It seems that if it can be graded and put on a scale, compared, averaged, manipulated and mangled, it must be both good and useful.”  Similarly, Hattie criticises the English education system here: “You go overboard to reinvent assessment... Every 2 or 3 years you reinvent the wheel and start again and all the time you miss what really matters... [You’re] obsessed about measuring students.”

11 Questions to ask about your assessment

1. Are you assessing the right school curriculum?

Tim Oates, speaking here on assessment without levels, believes there has been a “creep in function”; consequently, “Assessment dominates curriculum thinking... We need to put curriculum first.  Assessment should follow.”  Similarly, Daisy Christodoulou argues here for a rigorous and detailed curriculum: “if you have a vague curriculum then the result is not ‘teacher freedom.’  The result is that the syllabus and/or the test become the curriculum, with hugely damaging consequences.”

Carl Hendrick argues here that, “tests are no longer part of a judgement, they now are the judgement.  It’s not so much that the tail is wagging the dog as the tail now is the dog.”

Every young person is entitled to gain mastery of the knowledge necessary to make informed and independent judgements about the world.  Rather than ‘tests worth teaching to’ we need a ‘curriculum worth testing to’ based on ‘the best that has been thought and said’; what Michael Young describes as ‘Powerful Knowledge.’

This is our opportunity to focus on the ‘Big Ideas’. 

Tim Oates suggests a curriculum of “fewer things in greater depth,” and Stephen Tierney reminds us here of the power of the curriculum to create “what we would want an educated person to be.”


2.  And subject curriculum?

How do we assess propositional and procedural subject knowledge?

Meyer and Land, here, divide subject knowledge into:

  • Core concepts: ‘building blocks’ that progress subject understanding 
  • Threshold concepts: that transform understanding, without which the learner cannot progress 

An Open University report, here, describes ‘threshold concepts’ as:

  • Transformative: they shift a learner’s perceptions
  • Irreversible: once learned, they are hard to unlearn 
  • Integrative: they expose inter-relatedness 
  • Bounded: they border with other threshold concepts to define a subject 
  • Troublesome: they appear difficult and unintuitive

How do we assess core and threshold concepts?

Meyer and Land explore the idea of ‘troublesome’ knowledge [or language] that may be:

  • Conceptually difficult
  • Counterintuitive, or seemingly inconsistent or paradoxical
  • Alien
  • Incoherent: including routine and meaningless ‘ritual knowledge’ 
  • Inert: it “sits in the mind’s attic, unpacked only when specifically called for by a quiz or a direct prompt but otherwise gathering dust. (Perkins, 1999 quoted in Meyer and Land)”
  • Tacit: implicit and unexamined

They point out that threshold concepts in particular, “often prove problematic or ‘troublesome’ for learners.”  For example, threshold concepts are often integrative: “integration is troublesome because you need to acquire the bits before you can integrate.”

This impacts on learning: “Difficulty in understanding threshold concepts may leave the learner in a state of liminality... in which understanding approximates to a kind of mimicry or lack of authenticity.” 

How do we assess whether students have mastered ‘troublesome’ knowledge?



3. Are you assessing the different stages of learning?

The process of ‘learning’ to ‘learned’ is staged; therefore different types of assessment are necessary:

  1. Assessment of initial understanding: Do the students GET IT?  In Principles of Instruction, Rosenshine found that “the most successful teachers spent more time... checking for understanding.”
  2. Assessment of task fluency: Can they DO IT? (Consistently, quickly, automatically, accurately.)  In Principles of Instruction, Rosenshine suggests that, “A success rate of 80 per cent shows that students are learning the material, and ... challenged.”
  3. Assessment of process fluency: Can they DO IT DIFFERENTLY?  In Why Don’t Students Like School?, Daniel Willingham points out that only shallow understanding has occurred while “knowledge is tied to the analogy or knowledge that has been provided.”
  4. Assessment of understanding of deep, rather than surface, structure: Can they UNPICK IT?  In Why Don’t Students Like School?, Daniel Willingham points out that, “to see the deep structure, you must understand how all parts of the problem relate to one another.”
  5. Assessment of permanent learning: Can they RECALL IT?  Brown et al. write in Make it Stick that, “to be useful, learning requires memory, so what we’ve learned is still there when we need it.”  
  6. Assessment of synoptic learning: Can they DO IT ANYWHERE?  Brown et al. write in Make it Stick that, “mass practice give[s] rise to feelings of fluency that are taken [incorrectly] to be signs of mastery.”  Students need to be able to apply learning in circumstances that are:
    i. Varied
    ii. Delayed
    iii. Interleaved

The benefit of assessing the stages of learning as you go is that, as Dylan Wiliam explains here, you are “building the quality in.”


4. Are you planning meaningful assessment trajectories?

The Education Data Lab team found evidence here that, “the assumptions of many pupil tracking systems and Ofsted inspectors are probably incorrect.”  They found that only 9% of pupils take the ‘expected’ linear pathway from KS1-4, with assumptions of linear progress being especially weak in secondary schools and for low-attaining students.    

Tim Oates points out that learning is “uneven in pace not always upwards.”  Similarly, cognitive psychologist Robert Siegler argues in Emerging Minds that, “Rather than development being seen as stepping up from Level 1 to Level 2 to Level 3, it is envisioned as a gradual ebbing and flowing of the frequencies of alternative ways of thinking, with new approaches being added and old ones being eliminated as well. To capture this perspective in a visual metaphor, think of a series of overlapping waves, with each wave corresponding to a different rule, strategy, theory, or way of thinking.”

Students with different levels of cognitive ability or prior knowledge may also ‘progress’ differently.

Consequently, the Education Data Lab team suggest that schools monitor whether pupils are making progress within the range of attainment for 60% of pupils with similar prior attainment:  “Pupils making progress in the 20% above or 20% below these ranges could then be more reasonably identified as overperforming or underperforming.”
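The Education Data Lab banding idea above can be sketched in a few lines of code.  This is an illustrative sketch only: the function name, data and percentile mechanics are my own invention, not Education Data Lab's implementation.  It flags a pupil as over- or underperforming only if they fall outside the middle 60% band of peers with similar prior attainment:

```python
# Hypothetical sketch of the Education Data Lab banding idea:
# compare a pupil's score with the middle 60% of peers who had
# similar prior attainment, and flag only the top/bottom 20% tails.
# All names and numbers here are invented for illustration.

def classify_progress(pupil_score, peer_scores):
    """Return 'overperforming', 'within range', or 'underperforming'."""
    peers = sorted(peer_scores)
    n = len(peers)
    lower = peers[int(n * 0.20)]       # roughly the 20th percentile
    upper = peers[int(n * 0.80) - 1]   # roughly the 80th percentile
    if pupil_score > upper:
        return "overperforming"
    if pupil_score < lower:
        return "underperforming"
    return "within range"

# Ten peers with similar KS2 prior attainment (invented scores)
peer_scores = [42, 45, 47, 48, 50, 51, 53, 54, 56, 60]
print(classify_progress(58, peer_scores))  # prints "overperforming"
```

The point of the design is restraint: a pupil sitting anywhere in the broad middle band is simply "within range", so the school avoids labelling normal variation as under- or overperformance.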

Michael Fordham criticises progression models here as “fundamentally flawed” because students change level as they move through new content.  As Harry Fletcher-Wood points out here, “successfully explaining the causes of the First World War [does not] automatically confer... the same ability for the Russian Revolution (and so encourages the prioritisation of flashy turns of phrase above deeper understanding).”

Shaun Allison explains that at Durrington High School, awarded a DfE Innovation Fund grant to develop a method of assessing without levels: “Rather than focusing on a predetermined end-point and how to get there, i.e. an end of key stage target level, we are focusing on [students’]... starting points and then how to move them on – without their progress being ‘capped’ by a target. We do this by scaffolding their learning through four thresholds, towards excellence.  The idea is that all students will aim for excellence.”


5. Do your assessment descriptors define actual learning?

Claims that ‘assessment without levels’ heralds an end to student ‘self-labelling’ are utopian.  As @missdcoxblog writes here: “whether a child thinks they are a ‘4a’ or ‘F’ or an ‘expert’ or a ‘-1’, it is surely still going to end in labelling.”  Tim Oates points out that: “labelling which encourages children to see themselves as poor learners is highly dysfunctional.” 

As David Didau suggests in How to get assessment wrong, “with the freedom to replace National Curriculum Levels with whatever we want, there’s a wonderful opportunity to assess what students can actually do rather than simply slap vague, ill-defined criteria over students’ work and then pluck out arbitrary numbers as a poor proxy for progress.”  He argues here that, “Measurement has become a proxy for learning. If we really value students’ learning we should have a clear map of what they need to know, plot multiple pathways through the curriculum, and then support their progress no matter how long or circuitous. Although this is unlikely to produce easily comprehensible data, it will at least be honest. Assigning numerical values to things we barely understand is inherently dishonest and will always be misunderstood, misapplied and mistaken.”

Michael Fordham here and Chris Hildrew here criticise badly constructed performance descriptors.  Are these a result of poor assessment practice or unworkable principle?  Daisy Christodoulou thinks the latter.  She argues here that, “many of the replacements for national curriculum levels rely on precisely the same kind of vague performance descriptions... For many people, descriptors simply are assessment... Unfortunately... descriptors do not give us a common language but the illusion of a common language.  They can’t be relied on to deliver accuracy or precision about how pupils are doing... a problem associated with all forms of prose descriptors of performance... The words ‘emerging’, ‘expected’ and ‘exceeding’ might seem like they offer clear and precise definitions, but in practice, they won’t.”  Tim Oates also refers here to “the slippery nature of standards.  Even a well-crafted statement of what you need to get an A grade can be loaded with subjectivity.”

To remedy this, Daisy Christodoulou suggests here: “wherever possible, define the criteria through questions, through groups of questions and through question banks.  If you must have criteria, have the question bank sitting behind each criterion.  Instead of having teachers making a judgement about whether a pupil has met each criterion, have pupils answer questions instead.  This is far more accurate, and also provides clarity for lesson and unit planning.”
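One way to picture the "question bank behind each criterion" idea is as a simple data structure.  This sketch is mine, not Christodoulou's: the criteria, questions and 80% threshold are invented for illustration.  The key design point is that meeting a criterion is decided by answers to questions, not by a teacher's holistic judgement against a prose descriptor:

```python
# Illustrative sketch of criteria backed by question banks.
# Criteria names, questions and the pass threshold are invented.

question_banks = {
    "can add two-digit numbers": [
        ("23 + 45", 68),
        ("56 + 37", 93),
        ("18 + 64", 82),
    ],
    "can subtract two-digit numbers": [
        ("81 - 27", 54),
        ("60 - 35", 25),
    ],
}

def meets_criterion(answers, bank, threshold=0.8):
    """A pupil meets the criterion if they answer at least
    `threshold` of the bank's questions correctly."""
    correct = sum(
        1 for (question, expected), given in zip(bank, answers)
        if given == expected
    )
    return correct / len(bank) >= threshold

# Pupil answers two of the three addition questions correctly:
answers = [68, 93, 80]
print(meets_criterion(answers, question_banks["can add two-digit numbers"]))
# prints False: 2/3 correct is below the 0.8 threshold
```

Because the bank sits behind the criterion, two teachers asking "has this pupil met the criterion?" are asking about the same concrete questions, which is exactly the accuracy and shared meaning that prose descriptors fail to deliver.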

The following example of question-based judgements is taken from The Wing to Heaven.

Michael Tidd suggests here that, “a simple number or grade won’t cut it.”  A meaningful assessment system “needs to record exactly what students can and can’t do,” while Phil Stock writes here that, “specific statements of the learning to be mastered organised in a logical sequence are generally more useful [although] in some subjects it is hard to reduce certain aspects of achievement down to a manageable amount of specific statements about learning.”

In Authentic Assessment and Progress: Keeping it Real, Tom Sherrington suggests, “authentic, natural, common-sense mode[s] of assessment that teachers choose with an outcome that fits the intrinsic characteristics of the discipline... [and] data in the rawest possible state, without trying to morph the outcomes into a code where meaning is lost.”  He gives the following examples of authentic assessment:

  • Tests
  • Evaluation of a product against criteria
  • Absolute benchmarks


Ronseal Assessment Part 2 still to come...

Posted on June 12, 2015.