There are numerous forms of assessment. Assessment may range from standardised testing to global workplace-based evaluation, and it may also vary between individual, peer or group assessment. We sometimes tend to think that a certain assessment mode is intrinsically better than another. A typical example is the belief that open-ended quiz questions are better than multiple-choice ones because of the so-called cueing effect – the fact that a multiple-choice answer can be recognised among the options rather than spontaneously generated. However, there is considerable literature comparing different assessment formats and modes, and some important lessons can be drawn from it.
The first lesson concerns validity: what the assessment really assesses. The format is of negligible importance; the content of the assessment – what it actually asks the student to do – is of overriding importance. This is not to say that just any format will do for any purpose. In choosing the best mode of assessment it is a good idea to start with the learning outcomes and the content the assessment should cover, and then decide which assessment mode or format is best suited to assess that content and those learning outcomes.
The second lesson is that, since no form of assessment is inherently superior, an ideal assessment should be constructed so that it best fits its purpose. In designing assessment and deciding on the specific mode it is therefore important to be very clear about what you are trying to assess. Every assessment mode has advantages and disadvantages, or even side-effects, so in most situations the choice is a trade-off between them. The most obvious aspects to take into consideration when designing assessment are:
- reliability (reproducibility of the results),
- validity (to what extent does the assessment assess what it purports to assess),
- educational impact (the influence the assessment has on students’ learning behaviour and on teachers’ teaching behaviour),
- cost-effectiveness and efficiency, and finally
- the acceptability to staff and students.
No single assessment will ever be perfect in all these aspects. Since every assessment is therefore a compromise, a combination and variety of assessment modes works best in an assessment program. Having three assessments of the same type in a topic may not be as useful as having a variety of modes. Each assessment mode is likely to be the result of a different compromise and therefore addresses a different aspect of competence. A variety of assessment modes will thus paint a more comprehensive picture of the students’ competence.
Apart from individual assessment, peer and group assessment can have a place in the assessment program. There are some pitfalls though, especially if peer group assessment has summative aspects. Summative peer assessment can easily lead to issues with validity: students may not be willing to ‘fail’ each other, or they may let personal dislikes influence their judgements. Obviously, these issues will have a big impact on the validity of the assessment. But more importantly, it may negatively impact the quality of students’ learning. They may not be willing to ask questions, show gaps in their learning or even share knowledge in collaborative learning. Therefore, for peer assessment to work it should not have a purely summative function, and it will not work in a competitive culture. Instead, when students are used to collaborative learning and can provide each other with constructive feedback, peer assessment can certainly be an important addition to the assessment program.
Group assessments are often not easy to design. In line with the ‘fit for purpose’ principle, the advice would be to only use group assessment if communication, collaboration and/or contribution to group process are important purposes of the assessment. Group assessments which are purely focused on the quality of a joint product or a joint assignment, and where each of the students receives the same grade, may not be the best use of this format. Such group assessments encourage free riding and unequal division of labour – meaning that none of the students will have engaged with the whole task – and are likely to overlook the important aspects of group performance, namely collaboration, communication and contribution.
It would be unfair to expect everybody to have complete expertise in assessment design, so it may be helpful to seek expert input. Not only will this help ensure that the assessment is optimally fit for purpose, but it can also make the process more time-efficient. Ensure your assessments are moderated and, if necessary, seek advice from a colleague, look at the range of resources provided by CILT, or speak to an academic developer.
Professor Lambert Schuwirth and Associate Professor Ingo Koeper
Assessment Policy Working Group members