Categories
assessment curriculum education research teaching

SAFRI 2011 (session 2) – day 4

Reliability and validity

Validity

Important for assessment, not only for research

It’s the scores that are valid and reliable, not the instrument

Sometimes the whole is greater than the sum of the parts, e.g. a student gets all the check marks but doesn’t perform competently overall: the examiner can tick each competency being assessed, but the student doesn’t establish rapport with the patient. This is difficult to address

What does the score mean?

Students are efficient in the use of their time, i.e. they will study what is being assessed, because the inference is that we’re assessing what is important

Validity can be framed as an “argument / defense” proposition

Our Ethics exam has a validity problem. Written tests measure knowledge, not behaviour, e.g. students can know and report exactly what informed consent is and how to go about getting it, but may not pay it any attention in practice. How do we make the Ethics assessment more valid?

“Face” validity doesn’t exist; it’s more accurately termed “content” validity. “Face” validity basically amounts to saying that something looks OK

What are the important things to score? Who determines what is important?

There are some things that standardised patients can’t do well e.g. trauma

Assessment should sample more broadly from a domain. This improves validity, and students also don’t feel like they’ve wasted their time studying things that aren’t assessed. The more assessment items we include, the more valid the results

Scores drop if the timing of the assessment is inappropriate, e.g. too much or too little time → lower scores, as students either rush or try to “fill” the time with something that isn’t appropriate for the assessment

First round scores in OSCEs are often lower than later rounds

Even though the assessment is meant to indicate competence, there’s no way to predict whether practitioners are actually competent in practice

Students really do want to learn!

Reliability

We want to ensure that a student’s observed score is a reasonable reflection of their “true ability”

In reliability assessments, how do you reduce the learning that occurs between assessments?

In OSCEs, use as many cases / stations as you can, and have a different assessor for each station. This is the most effective rating design
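As an aside (my own illustration, not from the session): the usual way to formalise this is classical test theory, where an observed score is treated as true ability plus error, and the Spearman-Brown formula predicts how reliability rises as you add more parallel stations, each with its own independent assessor and case error. The sketch below uses made-up numbers and invented function names, purely to show the idea.

```python
# Illustrative sketch only: classical test theory says observed score = true
# ability + error. With more OSCE stations, each adding independent error,
# the errors average out and the total score tracks true ability more closely.
# The Spearman-Brown formula predicts this; the simulation checks it empirically.

import random
import statistics


def spearman_brown(single_station_reliability: float, n_stations: int) -> float:
    """Predicted reliability of a score summed over n parallel stations."""
    r = single_station_reliability
    return (n_stations * r) / (1 + (n_stations - 1) * r)


def simulated_reliability(n_students: int = 2000, n_stations: int = 1,
                          true_sd: float = 10.0, error_sd: float = 10.0) -> float:
    """Estimate reliability as the correlation between two parallel OSCE forms."""
    form_a, form_b = [], []
    for _ in range(n_students):
        ability = random.gauss(0, true_sd)  # the student's unobservable "true ability"
        # each form is a sum of n stations, each with its own independent error
        form_a.append(sum(ability + random.gauss(0, error_sd) for _ in range(n_stations)))
        form_b.append(sum(ability + random.gauss(0, error_sd) for _ in range(n_stations)))
    return statistics.correlation(form_a, form_b)


if __name__ == "__main__":
    random.seed(1)
    for n in (1, 5, 10, 20):
        print(f"{n:2d} stations: predicted {spearman_brown(0.5, n):.2f}, "
              f"simulated {simulated_reliability(n_stations=n):.2f}")
```

With these made-up numbers a single station is only 50% reliable, but twenty stations with independent assessors push the predicted and simulated reliability above 0.9, which is the rationale for the design above.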

We did a long session on standard setting, which was fascinating, especially when it came to having to defend the cut-scores of exams, i.e. what criteria do we use to say that 50% (or 60 or 70) is the pass mark? What data do we have to defend that standard?

I didn’t even realise that this was something to be considered; it’s good to know that methods exist (e.g. the Angoff method) to use data to substantiate the decisions made about the standards that are set
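To make the Angoff idea concrete, here is a hypothetical sketch (made-up numbers, not data or materials from the course): each panellist estimates, for every item, the probability that a borderline (minimally competent) candidate would get it right, and the cut score is the panel’s average expected total. Because those probabilities drop for harder items, the same procedure naturally gives a lower pass mark on a difficult paper and a higher one on an easier paper.

```python
# Hypothetical Angoff-style calculation (illustrative numbers only).
from statistics import mean

# rows = panellists, columns = items; each value is the estimated probability
# that a borderline candidate answers the item correctly
angoff_ratings = [
    [0.8, 0.6, 0.4, 0.9, 0.5],  # panellist 1
    [0.7, 0.5, 0.5, 0.8, 0.6],  # panellist 2
    [0.9, 0.6, 0.3, 0.9, 0.4],  # panellist 3
]

# each panellist's expected total score for a borderline candidate
panellist_cut_scores = [sum(ratings) for ratings in angoff_ratings]

# the recommended cut score is the average across the panel
cut_score = mean(panellist_cut_scores)
max_score = len(angoff_ratings[0])

print(f"Cut score: {cut_score:.2f} out of {max_score} "
      f"({100 * cut_score / max_score:.0f}%)")
```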

Should students be able to compensate for poor scores in one area with good scores in another? Should they have to pass every section that we identify as being important? If it’s not important, why is it being assessed?

Norm-referenced criteria are not particularly useful for determining competence. Standards should be set according to competence, not according to the performance of others

Standard setting panels shouldn’t give input on the quality of the assessment items

You can use standard setting to lower the pass mark in a difficult assessment, and to raise the pass mark in an easier exam

Alignment of expectations with actual performance

Setting up an OSCE

  • Design
  • Evaluate
  • Logistics

Standardised, compartmentalised (i.e. not holistic), variables removed / controlled, predetermined standards, variety of methods

Competencies broken into components

Is at the “shows how” level of Miller’s pyramid (Miller, 1990, The assessment of clinical skills, Academic Medicine, 65: S63-S67)

Design an OSCE, using the following guidelines:

  • Summative assessment for undergraduate students
  • Communication skill
  • Objective
  • Instructions (student, examiner, standardised patient)
  • Score sheet
  • Equipment list

Criticise the OSCE stations of another group

 

Assessing clinical performance

Looked at using mini-CEX (clinical evaluation exercise)

Useful for formative assessment

Avoid making judgements too soon → your impression may change over time

 

Categories
education students

Assessment in an outcomes based curriculum

I attended a seminar / short course on campus yesterday, presented by Prof. Chrissie Boughey from Rhodes University. She spoke about the role of assessment in curriculum development and the link between teaching and assessing. Here are the notes I took.

Assessment is the most important factor in improving learning because we get back what we test. Therefore assessment is acknowledged as a driver of the quality of learning.

Currently, most assessment tasks encourage the reproduction of content, whereas we should rather be looking for the production of new knowledge (the analyse, evaluate and create levels at the top of Bloom’s taxonomy of cognitive processes).

Practical exercise: Pick a course / module / subject you currently teach (Professional Ethics for Physiotherapists), think about how you assess it (Assignment, Test, Self-study, Guided reflection, Written exam) and finally, what you think you’re assessing (Critical thinking / Analysis around ethical dilemmas in healthcare, Application of theory to clinical practice). I went on to identify the following problems with assessment in the current module:

  • I have difficulty assigning a quantitative grade to what is generally a qualitative concept
  • There is little scope in the current assessment structure for a creative approach

This led to a discussion about formal university structures that determine things like how subjects will be assessed, as well as the regimes of teaching and learning (“we do it this way because this is the way it’s always been done”). Do they remove your autonomy? It made me wonder what our university’s official assessment policy is.

Construct validity: Are we using assessment to assess something other than what we say we’re assessing? If so, what are we actually assessing?

There was also a question about whether or not we could / should assess only what’s been formally covered in class. How do you / should you assess knowledge that is self-taught? We could, for example, measure the process of learning rather than the product. I made the point that in certain areas of what I teach, I no longer assign a grade to an individual piece of work and rather give a mark for the progress that the student has made, based on feedback and group discussion in that area.

Outcomes based assessment / criterion referenced assessment

  1. Uses the principle of ALIGNMENT (aligning learning outcomes, passing criteria, assessment)
  2. Is assessing what students should be able to do
  3. “Design down” is possible when you have standardised exit level outcomes (we do, prescribed by the HPCSA)
  4. The actual criteria are able to be observed and are not a guess at a mental process, “this is what I need to see in order to know that the student can do it”
  5. Choosing the assessment tasks answers the question “How will I provide opportunities for students to demonstrate what I need to see?” When the task, rather than the outcome, is the starting point, it knocks everything else out of alignment
  6. You need space for students / teachers to engage with the course content and to negotiate meaning or understanding of the course requirements, “Where can they demonstrate competence?”

Criteria are negotiable and form the basis of assessment. They should be public, which makes educators accountable.

When designing outcomes, the process should be fluid and dynamic.

Had an interesting conversation about the privileged place of writing in assessment. What about other expressions of competence? Since speech is the primary form of communication (we learn to speak before we learn to write), we find it easier to convey ideas through conversation, as it includes other cues that we use to construct meaning. Writing is a more difficult form because we lack visual (and other) cues. Drafting is one way that constructing meaning through writing could be made easier. The other point I thought was interesting was that academic writing is communal (drafting, editors and reviewers all provide a feedback mechanism that isn’t as fluid as speech, but is helpful nonetheless), yet we often don’t allow students to write communally.

Outcomes based assessment focusses on providing students with multiple opportunities to practice what they need to do, and the provision of feedback on that practice (formative). Eventually, students must demonstrate achievement (summative).

We should only assign marks when we evaluate performance against the course outcomes.

Finally, in thinking about the written exam as a form of assessment, we identified these characteristics:

  • It is isolated and individual
  • There is a time constraint
  • There is pressure to pass or fail

None of these characteristics are present in general physiotherapy practice. We can always ask a colleague / go to the literature for assistance. There is no constraint to have the patient fully rehabilitated by any set time, and there are no pass or fail criteria.

If assessment is a method we use to determine competence to perform a given task, and the way we assess isn’t related to the task physio students will one day perform, are we assessing them appropriately?

Note: the practical outcomes of this session will include the following:

  • Changing the final assessment of the Ethics module from a written exam to a portfolio presentation
  • Rewriting the learning outcomes of the module descriptors at this year’s planning meeting
  • Evaluating the criteria I use to mark my assignments to better reflect the module outcomes