Tag: reliability
-
In Beta podcast: Assessment and learning
http://inbetaphysio.com/2023/06/29/31-assessment-and-learning/ In this conversation, Ben and I discuss the assessment process, linking it to broader themes of learning, curriculum design, and student experience. We talk about the centralisation of assessment and explore the tensions between institutional control and the autonomy of teachers. We discuss student satisfaction and the influence of risk aversion in educational…
-
Link: Introducing Claude 2.1
https://www.anthropic.com/index/claude-2-1 Our latest model, Claude 2.1, is now available over API in our Console and is powering our claude.ai chat experience. Claude 2.1 delivers advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use… We’re doubling the…
-
Article: CORE-GPT: Combining Open Access research and large language models for credible, trustworthy question answering
Pride, D., Cancellieri, M., & Knoth, P. (2023). CORE-GPT: Combining Open Access research and large language models for credible, trustworthy question answering (arXiv:2307.04683). arXiv. http://arxiv.org/abs/2307.04683 In this paper, we present CORE-GPT, a novel question answering platform that combines GPT-based language models and more than 32 million full-text open access scientific articles from CORE. We first…
-
Don’t use assessment design to solve the ChatGPT problem…
Note: These are my first thoughts on this idea, so I may not be articulating this well. Feel free to tell me how wrong I am in the comments 🙂 Don’t use assessment design to solve the ChatGPT problem. Use ChatGPT to solve the assessment design problem. I’m seeing a lot of chatter around the…
-
Assessment and data assemblages: Changing the world simply by ‘being’ in it
We use student achievement in tests and exams to try to make accurate predictions about their future performance, but we know that this practice is neither valid nor reliable. The test environment doesn’t look like the real world environment (so it’s not valid), and the metrics we use to measure test outcomes aren’t reliable because…
-
Comment: There’s a new obstacle to getting a job after college: Getting approved by AI
Companies may not be ready to outsource vetting candidates for C-Suite and executive positions to algorithms, but the stakes are lower for entry-level roles and internships. That means some of today’s college students are effectively the guinea pigs for a largely unproven mechanism for evaluating applicants. Metz, R. (2019). There’s a new obstacle to getting…
-
The Future of Artificial Intelligence Depends on Trust
To open up the AI black box and facilitate trust, companies must develop AI systems that perform reliably — that is, make correct decisions — time after time. The machine-learning models on which the systems are based must also be transparent, explainable, and able to achieve repeatable results. Source: Rao, A. & Cameron, E. (2018).…
-
Separating the Art of Medicine from Artificial Intelligence
Writing a radiology report is an extreme form of data compression: you are converting around 2 megabytes of data into a few bytes, in effect performing lossy compression with a huge compressive ratio. Source: Separating the Art of Medicine from Artificial Intelligence. For me, there were a few useful takeaways from this article. The first is…
-
Digital literacy survey: Outcome of reliability testing
Earlier this year we started the International Ethics Project, a collaboration between physiotherapy departments from several countries who intend to offer an online course in professional ethics to their undergraduate students. You can read more about the project here. In June we started the process of developing a questionnaire that we can use to establish some…
-
SAFRI 2011 (session 2) – day 4
Reliability and validity. Validity: Important for assessment, not only for research. It’s the scores that are valid and reliable, not the instrument. Sometimes the whole is greater than the sum of the parts, e.g. when a student gets all the check marks but doesn’t perform competently overall, e.g. the examiner can tick each competency being…
-
Test-retest reliability analysis
A few thoughts on conducting test-retest reliability analysis on questionnaires, based on my own recent experiences. DO: DO NOT: