Our guide to demonstrating equivalence between delivery methods.
The COVID-19 pandemic has forced many certification bodies to rapidly transition from delivering tests in physical centers to delivering them in test takers’ homes using online proctoring.
The speed at which organizations have been able to transition, thereby allowing important assessment programs to continue, has been impressive. But when changing delivery methods, what should be considered in order to document that the tests remain fair, valid, and legally defensible?
This article provides a guide to equivalence reporting, which compares test data across delivery methods. The resulting evidence can be used to reassure candidates, stakeholders, and accrediting bodies that the mode of delivery is not impacting results.
What is equivalence reporting?
Equivalence reporting means analyzing test data to determine whether all candidates receive an equivalent testing experience. This includes determining that they have the same chance of passing, regardless of factors such as the test form they receive, the format of the exam, or the delivery method used. The term ‘form equivalence’ may be more familiar, as it is standard practice that tests with multiple test forms would undergo analysis to ensure that they are psychometrically equivalent.
Rather than test form variation, this article concerns the perceived potential impact of changing the method of test delivery, and what evidence can be developed to prove that the delivery mode does not impact results.
Historically, this kind of equivalence reporting was carried out when testing programs transitioned from paper-based to computer-based testing (CBT). When CBT was first introduced, there was a lot of anxiety about the change, with a subsequent intense period of research that provided evidence of the delivery methods’ comparability. Similarly, we’re now seeing another shift in delivery methods, due to the impact of COVID-19. For many organizations, this has been a rapid and forced shift, making equivalence reporting even more imperative.
Equivalence reporting gathers evidence from test data collected during both the “old” and the “new” methods of delivery. Analyses are then run to demonstrate the lack of any differences, so that the same judgements can be made about candidates’ results regardless of how the test has been delivered.
Unaccredited organizations, or those that do not have a psychometrician on board, may be unaware that they should be carrying out this kind of reporting on delivery methods even when the test content remains unchanged. Accrediting and regulatory bodies have requirements regarding fairness, validity, and consistent exam administration, all of which can be affected by varying delivery methods. Equivalence reporting can be provided by the testing organization to prove that these requirements are still being met.
Why is it important?
Carrying out this kind of data analysis is always best practice when changing anything in your assessment program, but even more so as many organizations undergo a forced change.
Right now, the speed at which online proctoring is being implemented in order to keep assessment programs viable means that some stakeholders may be left feeling unsure, having not been given the usual opportunity to undergo trials and become familiar with the technology over time.
Comparing data from both delivery methods helps demonstrate that the change from one to the other is acceptable. For example, should a candidate suggest that their failure in an exam was due to the method of delivery, the report findings would provide evidence that this was not the case.
‘Best practice in ensuring fairness and validity, which is required by international standards, would be to look at the equivalence between test forms and test delivery formats. In any case where you needed to legally defend the decisions made by your test results, you would need to be able to prove, using statistical evidence, that every candidate was presented with the same level of difficulty and equivalent testing conditions.’ – Amanda Dainis, CEO and Lead Psychometrician, Dainis & Co
What type of data is analyzed?
To produce comprehensive equivalence reporting documentation, data such as reliability coefficients, passing rates, and descriptive statistics are taken into account, analyzing both test-level and item-level data.
Term | Broad Description |
---|---|
Descriptive Statistics | The most basic and common descriptive statistics that can be reported for test forms are the mean and standard deviation. There are many other statistics that are useful (median, confidence intervals, SEM), but these are the minimum expected to be reported across forms, modes, and time. |
Passing Rates | The passing rate is simply the proportion of candidates that pass the test form. This should be consistent across forms, test administration mode, and time. |
Reliability Estimates | The reliability of a test form refers to the internal consistency of the scores. That is, how consistently do the questions measure the same thing? It’s a technical estimation, and there are numerous accepted methods. The most common one is Cronbach’s Coefficient Alpha. |
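As a rough illustration of the statistics in the table, the sketch below computes the mean, standard deviation, and Cronbach’s Coefficient Alpha for each delivery mode. It assumes each candidate’s responses are available as a list of item scores; the function names `describe_scores` and `cronbach_alpha` and all of the data are hypothetical, invented for this example rather than taken from any real program or from Surpass.

```python
from statistics import mean, stdev, variance


def describe_scores(rows):
    """Return (mean, standard deviation) of candidates' total scores.

    `rows` holds one list of item scores per candidate.
    """
    totals = [sum(row) for row in rows]
    return mean(totals), stdev(totals)


def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / total variance)."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two candidates")
    # Variance of each item across candidates.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Variance of candidates' total scores.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical item scores (1 = correct, 0 = incorrect), invented for illustration.
center_rows = [[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
remote_rows = [[1, 1, 0, 1], [1, 1, 1, 1], [0, 1, 0, 0], [1, 0, 1, 1], [0, 0, 0, 0]]

for label, rows in (("center", center_rows), ("remote", remote_rows)):
    avg, sd = describe_scores(rows)
    print(f"{label}: mean={avg:.2f} sd={sd:.2f} alpha={cronbach_alpha(rows):.2f}")
```

With only a handful of candidates these figures would be far too unstable to report; the sample is kept tiny purely so the calculation is easy to follow.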
The greater the volume of data, the better, but that volume may be dependent on time constraints. A psychometrician can help you identify the minimum test volumes required to have enough data to produce stable results.
A notable difference in the passing rates of center-delivered vs. remotely proctored exams would cause the biggest concern, but a psychometrician could drill further down into the data by looking at different demographic groups to try to determine possible causes of any discrepancies.
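One simple way to check whether a gap in passing rates is larger than chance alone would explain is a two-proportion z-test. The sketch below is only an illustration under stated assumptions: the function name `pass_rate_z_test` and the candidate counts are hypothetical, and a psychometrician may well prefer a different method, since a non-significant result on its own does not demonstrate equivalence.

```python
import math


def pass_rate_z_test(passed_a, total_a, passed_b, total_b):
    """Two-proportion z-test on passing rates for two delivery modes.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = passed_a / total_a
    rate_b = passed_b / total_b
    # Pooled passing rate under the assumption that both modes share one rate.
    pooled = (passed_a + passed_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("passing rate is 0% or 100% in both groups")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts, invented for illustration only.
center_rate, remote_rate, z, p = pass_rate_z_test(412, 560, 388, 540)
print(f"center={center_rate:.1%} remote={remote_rate:.1%} z={z:.2f} p={p:.3f}")
```

A small p-value would flag a difference worth investigating further, for example by repeating the comparison within demographic groups.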
The data can be presented in a way that’s easily understandable and digestible to your audience, and is as much about providing reassurance about a new delivery method as it is about being able to defend any appeals around results. By carrying out this kind of reporting, credentialing bodies guard themselves against any perceived negative impact of changing the method of delivery.
As more and more tests are taken remotely, gathering qualitative candidate feedback after taking a test will also be key in helping organizations to get their processes – and their communication with test takers – right.
Surpass allows you to easily add survey questions to the end of the test for this purpose, which can help to underpin the data and improve the overall user experience.
Do you expect major differences between center and remote delivery?
In the majority of cases, it is unlikely that you will see any differences in results between identical test content delivered in a test center and in a remote location, provided the transition has been planned well. However, it is still important to carry out a comparability analysis so any issues can be monitored and rectified, and so that any candidate appeals can be met with clear evidence.
There are of course other factors to consider when switching to remote delivery, which if not handled correctly could have a negative impact on the test taker.
When administering tests remotely, more responsibility falls to the test taker in terms of setting up their device and making sure they understand how to access the test. Low technical skill, discomfort with technology, or simply fear of change could manifest itself as a delivery divide. Conversely, candidates may experience less anxiety being in the comfort of their own homes, and not having to worry about getting to a test center, which they may have never visited before, on time.
We know that communication is also key when it comes to remote delivery. Clear processes, and adequate support for candidates who run into difficulty, both help to ensure that candidates sitting for tests remotely receive the same experience as with center-delivered tests.
Read our case study of early adopters of online proctoring, The Association of Corporate Treasurers
How can psychometricians help?
Looking at statistics across all delivery methods on a regular basis is best practice, and should be ongoing operationally. But when undergoing any change in content or delivery method, data analysis should be proactively undertaken.
Whether you have an in-house psychometrician or utilize the services of a specialist firm such as Surpass Community partners Dainis & Company, they can help you interpret the data and present it in a clear and concise format to stakeholders.
A psychometrician will be able to advise on any potential areas for concern that need to be watched, and can help define a plan and timeline for ensuring the operational validity of your assessment program through regular data audits and reporting.
Conclusion
Whilst a rapid switch to remote delivery has been unexpected, undergoing this forced change has removed the barrier caused by fear and uncertainty around remote proctoring. In many cases we expect it to be a permanent feature for assessment programs, allowing candidates a choice between center-based and remote delivery. Whether permanent or temporary, this is a good time to start thinking about equivalence reporting if it’s not already part of your operational processes.
If you’d like to find out more about the Surpass Online Proctoring Service, or would like the support of a psychometrician to help carry out equivalence reporting, contact the team today.