Step 1: Test purpose

During this stage, we will consider two key questions with you:
- What do we want to measure?
When creating a test, we need to be mindful of what we want to measure. For example, do we want to measure reading skills or a person's attitude towards work? When defining competencies, core objectives and frequently used methods act as a guide. At this point, it helps to consider what you want the reports to say.
- Why do we want to measure?
Ideally, tests would be created as a means of systematically tracking candidates' progress. Results can be used to compare individual candidates or groups of candidates with one another. They can also be compared with the results of people elsewhere, or against a set of objectives. In doing so, you gain information about the results of training provided at individual, group and company levels. You can also define focus points, which can help to actively improve teaching or training.
A test must satisfy a number of requirements:
- Your wishes as the client
- The characteristics required of the candidates
- Fixed requirements, such as reliability and validity
The objectives of the test help to determine the type of items, their level of difficulty, and any variation in difficulty between items.
Here are some important points to bear in mind when designing tests:
- Number of items
If we want our measurements to be reliable, we need a minimum number of test items. Tests can be divided into various sections or tasks. This enables us to make allowances for the candidate's concentration span, whilst still ensuring that we collect sufficient information.
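The link between test length and reliability can be illustrated with the Spearman-Brown prophecy formula, which predicts the reliability of a test after it is lengthened or shortened. This is a minimal sketch; the figures are illustrative, not taken from a real test:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict test reliability after multiplying its length by `length_factor`,
    using the Spearman-Brown prophecy formula."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Illustrative: doubling a test with reliability 0.70 raises it to about 0.82.
print(round(spearman_brown(0.70, 2.0), 2))  # 0.82
```

The formula also works in reverse: a length factor below 1 shows how much reliability is lost when a test is shortened, which is why a minimum number of items is needed.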
- Blueprint
We use blueprints to provide guidance on how many items should be presented in each component or sub-component of the test.
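As a sketch, a blueprint can be represented as a simple mapping from (sub-)components to item counts; the component names and numbers below are hypothetical:

```python
# Hypothetical blueprint: item counts per sub-component of a reading test.
blueprint = {
    "vocabulary": 10,
    "main idea": 8,
    "inference": 7,
}

total_items = sum(blueprint.values())
print(total_items)  # 25

# Show each component's share of the test.
for component, n in blueprint.items():
    print(f"{component}: {n} items ({n / total_items:.0%})")
```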
- Test level
When determining the test level, we consider the various target groups.
- Item type
The choices open to us here include: open-ended or multiple-choice questions, paper-based or computer-based tests, with or without an answer sheet, with or without pictures, and individual or group administration.
Step 3: Assembly

This pre-test data is used for the final item selection.
We use all of this information when assembling a test. A candidate's answer to any single item tells us little about actual competence. After all, luck can sometimes play a major role in testing. What matters is how candidates respond to a series of items. It is only by looking at this series as a whole that you, the client, can obtain reliable information.
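The role of luck can be made concrete. On a single four-option multiple-choice item, a blind guess succeeds 25% of the time, but guessing well across a whole series of items is highly unlikely. A minimal sketch using the binomial distribution; the item counts are illustrative:

```python
from math import comb

def p_at_least(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n independent trials,
    each with success probability p (binomial distribution)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One 4-option item: a guess is correct 25% of the time.
print(p_at_least(1, 1, 0.25))

# Guessing at least 15 of 20 such items correctly is vanishingly unlikely.
print(f"{p_at_least(15, 20, 0.25):.8f}")
```

This is why a single correct answer means little, while a strong score across many items is trustworthy evidence of competence.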
- Item difficulty index
To determine the difficulty index of an item, we look at the percentage of candidates who answered the item correctly. We omit items that are too easy or too difficult. Such items are sometimes saved to be used in a different test.
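As an illustration, the difficulty index of an item (often called the p-value) is simply the proportion of candidates who answered it correctly. The response data and the cut-off bounds below are invented for the example:

```python
# Toy response matrix: rows = candidates, columns = items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
]

n_candidates = len(responses)

# Difficulty index per item: proportion of candidates answering correctly.
p_values = [sum(row[j] for row in responses) / n_candidates
            for j in range(len(responses[0]))]
print(p_values)  # [0.8, 0.6, 0.2, 1.0]

# Keep only items within illustrative bounds (neither too hard nor too easy).
kept = [j for j, p in enumerate(p_values) if 0.3 <= p <= 0.9]
print(kept)  # [0, 1]
```

Here item 2 (p = 0.2) is too difficult and item 3 (p = 1.0) is too easy, so both would be omitted or saved for a different test.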
- Alternative responses
In the case of multiple-choice items, we look at whether the majority of candidates managed to select the right answer. If not, the item will not feature in the final test.
- Item discrimination
Which candidates have provided right or wrong answers to a particular item? Do the most competent candidates select the right answer to relatively simple questions? More competent candidates should have a greater chance of answering an item correctly.
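A common way to check this statistically is the item-total correlation: the correlation between candidates' scores on a single item and their total test scores. Items with a low or negative correlation discriminate poorly between more and less competent candidates. A minimal sketch with made-up scores:

```python
from statistics import mean, pstdev

def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between item scores (0/1) and total test scores."""
    mx, my = mean(item_scores), mean(total_scores)
    cov = mean((x - mx) * (y - my) for x, y in zip(item_scores, total_scores))
    return cov / (pstdev(item_scores) * pstdev(total_scores))

# Made-up data: the three highest-scoring candidates answered the item correctly,
# so the item discriminates well (correlation close to 1).
item = [1, 1, 1, 0, 0]
totals = [9, 8, 7, 5, 4]
print(round(item_total_correlation(item, totals), 2))  # 0.92
```

In practice the item's own score is often removed from the total before correlating, to avoid inflating the statistic; this sketch omits that refinement for brevity.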
- Response field
Items that function well statistically, but which are deemed 'poor items' by experts, are not included in the final test.
A test is assembled from the selected items. This should be done carefully, in order to ensure:
- A variety of content
Utilising different elements from a particular field, in pre-agreed proportions.
- A variety of levels
Using an assortment of easy, average and difficult items means that a distinction can be made between more and less competent candidates.
- Suitability of levels
The average candidate should be able to answer the vast majority of the items correctly.