Center for Assessment & Improvement of Learning
About the CAT
Assess Important Skills
The CAT instrument is a unique tool designed to assess and promote the improvement of critical thinking and real-world problem solving skills. The instrument is the product of extensive development, testing, and refinement with a broad range of institutions, faculty, and students across the country. The National Science Foundation has provided support for many of these activities.
The CAT instrument is designed to assess a broad range of skills that faculty across the country feel are important components of critical thinking and real-world problem solving. The test was designed to be interesting and engaging for students. All of the questions are derived from real-world situations. Most of the questions require short-answer essay responses, and a detailed scoring guide helps ensure good scoring reliability.
Engage Faculty in Improvement Efforts
The CAT instrument is scored by the institution's own faculty using the detailed scoring guide. Training is provided to prepare institutions for this activity. During the scoring process, faculty are able to see their students' weaknesses and understand areas that need improvement. Faculty are encouraged to use the CAT instrument as a model for developing authentic assessments and learning activities in their own disciplines that improve students' critical thinking and real-world problem solving skills. These features help close the loop in assessment and quality improvement.
Technical Information
The CAT instrument was designed to measure those components of critical thinking and problem solving that faculty across disciplines think are most important. The graph below shows the percentage of faculty who think each question is a valid measure of critical thinking. These evaluations come from faculty in a wide variety of disciplines at six institutions involved in a recent NSF project to evaluate and refine the instrument.
Criterion validity for a test of this type is difficult to establish, since there are no clearly accepted measures that could be used as a standard for comparison. Since the CAT instrument is designed to assess a broad range of skills associated with critical thinking, we looked for reasonable but moderate correlations with other (narrower) measures of critical thinking and academic performance.
The relationship between student responses on the National Survey of Student Engagement (NSSE) and performance on the CAT instrument has also been examined. Five items on the NSSE were significant predictors of performance on the CAT instrument (multiple R = .49, p < .01). The negative relationship between CAT performance and the extent to which students felt that their college courses emphasized rote retention is particularly important and supports both the criterion validity and the construct validity of the CAT instrument.
The CAT instrument can be used in a pre-test/post-test design to evaluate the effects of a single course or to evaluate the effects of many college experiences (value-added). Test-retest reliability of CAT version 4.0 was > 0.80.
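As an illustration of the test-retest figure above, the sketch below computes a Pearson correlation between two administrations of a test to the same students. It assumes that test-retest reliability is estimated this way, which the text above does not state. The function name `pearson_r` and the sample scores are hypothetical and are not CAT data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores for the same four students on two occasions.
first_administration = [18, 22, 25, 30]
second_administration = [19, 21, 27, 29]
print(round(pearson_r(first_administration, second_administration), 2))
```

A value near 1 would indicate that students keep roughly the same rank order across the two administrations.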
Since this instrument involves mostly short-answer essay questions, the reliability of scoring is of great importance. Each question is scored by a minimum of two scorers, and disagreements are resolved by a third scorer. Refinements in the test and the scoring guide have yielded a scoring reliability of 0.92 between the first and second scorer.
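The two-scorer procedure described above can be sketched as follows. The text does not say how the third scorer settles a disagreement, so this sketch assumes the simplest rule: when the first two scores differ, the third scorer's score is taken as final. The function name `resolve_score` is hypothetical.

```python
def resolve_score(first, second, third=None):
    """Return the final score for one question on one test.

    Assumption (not stated in the source): if the first two scorers
    disagree, the third scorer's score is used as the final score.
    """
    if first == second:
        # The first two scorers agree, so no third scorer is needed.
        return first
    if third is None:
        raise ValueError("scorers disagree; a third score is required")
    return third


print(resolve_score(2, 2))      # agreement: no third scorer needed
print(resolve_score(1, 3, 2))   # disagreement: third scorer decides
```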
Most of the questions on the CAT instrument are designed to assess more than one component of critical thinking. The internal consistency of questions is reasonably good, α = 0.70.
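For readers unfamiliar with the α statistic, the sketch below computes Cronbach's alpha using the standard formula α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of questions. Whether the reported value was computed in exactly this way is an assumption. The function name `cronbach_alpha` and the sample scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per test taker, each row holding
    that person's score on every question (same order in every row).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 questions and 2 test takers")
    # Variance of each question across test takers.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each test taker's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 4 test takers, 3 questions.
sample = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 2],
    [0, 1, 1],
]
print(round(cronbach_alpha(sample), 2))
```

Alpha approaches 1 when questions rise and fall together across test takers. A test whose questions deliberately tap several different skills, as described above, would be expected to show a more moderate value.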
The cultural fairness of the test has been evaluated in two ways. A multiple regression analysis of CAT performance revealed that once the effects of entering SAT score, GPA, and whether English was the primary language were taken into account, neither gender, race, nor ethnic background was a significant predictor of overall CAT performance. A cultural differential item functioning (DIF) analysis was also performed to examine question bias. The review of DIF results did not reveal any items with prevalent cultural bias.
Performance on the CAT instrument reveals neither floor effects nor ceiling effects for any of the participants tested so far. Test takers have included all levels of 4-year undergraduates and community college students. The sensitivity of the test is also sufficient to reveal differences between freshmen and seniors and to reveal the effects of a single course that emphasizes critical thinking.
Skills Assessed by the CAT Instrument
- Separate factual information from inferences.
- Interpret numerical relationships in graphs.
- Understand the limitations of correlational data.
- Evaluate evidence and identify inappropriate conclusions.
- Identify alternative interpretations for data or observations.
- Identify new information that might support or contradict a hypothesis.
- Explain how new information can change a problem.
Learning and Problem Solving
- Separate relevant from irrelevant information.
- Integrate information to solve problems.
- Learn and apply new information.
- Use mathematical skills to solve real-world problems.
- Communicate ideas effectively.
The CAT instrument was developed and refined with extensive faculty input from a broad range of disciplines across a diverse group of institutions. The instrument has also been refined with input from learning sciences experts, external evaluators, and extensive statistical analyses.