Test construction strategies

Test construction strategies are the various ways that items in a psychological measure are created and decided upon. They are most often associated with personality tests but can also be applied to other psychological constructs such as mood or psychopathology. There are three commonly used general strategies: inductive, deductive, and empirical.[1] Scales created today will often incorporate elements of all three methods.

Inductive

Also known as the itemetric or internal consistency method, the inductive method begins by constructing a wide variety of items with little or no relation to an established theory or previous measure. A large number of participants then answer the items, and the responses are analyzed using statistical methods such as exploratory factor analysis or principal component analysis. These methods reveal how the items naturally group together, and researchers label each component of the scale according to the items it contains. The Five Factor Model of personality was developed using this method.[2]
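
As a rough illustration of the analysis step, the sketch below extracts two principal components from an item correlation matrix and assigns each item to the component on which it loads most strongly. Everything in it is an assumption made for illustration: the responses are synthetic (four items driven by one hypothetical latent trait, two items by another), components are found by plain power iteration with deflation rather than a statistics package, and no rotation is applied, which a real exploratory factor analysis would normally include.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Item-by-item correlation matrix; each column holds one item's responses."""
    k = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def top_component(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    vec = [1.0 + 0.1 * i for i in range(len(matrix))]  # arbitrary non-zero start
    for _ in range(iterations):
        product = mat_vec(matrix, vec)
        norm = math.sqrt(sum(p * p for p in product))
        vec = [p / norm for p in product]
    eigenvalue = sum(v * p for v, p in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec


def principal_loadings(matrix, n_components):
    """Unrotated component loadings, returned as loadings[component][item]."""
    k = len(matrix)
    residual = [row[:] for row in matrix]
    loadings = []
    for _ in range(n_components):
        eigenvalue, vec = top_component(residual)
        loadings.append([math.sqrt(eigenvalue) * v for v in vec])
        # Deflate: remove the extracted component before finding the next one.
        residual = [[residual[i][j] - eigenvalue * vec[i] * vec[j]
                     for j in range(k)] for i in range(k)]
    return loadings


def group_items(loadings):
    """Assign each item to the component with the largest absolute loading."""
    n_items = len(loadings[0])
    return [max(range(len(loadings)), key=lambda c: abs(loadings[c][i]))
            for i in range(n_items)]


# Synthetic data (an assumption): items 0-3 reflect trait "A", items 4-5 trait "B".
rng = random.Random(0)
item_spec = [("A", 0.9)] * 4 + [("B", 0.7)] * 2
columns = [[] for _ in item_spec]
for _ in range(500):
    latent = {"A": rng.gauss(0, 1), "B": rng.gauss(0, 1)}
    for idx, (trait, loading) in enumerate(item_spec):
        noise = rng.gauss(0, 1)
        columns[idx].append(loading * latent[trait]
                            + math.sqrt(1 - loading ** 2) * noise)

groups = group_items(principal_loadings(correlation_matrix(columns), 2))
print(groups)
```

The researcher's task at this point would be to read the items in each group and decide what label, if any, the group deserves.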

Advantages of this method include the opportunity to discover previously unidentified or unexpected relationships between items or constructs. The method may also allow for the development of subtle items that prevent test takers from knowing what is being measured, and the resulting scale may represent the actual structure of a construct better than a pre-developed theory.[3] Criticisms include a vulnerability to finding item relationships that do not generalize to a broader population, difficulty identifying what each component measures because of confusing item relationships, and the risk of missing aspects of a construct that the original item pool did not fully address.[4]

Deductive

Also known as the rational or intuitive method, the deductive method begins by developing a theory for the construct of interest, which may be a previously established theory. Items are then written that are believed to measure each facet of the construct. After item creation, initial items are retained or eliminated according to which selection yields the strongest internal validity for each scale.
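
The text does not name the statistic used in this item-selection step. One common way to operationalize it, assumed here purely for illustration, is internal consistency as measured by Cronbach's alpha: alpha is recomputed with each item left out, and an item whose removal raises alpha is a candidate for elimination. The item names and response values below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, aligned so that position r in every
    list belongs to the same respondent.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's scale total
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_deleted(items_by_name):
    """Alpha of the remaining scale after dropping each item in turn."""
    return {
        name: cronbach_alpha(
            [scores for other, scores in items_by_name.items() if other != name]
        )
        for name in items_by_name
    }


# Hypothetical 1-5 ratings from six respondents on a four-item draft scale.
responses = {
    "item1": [1, 2, 3, 4, 5, 5],
    "item2": [2, 2, 3, 4, 4, 5],
    "item3": [1, 3, 3, 5, 4, 5],
    "item4": [4, 1, 5, 2, 3, 1],  # written for the same facet, but behaves differently
}

alpha_before = cronbach_alpha(list(responses.values()))
deleted = alpha_if_deleted(responses)
weakest = max(deleted, key=deleted.get)  # item whose removal helps alpha the most
alpha_after = deleted[weakest]

print(f"alpha with all items: {alpha_before:.2f}")
print(f"drop {weakest}: alpha becomes {alpha_after:.2f}")
```

In practice the decision would not rest on alpha alone, since an item that lowers alpha may still cover a facet that the theory says belongs in the scale.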

Advantages of this method include clearly defined, face-valid questions for each measure. Measures are also more likely to apply across populations. Additionally, the method requires less statistical methodology for initial development and will often outperform other methods while requiring fewer items.[5] However, the construct of interest must be well understood to create a thorough measure, and it may be difficult to prevent or detect faking on the measure.

Empirical

Also known as the external or criterion-group method, empirical test construction attempts to create a measure that differentiates between established groups, for example depressed and non-depressed individuals, or individuals high and low in aggression. The goal of item creation is to find items that the groups of interest will answer differently. Items are traditionally constructed without any expectation of how each group will answer them.[6] The Minnesota Multiphasic Personality Inventory was initially developed using this method.[7]

This method differs from the inductive method primarily in how items are selected. Whereas inductive methods select items based on factor loadings, the empirical method selects items based on validity coefficients, that is, on their ability to accurately predict group membership. However, the empirical method shares many of the strengths and weaknesses of atheoretical item creation with inductive methods, while also having an initial item pool more likely to relate to the topic of interest.[8]
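
A minimal sketch of this selection rule follows, under stated assumptions: the validity coefficient is taken to be the Pearson correlation between a true/false item response and a 0/1 group indicator, the 0.6 cutoff is arbitrary, and the item names and responses are invented for the example. Items are kept on the absolute value of the coefficient, since the method carries no expectation about the direction in which a group will answer.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# 1 = member of the criterion group, 0 = member of the comparison group.
group = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical true/false (1/0) answers from the same eight respondents.
item_pool = {
    "item_a": [1, 1, 1, 0, 0, 0, 0, 1],
    "item_b": [1, 1, 1, 1, 0, 0, 1, 0],
    "item_c": [1, 0, 1, 0, 1, 0, 1, 0],
    "item_d": [0, 0, 0, 1, 1, 1, 1, 1],
}

CUTOFF = 0.6  # arbitrary threshold chosen for this illustration

# Validity coefficient of each item against group membership.
validity = {name: pearson(answers, group) for name, answers in item_pool.items()}

# Keep items that separate the groups in either direction.
selected = [name for name, r in validity.items() if abs(r) >= CUTOFF]

for name, r in validity.items():
    status = "keep" if name in selected else "drop"
    print(f"{name}: r = {r:+.2f} ({status})")
```

With a sample this small the coefficients would be unstable, so a real application would cross-validate the retained items on a new sample before trusting them.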

References

  1. ^ Burisch, Matthias (March 1984). "Approaches to personality inventory construction: A comparison of merits". American Psychologist. 39 (3): 214–227. doi:10.1037/0003-066X.39.3.214.
  2. ^ McCrae, Robert; Oliver John (1992). "An Introduction to the Five-Factor Model and Its Applications". Journal of Personality. 60 (2): 175–215. CiteSeerX 10.1.1.470.4858. doi:10.1111/j.1467-6494.1992.tb00970.x. PMID 1635039. S2CID 10596836.
  3. ^ Smith, Greggory; Sarah Fischer; Suzannah Fister (December 2003). "Incremental Validity Principles in Test Construction". Psychological Assessment. 15 (4): 467–477. doi:10.1037/1040-3590.15.4.467. PMID 14692843.
  4. ^ Ryan, Joseph; Shane Lopez; Scott Sumerall (2001). In William Dorfman, Michel Hersen (eds.). Understanding Psychological Assessment: Perspective on Individual Differences (1 ed.). Springer. pp. 1–15.
  5. ^ Burisch, Matthias (1978). "Construction Strategies for Multiscale Personality Inventories" (PDF). Applied Psychological Measurement. 2 (1): 97–101. doi:10.1177/014662167800200110. S2CID 143727093.
  6. ^ Meehl, Paul (1945). "The dynamics of structured personality tests". Journal of Clinical Psychology. 1 (3): 296–303. doi:10.1002/(SICI)1097-4679(200003)56:3<367::AID-JCLP12>3.0.CO;2-U. PMID 10726672.
  7. ^ Hathaway, S. R.; McKinley, J. C. (1940). "A multiphasic personality schedule (Minnesota): I. Construction of the schedule". Journal of Psychology. 10 (2): 249–254. doi:10.1080/00223980.1940.9917000.
  8. ^ Burisch, Matthias (1986). "Methods of Personality Inventory Development - A Comparative Analysis". In Alois Angleitner, Jerry Wiggins (eds.). Personality Assessment via Questionnaires. Berlin: Springer. pp. 109–120. ISBN 978-3642707537.