Test Construction, Item Analysis, Reliability, & Validity Exercise

Background Information

 

You will find the data in a file named PSY621.XLS. To save it to your local machine, right-click the link and choose the "Save Target As" option.

 

You can perform the analyses on the mainframe, or on a PC if you like.  If you perform them on the mainframe, you will need to export and upload the file first. 

These data are from a group of nearly 500 subjects who filled out a 122-item questionnaire, which is available here.  This questionnaire actually combines four questionnaires:

1. Perceptual Aberration (PA): taps body image aberrations and distortions.

2. Magical Ideation (MI): taps "magical thinking" and other beliefs that would not be endorsed by many people in this culture.

3. Impulsive Nonconformity (NC): taps impulsivity with special disregard for social norms.

4. Infrequency (IF): a six-item scale designed to identify careless responders and malingerers.

 

If the item was endorsed "true", the code is 1; if "false", the code is 0.  Missing data (omitted items) are indicated by spaces.  You will need to decide how to handle missing data.

 

Most items are keyed "True"; i.e., endorsing true gives the person one point towards the total score, endorsing false does not.  Some items, however, are keyed "False"; endorsing false gives the person one point towards the total score, endorsing true does not.  The following items are keyed false and therefore must be recoded:  4, 6, 14, 26, 27, 28, 35, 38, 40, 46, 48, 49, 52, 53, 54, 56, 58, 59, 76, 84, 101, 103, 104, 106, 112.
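As one way to carry out the recoding described above, here is a minimal Python sketch. It assumes each subject's responses have already been read into a list of 122 raw cells with item 1 first; that layout and the names FALSE_KEYED, parse_response, score_item, and recode_subject are illustrative assumptions, not part of the data file. Omitted items are kept as None so that the missing-data decision stays with you.

```python
# Items keyed "False", copied from the list in this handout.
FALSE_KEYED = {4, 6, 14, 26, 27, 28, 35, 38, 40, 46, 48, 49, 52, 53, 54,
               56, 58, 59, 76, 84, 101, 103, 104, 106, 112}

def parse_response(raw):
    """Turn a raw cell into 1, 0, or None (None marks an omitted item)."""
    if raw is None:
        return None
    text = str(raw).strip()
    if text == "":
        return None          # spaces in the file mean the item was omitted
    return int(float(text))  # accepts "1", "0", "1.0", 1, 0

def score_item(item_number, response):
    """Return the points earned on one item, reversing false-keyed items."""
    if response is None:
        return None
    return 1 - response if item_number in FALSE_KEYED else response

def recode_subject(raw_row):
    """raw_row: list of raw cells, item 1 first (assumed layout).
    Returns a dict mapping item number to 1, 0, or None."""
    return {i: score_item(i, parse_response(cell))
            for i, cell in enumerate(raw_row, start=1)}
```

After recoding, a 1 always means one point towards the scale total, whichever way the item was keyed.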

 

The following items are on each scale (also indicated on the attached questionnaire):

 

Perceptual Aberration: 2, 3, 5, 7, 9, 13, 16, 18, 22, 23, 26, 29, 31, 34, 37, 39, 41, 44, 47, 48, 52, 55, 59, 60, 66, 70, 71, 77, 94, 96, 98, 102, 108, 109, 118.

 

Magical Ideation: 6, 8, 10, 11, 25, 27, 36, 46, 50, 51, 54, 56, 61, 65, 67, 73, 75, 76, 78, 79, 81, 83, 88, 92, 104, 116, 117, 119, 120, 122.

 

Impulsive Nonconformity: 1, 4, 12, 14, 15, 17, 19, 20, 24, 28, 30, 32, 35, 38, 40, 42, 43, 45, 49, 53, 57, 58, 63, 68, 69, 72, 74, 80, 84, 85, 86, 87, 89, 90, 91, 93, 95, 99, 100, 101, 103, 105, 106, 107, 110, 111, 112, 113, 114, 115, 121.

 

Infrequency: 21, 33, 62, 64, 82, 97.
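The scale memberships listed above can be turned into total scores along the following lines. This is a sketch only: it assumes each subject's responses are held in a dict mapping item number to 1, 0, or None (None for an omitted item) and that false-keyed items have already been recoded. The names SCALES and scale_totals and the max_missing rule are illustrative assumptions; how many omissions to tolerate is your decision.

```python
# Item numbers per scale, copied from the lists in this handout.
SCALES = {
    "PA": [2, 3, 5, 7, 9, 13, 16, 18, 22, 23, 26, 29, 31, 34, 37, 39, 41, 44,
           47, 48, 52, 55, 59, 60, 66, 70, 71, 77, 94, 96, 98, 102, 108, 109,
           118],
    "MI": [6, 8, 10, 11, 25, 27, 36, 46, 50, 51, 54, 56, 61, 65, 67, 73, 75,
           76, 78, 79, 81, 83, 88, 92, 104, 116, 117, 119, 120, 122],
    "NC": [1, 4, 12, 14, 15, 17, 19, 20, 24, 28, 30, 32, 35, 38, 40, 42, 43,
           45, 49, 53, 57, 58, 63, 68, 69, 72, 74, 80, 84, 85, 86, 87, 89, 90,
           91, 93, 95, 99, 100, 101, 103, 105, 106, 107, 110, 111, 112, 113,
           114, 115, 121],
    "IF": [21, 33, 62, 64, 82, 97],
}

def scale_totals(scored, max_missing=0):
    """scored: dict item number -> 1, 0, or None, already recoded for keying.
    Returns a dict scale name -> total score, or None for any scale on which
    the subject omitted more than max_missing items."""
    totals = {}
    for name, items in SCALES.items():
        answers = [scored.get(i) for i in items]
        missing = sum(a is None for a in answers)
        if missing > max_missing:
            totals[name] = None
        else:
            totals[name] = sum(a for a in answers if a is not None)
    return totals
```

A quick check worth running once is that the four lists together cover items 1 through 122 exactly once each.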



Assignment

First, decide which subjects to include and which to exclude, and justify your decision.  Then, using the dataset of included subjects, address the following items.  For each item below, include all relevant printouts; if you perform calculations by hand, include the formula you chose and show your calculations.

 

  1. Compute the total scores and obtain (and print out) the intercorrelation matrix.  This is a single-method multitrait matrix.  What does this tell you about the homogeneity or heterogeneity of the constructs tapped by this 122-item scale?

  2. What are the standard errors of these correlations (assuming that r is an unbiased estimate of ρ, the population correlation)?

  3. Compute and print out the item statistics (p and rit).  Identify poor items on the printout, and indicate in general what constitutes a bad item.  Is rit a Pearson, a biserial, or a point-biserial correlation, and why?

  4. Compute and print out the internal consistency (rtt) of each of the four scales by whatever method you deem appropriate (justify your deeming).

  5. Fred scored 22 on the Perceptual Aberration Scale; Clyde scored 13.  Is the difference between their scores statistically significant at the .05 level?

  6. What do you estimate the reliability of the difference (rDD) between the Perceptual Aberration and Magical Ideation Scales to be?

  7. By hand, recompute the correlation matrix, correcting for attenuation due to the unreliability of each scale.  Now, what would you say about the homogeneity or heterogeneity of the constructs tapped by this 122-item scale?

  8. By hand, compute the estimated reliability of each scale if you doubled the length of the scale by adding items comparable to the existing items in terms of their psychometric properties and content area.

  9. Exclude the items you identified as poor on each of the scales (excluding the Infrequency scale).  Then recompute the internal consistency estimates (rtt) of each of these three scales by whatever method you already deemed appropriate.

  10. Now assume that you added new items of comparable quality to the retained items.  Assume you add as many items to each scale as you excluded in #9 above.  By hand, what would you estimate the internal consistency estimates (rtt) of each of these three new-and-improved scales to be?
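If you want to check your printouts or hand calculations, the quantities asked for above can be sketched in Python as follows. These are common textbook formulas, offered as one reasonable choice and not as the required method: KR-20 for internal consistency, a large-sample approximation for the standard error of r, the standard error of measurement for comparing two people's scores, the equal-variance formula for the reliability of a difference, the correction for attenuation, and the Spearman-Brown formula. All function names are illustrative, and the item-score input is assumed to be one list of 0/1 scores per subject, for a single scale, with no missing values.

```python
from math import sqrt

def pearson(x, y):
    """Pearson r between two equal-length lists; None if either has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def item_stats(rows):
    """Return a list of (p, rit) pairs, one per item.
    rit here is the uncorrected correlation of the item with the scale total."""
    totals = [sum(r) for r in rows]
    stats = []
    for j in range(len(rows[0])):
        item = [r[j] for r in rows]
        stats.append((sum(item) / len(item), pearson(item, totals)))
    return stats

def kr20(rows):
    """KR-20 for dichotomous items, using population (divide-by-N) variances.
    Needs at least two items and some variance in the total scores."""
    n, k = len(rows), len(rows[0])
    totals = [sum(r) for r in rows]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(r[j] for r in rows) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_t)

def se_of_r(r, n):
    """Large-sample standard error of r, with r plugged in for rho."""
    return (1 - r ** 2) / sqrt(n - 1)

def z_for_score_difference(x1, x2, sd, rtt):
    """z for the difference between two people's scores on the same scale."""
    sem = sd * sqrt(1 - rtt)            # standard error of measurement
    return (x1 - x2) / (sem * sqrt(2))  # SE of a difference between two scores

def reliability_of_difference(r_xx, r_yy, r_xy):
    """rDD for X minus Y, assuming both scales have equal variances."""
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)

def disattenuate(r_xy, r_xx, r_yy):
    """Correlation corrected for attenuation due to unreliability."""
    return r_xy / sqrt(r_xx * r_yy)

def spearman_brown(rtt, k):
    """Projected reliability when the scale is lengthened by a factor k."""
    return k * rtt / (1 + (k - 1) * rtt)
```

For a doubled scale, k is 2. When you drop some items and then add the same number of comparable new ones, k is (retained + added) / retained, applied to the reliability of the retained items. A z beyond 1.96 in absolute value corresponds to two-tailed significance at the .05 level.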