This book contains six papers, originally presented at the Johns Hopkins University National Symposium on Educational Research, held in Washington, D.C., in October 1978. The authors, researchers who have contributed to the field of criterion-referenced measurement, were “to synthesize the research . . . which has been amassed over the past decade and translate the results into forms that practitioners and researchers can use” (p. ix). In addition, the book as a whole attempts “to clarify terminology and jargon related to criterion-referenced tests . . . and to bring into sharp focus the major issues in criterion-referenced measurement that have been resolved and those that need resolution” (p. ix).
The book closes with the topic of reliability, conceptualized both as decision consistency (for mastery testing) and as an application of generalizability theory (for domain-referenced testing). Michael Subkoviak (29 pages of text, plus tables) discusses the use of the raw agreement index and the corrected-for-chance kappa index, and compares four different approaches to their estimation. Detailed examples are given of the use of the tables (developed by Huynh Huynh) in estimating decision consistency for tests with 5 to 10 items. Subkoviak should be commended for his lucid style of writing, and the various numerical illustrations should prove useful to practitioners concerned with decision consistency.
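The two indices Subkoviak compares can be illustrated with a small sketch. The raw agreement index is the proportion of examinees given the same mastery/nonmastery decision on two test administrations, and kappa corrects that proportion for the agreement expected by chance from the marginal classification rates. The function and the decision vectors below are hypothetical, not taken from the book:

```python
def decision_consistency(form1, form2):
    """Return (p0, kappa) for two mastery/nonmastery decision vectors.

    form1, form2: lists of 0 (nonmastery) / 1 (mastery) decisions for
    the same examinees on two administrations of a mastery test.
    """
    n = len(form1)
    # p0: raw agreement -- proportion classified the same way both times.
    p0 = sum(a == b for a, b in zip(form1, form2)) / n
    # pc: agreement expected by chance, from the marginal mastery rates.
    m1 = sum(form1) / n
    m2 = sum(form2) / n
    pc = m1 * m2 + (1 - m1) * (1 - m2)
    # kappa: agreement corrected for chance.
    kappa = (p0 - pc) / (1 - pc)
    return p0, kappa

# Invented decisions for 10 examinees on two administrations.
form1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
form2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
p0, kappa = decision_consistency(form1, form2)
print(round(p0, 2), round(kappa, 2))  # prints "0.8 0.58"
```

Here eight of ten examinees receive the same decision twice (p0 = .80), but since both forms classify 60% as masters, chance agreement is already .52, so kappa is the more conservative .58.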