
Teaching Online Contents

- 1 Interpretation of Data of Criterion Referenced Tests
- 1.1 Chance Statistics for Intermediate Home Science Paper-II
- 1.2 Relationship between the Number of Items Attempted and Scores Obtained (Paper-II)
- 1.3 Item Analysis
- 1.4 Item Analysis Results: Estimating Item Difficulty and Discrimination
- 1.5 Split-Half Reliability
- 1.6 Standard Error of Measurement
- 1.7 Standard Error of Substitution
- 1.8 Binomial Model
- 1.9 Bayesian Model

**Interpretation of Data of Criterion Referenced Tests**

Chance Statistics for Intermediate Home Science Paper-II

Chance statistics for intermediate home science paper-II are shown in Table-13. It may be seen from the Table that the mean chance score for paper-II was 24 and the standard deviation of the chance score was 4.22. The chance score at the 5% level of confidence was 30.96 and at the 1% level of confidence it was 33. The number of girls who scored at the various chance levels is given against home science paper-II. For example, 70 girls scored below the mean chance score, 19 at the mean chance score, 22 at the 5% chance level, 115 between the mean chance score and the 5% confidence level, 25 at the 1% chance level, 34 between the 5% and 1% confidence levels, and 215 above the 1% chance level. The Table reveals that paper-II, too, was not easy for the girls. This paper was designed to measure mastery of the prescribed course.
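The Table's chance figures are consistent with a binomial guessing model. The sketch below assumes 95 four-option items (the number of options per item is an assumption; only the item count appears in the text):

```python
import math

# Chance-score model for a test answered purely by guessing.
# Assumption: 95 four-option items, so probability of a correct guess p = 1/4.
n_items = 95
p_guess = 0.25

mean_chance = n_items * p_guess                            # n * p
sd_chance = math.sqrt(n_items * p_guess * (1 - p_guess))   # sqrt(n * p * q)

# One-tailed normal cut-offs above which a score is unlikely to be chance.
cut_5pct = mean_chance + 1.65 * sd_chance   # 5% level
cut_1pct = mean_chance + 2.33 * sd_chance   # 1% level

print(round(mean_chance, 2), round(sd_chance, 2))
print(round(cut_5pct, 2), round(cut_1pct, 2))
```

This gives a mean of 23.75 and an SD of 4.22; with the mean rounded to 24 as in the text, 24 + 1.65 × 4.22 ≈ 30.96, the 5% value quoted above.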

Relationship between the Number of Items Attempted and Scores Obtained (Paper-II)

A visual representation of the chance scores of home science paper-II is given in figure-8. The entries in the column for class interval 90-95 indicate that all 500 girls attempted all 95 items. Similarly, looking at the rows, it can be seen that of the 500 girls who scored in the range 5-95, the majority obtained scores in the range 30-34 and 2 girls obtained scores in the range 5-9. The other rows and columns, and the lines drawn in figure-8, can be interpreted in the same manner. Around 57% of cases obtained scores below the one percent level; the scores of the remaining 43% of candidates cannot be attributed to chance.

Item Analysis

It seemed necessary to do item analysis of both papers of home science. The concept of item analysis and its need in the development of a test was already described in chapter 4. It is a procedure for estimating how good an item is, and it differentiates a good item from a poor one. Since a test is made up of individual items, its quality depends upon the quality of the items that constitute it. Item analysis is therefore considered a very important step in test development. Two indices are generally calculated for each item: a difficulty index and a validity index (discrimination).

Item Analysis Results: Estimating Item Difficulty and Discrimination

Table-14 and Table-15 give a summary of the item analysis results for home science paper-I and paper-II. The method used to estimate the difficulty and discrimination indices is based on the performance of the upper 27% and lower 27% groups. Item-wise detailed results are given in Appendices A and B. In Tables 14 and 15, the figures in the rows indicate the items which fall within specific discrimination and difficulty ranges. To illustrate, items 12, 27 and 31 of paper-I fall within the discrimination range of .40 and above and the difficulty range of .20-.39, while items 51, 64 and 69 fall within the discrimination range of .00-.19. The other figures of paper-I can be read in the same way. For paper-II, Table-15 shows that items 34, 75 and 85 fall within the discrimination range of .40 and above and the difficulty range of .20-.39, while item 64 falls within the discrimination range of .00-.19 and the difficulty range of .20-.39. Other figures of paper-II can be interpreted in the same manner.
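The upper/lower 27% method described above can be sketched as follows. The group size of 135 (27% of the 500 girls) and the per-item counts are illustrative, not taken from Appendices A and B:

```python
# Difficulty and discrimination indices from upper-27% and lower-27% groups.
# The counts below are hypothetical examples, not actual appendix data.

def item_indices(upper_correct, lower_correct, group_size):
    """upper_correct / lower_correct: number of examinees answering the item
    correctly in the upper-27% and lower-27% groups, each of size group_size."""
    difficulty = (upper_correct + lower_correct) / (2 * group_size)
    discrimination = (upper_correct - lower_correct) / group_size
    return difficulty, discrimination

# Example: 135 examinees per group (27% of 500).
diff, disc = item_indices(upper_correct=108, lower_correct=54, group_size=135)
print(round(diff, 2), round(disc, 2))
```

An item like this (difficulty .60, discrimination .40) would land in the acceptable region of Tables 14 and 15.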

Items which are too hard or too easy may of course be used for some special purposes. Too hard items can be used to discriminate among very able students, and too easy items to discriminate among poor students. Too easy items can also be used as starters so that students gain confidence and are not thrown off at the very beginning of the test.

The summary of item analysis results of home science paper-I shows that not all the items were acceptable. The Table suggests that items falling in the discrimination range marked as poor, and in the difficulty ranges of .00-.19 and .80-1.00, are poor items. A total of 83 items were therefore accepted out of 115; the remaining 32 items were rejected in home science paper-I.

The summary of item analysis results of home science paper-II likewise shows that not all the items were acceptable. The Table suggests that items falling in the discrimination range marked poor are poor items. A total of 19 items were therefore rejected out of 114, and 95 items were accepted in home science paper-II. The poor discrimination of items may be due to the fact that they were too easy or too difficult for the group; such items can be used with advantage in other tests for other purposes.

Split-Half Reliability

Table-16 shows that the correlation between the two halves of home science paper-I was found to be .63, and for home science paper-II .56. This is the reliability of each half-test. For estimating the reliability of the total test, the Spearman-Brown (S.B.) formula was used, giving estimates of .775 for paper-I and .717 for paper-II.
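The Spearman-Brown step-up from the half-test correlation can be checked with a short sketch; the tiny differences from the quoted .775 and .717 presumably reflect rounding of the half-test correlations:

```python
# Spearman-Brown correction: reliability of the full-length test from the
# correlation r between its two halves: r_full = 2r / (1 + r).

def spearman_brown(r_half):
    return 2 * r_half / (1 + r_half)

print(round(spearman_brown(0.63), 3))  # paper-I half correlation
print(round(spearman_brown(0.56), 3))  # paper-II half correlation
```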

A limitation of the coefficient of reliability is that it depends on the range of ability of the group tested. The same test can give a high reliability for a more variable group and a low reliability for a less variable group.

Standard Error of Measurement

Another statistic that indicates the accuracy of scores obtained on a test, and which is somewhat independent of group variability, is the Standard Error of Measurement (SEM). The concept of the standard error of measurement and the formula for calculating it were given in chapter 6. The standard errors of measurement of home science paper-I and paper-II are shown in Table-17. The SEM indicates the error involved in a particular observed score. To illustrate, there are only five chances in one hundred that the observed score departs from the true score by more than 1.96 SEM, and only one chance in one hundred that it departs by more than 2.58 SEM. The figures in Table-17 can be interpreted in the same way.
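As a sketch of how Table-17 style figures are produced: SEM = SD·√(1 − r). The test SD of 10 and the observed score of 75 below are hypothetical, not the actual Table-17 values:

```python
import math

# Standard error of measurement: SEM = SD * sqrt(1 - r), where SD is the
# test's standard deviation and r its reliability.
# The SD of 10 here is illustrative only; actual values are in Table-17.

def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

s = sem(sd=10.0, reliability=0.775)
# 95% band around a hypothetical observed score of 75.
lower, upper = 75 - 1.96 * s, 75 + 1.96 * s
print(round(s, 2), round(lower, 2), round(upper, 2))
```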

Standard Error of Substitution

Sometimes one might be interested in knowing not the magnitude of departure of the observed score from the true score, but the probable magnitude of the difference in the scores of a student if he is examined again using a parallel test. This information is given by the Standard Error of Substitution (SES). The formula used for estimating it was given in chapter 6. The values of SES for home science paper-I and paper-II are given in Table-18.
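Because two fallible scores are being compared, SES = SD·√(2(1 − r)), i.e. √2 times the SEM. A sketch with a hypothetical SD of 10 (actual values are in Table-18):

```python
import math

# Standard error of substitution: SES = SD * sqrt(2 * (1 - r)),
# i.e. sqrt(2) times the SEM, since the difference of two parallel-test
# scores carries the measurement error of both scores.

def ses(sd, reliability):
    return sd * math.sqrt(2 * (1 - reliability))

print(round(ses(sd=10.0, reliability=0.775), 2))
```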

The SES can be interpreted in the same way as the SEM. To illustrate, the difference expected in the scores of a student examined on a parallel test on a different occasion would lie within 1.96 SES 95 times out of 100, and would exceed 2.58 SES only 1 time out of 100. Two approaches have been suggested for estimating the accuracy of the score of a student examined by a CRT. These approaches assume that the items selected for inclusion in the test are random samples from the domains of items they are supposed to represent, and they indicate the percentage of items a student would likely answer correctly if he had to answer the entire domain of items. The two approaches are as follows:

Binomial Model

This approach does not take into account the performance of others on the test; only the performance of the individual is considered. An inference about the domain score of each individual is made in the same way a statistician estimates the composition of a mix of two kinds of coins, given the proportion of each in a random sample.

To provide a ready reference for estimating students' domain scores, Millman (1971) has provided Table-19, showing the percentage of students likely to be misclassified as masters or non-masters.

Table-20 can be interpreted in the following manner. In home science paper-I the number of items is 83 and in paper-II it is 95, and the passing score (i.e. the cut-off score for classification between masters and non-masters) is 58 for paper-I and 66.5 for paper-II. For the 7.4% of students in paper-I whose true ability in the domain is 40%, the chances of being misclassified as masters are nil; for the 5.6% with a true score of 50% they are 2%; for the .2% with a true score of 60% they are 18%; and for the 2.8% with a true score of 65% they are 36%. Similarly, the chances of the 0.6% of students with a true score of 75% being misclassified as non-masters are 20%, and nil for a student with a true score of 90% or above. In other words, in these two papers of home science, students who were classified as non-masters will not have a true score of 90% or more. There is a probability that some students with true scores of 50% to 65% could have been classified as masters, and some with true scores of 75% to 90% could have been classified as non-masters.
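Under the binomial model, the chance that a student with true domain score p is classified as a master is the upper tail P(X ≥ passing score) of a Binomial(n, p) distribution. A sketch for paper-I (n = 83, passing score 58):

```python
import math

# Binomial model: probability that a student whose true domain score is
# p_true reaches the passing score on an n-item test, i.e. P(X >= cutoff)
# with X ~ Binomial(n, p_true).

def prob_classified_master(n, cutoff, p_true):
    return sum(math.comb(n, k) * p_true**k * (1 - p_true)**(n - k)
               for k in range(cutoff, n + 1))

# A student with a true domain score of 40% is virtually never misclassified
# as a master on paper-I, matching the "nil" entries discussed above.
p = prob_classified_master(83, 58, 0.40)
print(p)
```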

The percentages of urban students likely to be misclassified as masters or non-masters are given in Table-21, which can be interpreted in the same way. The chances of the 6.3% of urban students in paper-I whose true ability in the domain is 40% being misclassified as masters are nil, and in paper-II the chances for the 5.3% of students with a true ability of 40% are nil as well. Other figures can be interpreted in the same manner. Similarly, the chances of urban students with a true score of 75% being misclassified as non-masters are 20% in paper-I and also 20% in paper-II, and nil for students with a true score of 90% or above.

The percentages of rural students likely to be misclassified as masters or non-masters are shown in Table-22. The chances of the 9% of rural students in paper-I whose true ability in the domain is 40% being misclassified as masters are nil, and in paper-II the chances for students with a true ability of 40% are also nil. Other figures can be read in the same way. The chances of students with a true score of 90% being misclassified as non-masters are nil.

Bayesian Model

The binomial model considers the scores of each individual in complete disregard of the scores of others on the test. The Bayesian model, on the other hand, considers the performance of other examinees and the belief of the tester as prior information, and it gives a more accurate estimate of the student's true score than the binomial model. To provide some ready information for the users of a CRT, Novick and Lewis (1974), as quoted in Dubey, S. (1991), have provided Table-23. In the Table, A stands for the criterion level, that is, the level of functioning that is sufficient to evidence mastery. B stands for the expected mean level of functioning for the group of examinees being tested, which is estimated prior to observing actual test scores. It may be assumed that in a test where the cut-off score for mastery is 70% and the students are taught with a specific view to ensuring mastery, all the students will have a true level of functioning of 70%. The loss ratio is the ratio of the loss of advancing an undeserving student to the master category to the loss of retaining a deserving student as a non-master.

Table-23 shows that in an 8-item domain-referenced test a passing score of 6 (75%) is recommended when the mastery criterion is 70%, the group's expected mean level of functioning is 70%, and the loss ratio is 1.5; that is, it is 1.5 times as serious an error to retain a deserving student as a non-master as to advance an undeserving one as a master, so such a student should be advanced. The figures for other loss ratios can be interpreted in the same way. In the present case the students who scored 70% or above can safely be advanced to the mastery category.
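A minimal beta-binomial sketch of the Bayesian idea: a Beta prior centred on the expected level of functioning B is updated by the observed score. The Beta(7, 3) prior below is a hypothetical choice, not taken from the Novick and Lewis tables:

```python
# Bayesian (beta-binomial) sketch: combine a prior belief about the group's
# level of functioning with the observed score to get a posterior estimate
# of the student's true domain score. Prior parameters are hypothetical.

def posterior_mean(score, n_items, prior_a, prior_b):
    """Beta(prior_a, prior_b) prior + binomial likelihood -> Beta posterior;
    returns the posterior mean (post_a / (post_a + post_b))."""
    post_a = prior_a + score
    post_b = prior_b + (n_items - score)
    return post_a / (post_a + post_b)

# Prior centred on 70% (mean 7 / (7 + 3)); student answers 6 of 8 items.
m = posterior_mean(score=6, n_items=8, prior_a=7, prior_b=3)
print(round(m, 3))
```

Unlike the binomial model, the estimate is pulled toward the group's expected level, which is what sharpens the master/non-master decision.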
