Identification of Terminal Competencies in Home Science and their Evaluation by Criterion Referenced Tests: Part-3

Concept of Evaluation

Evaluation is a continuous process and forms an integral part of the total system of education because it is intimately related to educational objectives. Evaluation exercises a great influence on both the pupils' habits of study and the teachers' methods of teaching. Therefore, it helps in measuring educational achievement as well as in improving it. Evaluation is a broad term which includes both measurement and appraisal; it takes into account the behavioral changes in students. Evaluation attempts to measure a comprehensive range of objectives of the modern school curriculum rather than subject matter achievement only. Modern evaluation includes integrating and interpreting various indices of behavior in an educational situation.

Meaning of Terms


    A domain is a segment of knowledge. A behavioural domain is one which can be detected by an observer by virtue of the pupils being able to do something new as a result of learning.

Domain-Referenced Tests

    A random or stratified random sample of items from a domain is called a domain-referenced test.

Domain Specification

     The domains are usually expressed as content specifications. Popham (1978) and Hambleton (1982) suggested that a domain specification might be divided into four parts:

  I. Description

    A short statement of the content or behavior covered by the competency.

  II. Sample Direction and Test Item

    An example of the test directions and a model test item to measure the competency.

  III. Content Limits

    A detailed description of both the content and behaviors measured by the competency, as well as the structure and content of the item pool. Clarity is sometimes enhanced by also specifying areas which are not included in the content domain description.

  IV. Response Limits

    A description of the kind of incorrect answer choices which must be prepared. The structure and the content of the incorrect answers should be stated in as much detail as possible.

Acceptability Index of a Multiple Choice Question

    The acceptability index of a multiple choice question is the probability that a candidate who knows barely enough to pass the class gets the item correct by chance. For example, if a multiple choice question has four options and, in the opinion of the experts, such a candidate should be able to reject one option straight away, the chance of passing the item is 1/(4 - 1) = 0.33. For another item, where such a candidate should be able to reject two options straight away, the acceptability index is 1/(4 - 2) = 0.50. According to Guilbert (1976), the minimum pass level of a test should be equal to the sum of the acceptability indices of all the items.
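
    The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the three-item test are hypothetical, and the sketch assumes that every item has exactly one correct option.

```python
from fractions import Fraction

def acceptability_index(n_options, n_rejectable):
    """Chance that a borderline candidate guesses the item correctly after
    discarding the options the experts judge to be obviously wrong."""
    remaining = n_options - n_rejectable
    if remaining < 1:
        raise ValueError("at least one option (the key) must remain")
    return Fraction(1, remaining)

def minimum_pass_level(items):
    """Sum of the acceptability indices of all items (Guilbert, 1976).
    `items` is a list of (n_options, n_rejectable) pairs."""
    return sum(acceptability_index(n, r) for n, r in items)

# Hypothetical three-item test; each item has four options.
items = [(4, 1), (4, 2), (4, 0)]
mpl = minimum_pass_level(items)
print(float(acceptability_index(4, 1)))  # 1/(4 - 1)
print(float(acceptability_index(4, 2)))  # 1/(4 - 2)
print(float(mpl))                        # minimum pass level in raw marks
```

    Exact fractions are used so that the indices add up without rounding error; the result is converted to a decimal only for display.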

Amplified Objectives

     An amplified objective is an expanded statement of an educational goal which provides boundary specifications regarding testing situations, response alternatives and criteria of correctness. An exemplary amplified objective must possess a thorough description of what situations or stimuli can constitute an item, including the potential content from which items can be generated. This can be accomplished either by constructing content generation rules or by listing entire topics, principles, years or other areas with which items can deal.

Bayesian Model

     The Bayesian approach treats the collateral information available on other students and the beliefs of the investigator as prior information, which is combined with the domain-referenced test results to provide refined estimates of domain scores. This model is considered better than the binomial model because it yields more precise estimates.
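
    One simple way to picture this combination of prior information and test results is a Beta prior with a binomial likelihood. The sketch below rests on that assumption; the source does not say which prior is used, and the function name and numbers are hypothetical.

```python
def posterior_domain_score(correct, n_items, prior_a, prior_b):
    """Posterior mean of the domain score under a Beta(prior_a, prior_b)
    prior and a binomial likelihood (an assumed model, not the source's)."""
    return (prior_a + correct) / (prior_a + prior_b + n_items)

# Hypothetical prior: collateral information worth 10 items at 70% correct.
prior_a, prior_b = 7.0, 3.0

# Hypothetical test result: 4 items correct out of 10.
observed = 4 / 10
refined = posterior_domain_score(4, 10, prior_a, prior_b)

print(observed)  # raw proportion correct on the test
print(refined)   # estimate pulled towards the prior mean of 0.7
```

    The refined estimate lies between the raw test proportion and the prior mean, and the test result carries more weight as the number of items grows.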

Binomial Model

    The binomial model is used in competency test score interpretation for estimating the examinee’s ability to “pass” or “fail” an item. Since its use presupposes a two-category distribution, it is called the binomial model. In CRT it provides information on how the precision of estimation varies as a function of test length.
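
    The link between precision and test length can be illustrated with the standard error of a binomial proportion. The sketch assumes independent items that share the same pass probability; the function name and the domain score of 0.7 are hypothetical.

```python
import math

def binomial_standard_error(domain_score, n_items):
    """Standard error of the observed proportion correct when each of
    n_items is an independent pass/fail trial with probability domain_score."""
    return math.sqrt(domain_score * (1.0 - domain_score) / n_items)

# Hypothetical examinee with a true domain score of 0.7.
for n in (10, 20, 40, 80):
    print(n, round(binomial_standard_error(0.7, n), 3))
```

    Under these assumptions the standard error is halved each time the test length is quadrupled.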


Behavioral objectives and performance goals are similar terms, but goals are sometimes treated as different from objectives. Goals are statements of broad, general outcomes of instruction. The differences between a goal and an objective are as follows:

  1. Goals are broad, encompassing terms, whereas objectives are specific.
  2. Goals, when listed, are usually few in number, whereas objectives are numerous.
  3. Goals are not expressed in behavioral terms, but objectives are always expressed in behavioral terms.
  4. Both are expressed sometimes from the teacher’s point of view and sometimes from the student’s point of view.

Terms used in context of objectives

     Terms such as terminal competencies, instructional objectives, terminal behavior, behavior objectives, and performance objectives are used in the same context.

Item Analysis

     The analysis of student responses to objective test items is a powerful tool for test improvement. Item analysis is a process whereby behavioral statements are analyzed in order to derive classes of items which elicit the various aspects of the behavior class. Item analysis indicates which items may be too easy or too difficult and which may fail for other reasons to discriminate clearly between the better and the poorer examinees. It sometimes suggests why an item has not functioned effectively. Item analysis is an essential technique for judging the quality of a criterion-referenced test. Item analysis is treated under three heads:

  1. Item Selection

  2. Difficulty Index

  3. Validity Index

    I. Item Selection

    The choice of an item depends upon the judgment of competent persons as to its suitability for the purpose of the test. Items are carefully selected from all sources of information judged to be suitable. A linguistics-based scheme means item transformation. This is a scheme for item generation. It uses operational definitions for deriving items. The operations should be capable of being systematically applied to an instructional programme in such a way that all the items of the type derivable by those operations will be produced.

    II. Difficulty Index

    The proportion of examinees passing an item is an index of item difficulty. It expresses difficulty as the percentage of responses which are correct. The higher the numerical value of this index, the easier the item. The difficulty index ranges from one (when everyone passes the item) to zero (when no one passes the item). Ebel (1966) states that “the estimation of item difficulty from the responses of only the upper and lower group, disregarding the middle group, involves some bias”. The proportion is calculated sometimes from the scores of the entire group and sometimes from the top and bottom 27% groups.

    The difficulty index of an item may vary from group to group depending upon the ability level of the group. Difficulty index of an item shows the average difficulty level of the item for the entire group. It does not say anything about the difficulty of an item for an individual.
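
    Both ways of calculating the difficulty index can be sketched as follows. The function names and the data for ten examinees are hypothetical, and ties in total score are broken arbitrarily.

```python
def difficulty_index(responses):
    """Proportion of correct responses; `responses` holds 1 (correct) or 0."""
    return sum(responses) / len(responses)

def difficulty_from_extreme_groups(item_responses, total_scores, fraction=0.27):
    """Difficulty estimated from the top and bottom `fraction` of examinees,
    ranked by total score; the middle group is ignored."""
    k = max(1, round(fraction * len(total_scores)))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    extreme = order[:k] + order[-k:]
    return sum(item_responses[i] for i in extreme) / len(extreme)

# Hypothetical data: one item's responses and the total test scores.
item = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
totals = [18, 7, 15, 20, 9, 12, 17, 5, 14, 11]

print(difficulty_index(item))                        # entire group
print(difficulty_from_extreme_groups(item, totals))  # top and bottom 27%
```

    With these data the two estimates differ (0.7 for the entire group, 0.5 for the extreme groups), which illustrates the bias Ebel describes when the middle group is disregarded.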

III. Validity Index (Discrimination Index)

    Garrett (1981) states that “The validity index of an item (its discriminative power) is determined by the extent to which the given item discriminates among examinees who differ sharply in the function (or functions) measured by the test as a whole”. This shows to what extent the item discriminates good students from poor ones. A number of methods have been devised for determining the discriminative power of an item, but biserial correlation is usually regarded as the standard procedure in item analysis. Biserial r gives the correlation of an item with total scores on the test, or with scores on some independent criterion. The adequacy of other methods is judged by the degree to which they yield results which approximate those obtained by biserial correlation.

     One method of determining validity indices, much favored by test makers, sets up extreme groups in computing the validity of an item. This method is one of the best among several methods. First, the numbers who answer the item correctly in selected upper and lower subgroups are found. Next, the discriminative power of the item, that is, its consistency with the whole score on the test, is gauged by the correlation (r bis) of the item with the whole test.
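
    As an illustration of biserial r, the sketch below applies the usual textbook formula r_bis = (M_p - M_q) / sigma_t x (p q / y) to the whole group, where M_p and M_q are the mean total scores of those who pass and fail the item, sigma_t is the standard deviation of the total scores, p and q are the proportions passing and failing, and y is the ordinate of the normal curve at the point dividing p from q. It does not reproduce the extreme-groups procedure, and the function name and data are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def biserial_r(item_responses, total_scores):
    """Biserial correlation between a pass/fail item and the total score."""
    passed = [t for r, t in zip(item_responses, total_scores) if r == 1]
    failed = [t for r, t in zip(item_responses, total_scores) if r == 0]
    if not passed or not failed:
        raise ValueError("the item must be passed by some examinees and failed by others")
    p = len(passed) / len(total_scores)
    q = 1.0 - p
    std_normal = NormalDist()
    # Ordinate of the standard normal curve at the point that cuts off area p.
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(passed) - mean(failed)) / pstdev(total_scores) * (p * q / y)

# Hypothetical data for ten examinees.
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [18, 7, 15, 20, 13, 12, 17, 5, 14, 11]

r_bis = biserial_r(item, totals)
print(round(r_bis, 3))
```

    A positive value indicates that examinees who pass the item tend to have higher total scores; the population standard deviation is used here, which is an assumption about the intended convention.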

Loss Ratio

    The loss ratio indicates the relative seriousness of erroneously retaining a student in the class and of erroneously advancing a student to the higher class. For example, a loss ratio of 2 in the case of a student means that it is twice as serious to retain him in the class erroneously as to advance him to the higher class erroneously.


     Ebel (1966) states that “Measurement is a process of assigning numbers to the individual members of a set of objects or persons, for the purpose of indicating differences among them in the degree to which they possess the characteristic being measured. If any characteristic of persons or things can be defined clearly enough so that observed differences between them with respect to this characteristic can be consistently verified, the characteristic is measurable. A more refined type of measurement involves comparison of some characteristic of a thing with a pre-established standard scale for measuring that characteristic”.


    A distracter is any of the incorrect answer options in a multiple choice test item. A good distracter is chosen by many of the poorer students but by few of the good students.

Minimum Pass Level

     This is the cut-off point for determining whether a student should be classified as a “pass” or a “fail”. According to Guilbert (1976), “the calculation of the minimum pass level for a test is not valid, unless the number of multiple choice questions is not more than 30”.

Minimum Learning

    This is defined as learning based on the essential competencies expected of most of the children.

Selection ratio

    The selection ratio is the percentage of people who will be selected from among those who are tested. The smaller this percentage, the greater the efficiency of the test. This explains why even relatively crude tests will often furnish very good results when the selection ratio is no greater than 5% or 10%.

Terminal Behavior Objectives

    Terminal behavior objectives are also known as expected outcomes of learning. They are very specific statements which define the abilities that students should have developed by the end of the unit or course. These abilities, performances or student actions are often called student behaviors.

Standard Error of Measurement (SEM)

    The standard error of measurement provides an indication of the absolute accuracy of the test scores. If, for example, the standard error of measurement for a set of scores is 5, it can be inferred that there is a 68.26% chance that a student’s observed score lies within ±5 of his true score and about a 95% chance that it lies within ±10 of his true score.
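
    The bands in this example can be worked out as below. The sketch assumes normally distributed errors of measurement; the function names and the observed score of 60 are hypothetical.

```python
from statistics import NormalDist

def score_band(observed_score, sem, n_sems=1):
    """Interval of +/- n_sems standard errors around an observed score."""
    half_width = n_sems * sem
    return (observed_score - half_width, observed_score + half_width)

def coverage(n_sems):
    """Probability that a normal error falls within +/- n_sems standard errors."""
    z = NormalDist()
    return z.cdf(n_sems) - z.cdf(-n_sems)

# Hypothetical observed score of 60 with SEM = 5.
print(score_band(60, 5), round(coverage(1), 4))     # band of +/- 1 SEM
print(score_band(60, 5, 2), round(coverage(2), 4))  # band of +/- 2 SEM
```

    Under the normality assumption, the band of one SEM covers about 68% of cases and the band of two SEMs slightly more than 95%.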

Standard Error of Substitution (SES)

     Sometimes we may not be interested in knowing the magnitude of the differences between the observed and true scores of students. Instead, we may like to know the magnitude of the differences between the marks the students obtained and the marks they would obtain if they were tested again using a parallel test. In such a situation, error may be defined as the difference between the two scores of a student on two parallel tests. The standard deviation of these errors is called the standard error of substitution (SES).
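
    A minimal sketch of this definition follows. The scores on the two parallel forms are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form is intended.

```python
from statistics import pstdev

def standard_error_of_substitution(scores_form_a, scores_form_b):
    """Standard deviation of the differences between each student's scores
    on two parallel tests."""
    if len(scores_form_a) != len(scores_form_b):
        raise ValueError("both forms must be scored for the same students")
    differences = [a - b for a, b in zip(scores_form_a, scores_form_b)]
    return pstdev(differences)

# Hypothetical marks of five students on two parallel forms.
form_a = [52, 61, 47, 70, 58]
form_b = [55, 58, 50, 66, 59]

ses = standard_error_of_substitution(form_a, form_b)
print(round(ses, 2))
```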

Terminal Competencies

    Terminal competencies are the objectives or competencies expected to be achieved by most of the students at the end of an instructional programme.

Need for the Study

    There has been sufficient work in India on norm-referenced tests, but there is hardly any work on criterion-referenced tests or terminal competencies. The survey of research in education (India), the latest amongst all the published ones, makes the following observation: “This is the first time only few studies are being reported. In these studies the mastery learning strategy is compared with the personalized system of instruction and conventional method of teaching science and social science respectively” (Buch, M.B. et al.).

    Other work on CRT that has come to light is a CRT developed by the NCERT in the subject of science for class VII level.

    Since very few studies seem to have been done in this area in the country, the need for the present study is obvious. It is required on the following grounds:

  1. Terminal competency is a new field of study. The Government of India has developed terminal competencies only at the primary level in language, mathematics and science. Therefore it is important to undertake the study in home science at the intermediate level.
  2. So far there has hardly been any attempt to identify masters and non-masters (of their abilities and skills) in home science. In the absence of such an attempt, some non-masters are promoted to the higher class, with the result that they do not fit into the teaching-learning process.
  3. The suitability of the syllabus in home science prescribed for class XII by the U.P. Board of High School and Intermediate Education has not been studied to an appreciable extent. This study will try to fill the gap.
  4. Identification of competencies and evaluation of these competencies will enable teachers to locate students' weak points and develop remedial teaching programmes which will improve students' performance.
  5. The accountability of the teacher can be ensured by judging the performance of her students against pre-fixed objectives. If all the students have achieved the competencies, the teacher is excellent. The identification and evaluation of terminal competencies can thus be used to assess the accountability of teachers.

Statement of the Problem

    “Identification of terminal competencies in home science at class XII level, and evaluation of achievement of these, by students of Gorakhpur District (U.P.) in relation to several background factors”.


