Which of the following is a common measure of internal consistency?

Internal consistency refers to the general agreement between multiple items (often Likert scale items) that make up a composite score of a survey measurement of a given construct. This agreement is generally measured by the correlation between items.

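For example, a survey measure of depression may include many questions that each measure various aspects of depression, such as loss of interest in activities, negative mood, weight loss or weight gain, sleep problems, and lethargy.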

Assuming the items are worded appropriately and asked of an appropriate sample, we would expect each of these items to correlate with each of the other items, since they are all indicators of depression (see correlation matrix below).

[Figure: correlation matrix and Cronbach's Alpha for the depression items, high internal consistency example]

To the extent that this is true, internal consistency would be high, giving us confidence that our measure of depression is reliable (see alpha above, explanation of Cronbach's Alpha to come).
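To make the idea of items "correlating with each other" concrete, here is a minimal sketch in Python using NumPy. The response data are made up for illustration (six hypothetical respondents answering five items on a 1 to 5 scale); they are not taken from the figure above.

```python
import numpy as np

# Hypothetical data: 6 respondents (rows) x 5 depression items (columns),
# each item scored from 1 to 5. These numbers are invented for illustration.
responses = np.array([
    [1, 2, 1, 2, 1],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [4, 3, 4, 4, 5],
    [4, 5, 4, 5, 4],
    [5, 5, 5, 4, 5],
])

# rowvar=False tells NumPy that each column (item) is a variable.
corr = np.corrcoef(responses, rowvar=False)

# Every off-diagonal entry is the Pearson correlation between two items.
print(np.round(corr, 2))
```

If the items really are indicators of the same construct, the off-diagonal values in this matrix should all be clearly positive.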

However, if an item is poorly worded or does not belong in the scale at all, the internal consistency of the scale could be threatened. For example, suppose we replaced the question about Lethargy in our measure of depression with the new question (Y1) below:

  • Loss of interest in activities (X1)
  • Negative Mood (X2) 
  • Weight Loss/Weight Gain (X3)
  • Sleep Problems (X4)
  • Number of letters in your last name (Y1)

Internal consistency is likely to be threatened because "Number of letters in your last name" is unlikely to be highly correlated with any of the other four items (see low correlation coefficients circled in image below), since it is not really an indicator of depression. Thus, replacing the "Lethargy" question with the "Number of letters in your last name" question will lower the internal consistency of our Depression scale and, ultimately, the reliability of our measurement (see the explanation of Cronbach's Alpha below).

[Figure: correlation matrix and Cronbach's Alpha for the depression items, low internal consistency example]

Internal consistency is typically measured using Cronbach's Alpha (α). Cronbach's Alpha normally ranges from 0 to 1, with higher values indicating greater internal consistency (and ultimately reliability). Common guidelines for evaluating Cronbach's Alpha are:

  • .00 to .69 = Poor
  • .70 to .79 = Fair 
  • .80 to .89 = Good 
  • .90 to .99 = Excellent/Strong

If you get a value of 1.0, then you have "complete agreement" (i.e. redundancy) in your items, so you likely need to eliminate some. Items that are in perfect agreement with each other do not each contribute uniquely to the measurement of the construct they are intended to measure, so they should not both be included in the scale. Occasionally, you may also see a negative Cronbach's Alpha value, but this usually indicates a coding error (such as a reverse-worded item that was not reverse-scored), too few people in your sample (relative to the number of items in your scale), or really poor internal consistency.
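For readers who like to see the arithmetic, here is a minimal sketch of how Cronbach's Alpha could be computed and labeled with the guidelines above. It uses the common variance-based formula (number of items, item variances, and variance of the total score). The function names `cronbach_alpha` and `label_alpha` are my own for this illustration, and the tiny data set is made up.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the composite score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def label_alpha(alpha):
    """Map alpha onto the rule-of-thumb labels listed above."""
    if alpha >= 0.90:
        return "Excellent/Strong"
    if alpha >= 0.80:
        return "Good"
    if alpha >= 0.70:
        return "Fair"
    return "Poor"

# Made-up example: 3 respondents, 2 items.
example = np.array([[1, 2], [2, 1], [3, 3]])
alpha = cronbach_alpha(example)
print(round(alpha, 3), label_alpha(alpha))
```

Two items that are exact copies of each other would give an alpha of exactly 1.0 with this function, which is the redundancy case described above.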

If Cronbach's Alpha (i.e. internal consistency) is poor for your scale, there are a couple of ways to improve it:

  1. Eliminate items that are poorly correlated with other items in your scale (e.g. the "Number of letters in your last name" item in the previous example)
  2. Add highly reliable items to your scale (i.e. items that correlate with existing items in your scale, but are not redundant with items already in your scale)
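One common way to spot an item worth eliminating is to recompute alpha with each item left out in turn ("alpha if item deleted"). The sketch below simulates the depression example: four items driven by a shared underlying factor (X1 to X4) and one pure-noise item standing in for "Number of letters in your last name" (Y1). The data are simulated under those assumptions, and `alpha_if_item_deleted` is a helper name I made up.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item (column) removed in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

# Simulated data (an assumption, not real survey responses).
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 1))             # shared "depression" factor
related = latent + rng.normal(size=(n, 4))   # X1..X4: factor plus noise
unrelated = rng.normal(size=(n, 1))          # Y1: unrelated to the factor
items = np.hstack([related, unrelated])
names = ["X1", "X2", "X3", "X4", "Y1"]

print(f"alpha with all items: {cronbach_alpha(items):.2f}")
for name, a in zip(names, alpha_if_item_deleted(items)):
    print(f"alpha without {name}: {a:.2f}")
```

In this simulation, dropping Y1 is the only deletion that raises alpha, which is the pattern you would look for when deciding which item to eliminate.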

As always, I hope this is helpful and please let me know if you have questions in the comments! What stats terms do you find confusing?


For example, an English test is divided into vocabulary, spelling, punctuation and grammar. The internal consistency reliability test provides a measure that each of these particular aptitudes is measured correctly and reliably.

One way of testing this is by using a test-retest method, where the same test is administered some time after the initial test and the results are compared.

However, this creates some problems and so many researchers prefer to measure internal consistency by including two versions of the same instrument within the same test. Our example of the English test might include two very similar questions about comma use, two about spelling and so on.

The basic principle is that the student should give the same answer to both - if they do not know how to use commas, they will get both questions wrong. A few nifty statistical manipulations will give the internal consistency reliability and allow the researcher to evaluate the reliability of the test.

There are three main techniques for measuring the internal consistency reliability, depending upon the degree, complexity and scope of the test.

They all check that the results and constructs measured by a test are correct, and the exact type used is dictated by subject, size of the data set and resources.

Split-Halves Test

The split halves test for internal consistency reliability is the easiest type, and involves dividing a test into two halves.

For example, a questionnaire to measure extroversion could be divided into odd and even questions. The results from both halves are statistically analysed, and if there is weak correlation between the two, then there is a reliability problem with the test.

The split halves test gives a value between zero and one, with one meaning a perfect correlation.

The division of the questions into two sets must be random. Split halves testing was a popular way to measure reliability because of its simplicity and speed.

However, in an age where computers can take over the laborious number crunching, scientists tend to use much more powerful tests.
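As a rough illustration of the split halves idea, the sketch below randomly divides the questions into two sets, sums each half per respondent, and correlates the two half scores. It also applies the Spearman-Brown step-up, which is an addition on my part: it estimates the reliability of the full-length test from the half-test correlation. The data are simulated and the function name `split_half_reliability` is hypothetical.

```python
import numpy as np

def split_half_reliability(items, rng):
    """Random split-half correlation and its Spearman-Brown correction."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    order = rng.permutation(k)                       # random division of questions
    half_a = items[:, order[: k // 2]].sum(axis=1)   # score on first half
    half_b = items[:, order[k // 2:]].sum(axis=1)    # score on second half
    r_half = np.corrcoef(half_a, half_b)[0, 1]
    r_full = 2 * r_half / (1 + r_half)               # Spearman-Brown step-up
    return r_half, r_full

# Simulated questionnaire: 300 respondents, 10 questions sharing one factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 1))
items = latent + rng.normal(size=(300, 10))

r_half, r_full = split_half_reliability(items, rng)
print(f"half-test correlation: {r_half:.2f}")
print(f"Spearman-Brown estimate: {r_full:.2f}")
```

A different random split will give a slightly different answer, which is one reason averaging over all possible splits is attractive.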

Kuder-Richardson Test

The Kuder-Richardson test for internal consistency reliability is a more advanced, and slightly more complex, version of the split halves test.

In this version, the test works out the average correlation for all the possible split-half combinations in a test. The Kuder-Richardson test also generates a value between zero and one, with a more accurate result than the split halves test. The weakness of this approach is that the answer for each question must be a simple right or wrong answer, scored zero or one.

For multi-level responses, more sophisticated techniques are needed to measure internal consistency reliability.
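For right/wrong items, the Kuder-Richardson calculation (the KR-20 formula) can be sketched as follows. Here p is the proportion of respondents answering an item correctly and q = 1 - p, so p times q is the variance of that item. The four-respondent data set is made up, and the function name `kr20` is my own.

```python
import numpy as np

def kr20(items):
    """Kuder-Richardson 20 for a respondents-by-items array of 0/1 scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    p = items.mean(axis=0)                      # proportion correct per item
    q = 1 - p
    # ddof=0 keeps the total-score variance on the same footing as p * q.
    total_var = items.sum(axis=1).var(ddof=0)
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Made-up example: 4 respondents, 3 questions (1 = right, 0 = wrong).
answers = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
])
print(kr20(answers))
```

Applied to 0/1 data, the Cronbach's Alpha formula gives the same value, which is why alpha is often described as the generalization of KR-20 to multi-level responses.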

Cronbach's Alpha Test

The Cronbach's Alpha test not only averages the correlation between every possible combination of split halves, but also allows multi-level responses.

For example, a series of questions might ask the subjects to rate their response between one and five. Cronbach's Alpha gives a score of between zero and one, with 0.7 generally accepted as a sign of acceptable reliability.

The test also takes into account both the number of questions and the number of potential responses. A 40-question test with possible ratings of 1 to 5 is seen as having more accuracy than a ten-question test with three possible levels of response.

Of course, even with Cronbach's clever methodology, which makes calculation much simpler than crunching through every possible permutation, this is still a test best left to computers and statistics spreadsheet programmes.
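For reference, the usual textbook form of the calculation is shown below. The symbols are my notation, not the article's: k is the number of questions, \sigma_i^2 is the variance of question i, \sigma_X^2 is the variance of the total score, and \bar{r} is the average correlation between pairs of questions. The second line is the standardized version, which assumes all questions are put on the same scale.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right)
\qquad
\alpha_{\text{standardized}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
```

The standardized form shows why longer tests score higher: with an average correlation of 0.5, ten questions give an alpha of about 0.91, while 40 questions give about 0.98.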

Summary

Internal consistency reliability is a measure of how well a test addresses different constructs and delivers reliable scores. The test-retest method involves administering the same test, after a period of time, and comparing the results.

By contrast, measuring the internal consistency reliability involves measuring two different versions of the same item within the same test.

What is the most common test for internal consistency?

The three most commonly used statistical tests for measuring internal consistency are the Spearman–Brown, the Kuder–Richardson 20, and Cronbach's alpha formulas. Cronbach's alpha is the most frequently used because it is equivalent to the average of all possible split-half values of the test.

Which of the following are measures of the internal consistency of a measure?

Internal consistency is a measure of reliability. It involves the correlation of the different components of a measure.

What are the 3 types of internal reliability?

Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).