What type of correlation indicates that there is no relationship between two variables?

Correlation describes the relationship between variables. It can be described as either strong or weak, and as either positive or negative.

Note: Correlation does not imply causation.

Positive Linear Correlation

There is a positive linear correlation when the variable on the $y$-axis increases as the variable on the $x$-axis increases. This is shown by an upward-sloping straight regression line.

[Figure: Positive Correlation]

Negative Linear Correlation

There is a negative linear correlation when one variable increases as the other decreases. This is shown by a downward-sloping straight regression line.

[Figure: Negative Correlation]

Non-linear Correlation (also known as curvilinear correlation)

There is a non-linear correlation when there is a relationship between variables but the relationship is not linear (straight).

[Figure: Non-linear Correlation]

No Correlation

There is no correlation when there is no pattern that can be detected between the variables.

[Figure: No Correlation]

Worked Example

The local ice-cream shop has kept track of how much ice-cream it sells and the maximum temperature on each day. The data obtained over the last 15 days are as follows:

Temperature (°C) | Ice-cream Sales (£)
12.5 | 211
15.8 | 230
22.1 | 359
18.9 | 284
17.7 | 254
19.3 | 287
15.3 | 248
19.2 | 303
13.4 | 235
14.1 | 209
16.7 | 267
18.6 | 295
11.9 | 199
18.4 | 274
18.9 | 279

Determine the type of correlation between ice-cream sales and the maximum temperature of the day.

Solution

First, draw a scatter diagram of the given data.

[Figure: Ice-cream Sales vs Maximum Temperature]

This shows that there is a strong positive linear correlation between ice-cream sales and maximum temperature. However, it is not always this easy to tell just by looking at the scatter graph; instead, we quantify the relationship using a numeric value known as the correlation coefficient.
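To reproduce the scatter diagram yourself, the following is a minimal Python sketch (not part of the original lesson; it assumes matplotlib is available, and the variable names are my own):

```python
# Minimal sketch: reproduce the scatter diagram for the worked example.
import matplotlib.pyplot as plt

temperature = [12.5, 15.8, 22.1, 18.9, 17.7, 19.3, 15.3, 19.2,
               13.4, 14.1, 16.7, 18.6, 11.9, 18.4, 18.9]   # degrees C
sales = [211, 230, 359, 284, 254, 287, 248, 303,
         235, 209, 267, 295, 199, 274, 279]                # pounds

plt.scatter(temperature, sales)
plt.xlabel("Maximum temperature (°C)")
plt.ylabel("Ice-cream sales (£)")
plt.title("Ice-cream Sales vs Maximum Temperature")
plt.show()
```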

Examples: one variable might be the number of hunters in a region and the other variable could be the deer population. Perhaps as the number of hunters increases, the deer population decreases. This is an example of a negative correlation: as one variable increases, the other decreases. A positive correlation is where the two variables react in the same way, increasing or decreasing together. Temperature in Celsius and Fahrenheit have a positive correlation.

How can you tell if there is a correlation? By looking at the graph, you can tell whether there is a correlation by how closely the data resemble a line. If the points are scattered about, then there may be no correlation. If the points would closely fit a quadratic or exponential equation, etc., then they have a non-linear correlation. In this course we will restrict ourselves to linear correlations and hence linear regression. When the data are almost linear, they can be enclosed by an ellipse; the length of the major axis of the ellipse relative to the length of the minor axis is an indication of the degree of correlation.

How can you tell the type of correlation by inspection?
If the graph of the variables resembles a line with positive slope, then there is a positive correlation (y increases as x increases). If the slope of the line is negative, then there is a negative correlation (as x increases, y decreases).

An important aspect of a correlation is how strong it is. The strength of a correlation is measured by the correlation coefficient r. Another name for r is the Pearson product moment correlation coefficient, in honor of Karl Pearson, who developed it about 1900. There are at least three different formulae in common use for calculating this number, and these formulae represent somewhat different approaches to the problem. However, the same value of r is obtained from any of them. First we give the raw score formula. As usual, n is the number of ordered pairs in our sample. It is also important to recognize the difference between the sum of the squares and the square of the sum!

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\;\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$
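As a concrete illustration, here is a minimal Python sketch (my own, not from the lesson) that applies the raw score formula to the ice-cream data from the worked example above:

```python
# Raw score formula for the Pearson correlation coefficient,
# applied to the ice-cream worked example.
from math import sqrt

x = [12.5, 15.8, 22.1, 18.9, 17.7, 19.3, 15.3, 19.2,
     13.4, 14.1, 16.7, 18.6, 11.9, 18.4, 18.9]   # temperature (°C)
y = [211, 230, 359, 284, 254, 287, 248, 303,
     235, 209, 267, 295, 199, 274, 279]          # sales (£)

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2))
print(round(r, 3))   # about 0.95: a strong positive linear correlation
```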

Next we present the deviation score formula. This formula is closer to the developmental history since it gives the average cross-product of the standard scores of the two variables, but in a computationally easier format.
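A standard way to write this formula, assuming x and y here denote deviations of each observation from its respective mean (as the note below indicates), is:

$$ r = \frac{\sum xy}{\sqrt{\left(\sum x^2\right)\left(\sum y^2\right)}} $$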

We need to make some notes regarding notation since the x and y variables in the formula above have been transformed from the original variables by subtracting their means.

Lastly we present the covariance formula, which is yet another approach. Covariances are commonly given between two variables and this is one reason why. (It should be noted that the size of the covariance is dependent on the units of measurement used for each variable. However, the correlation coefficient is not.)
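A common statement of this form, assuming cov(x, y) is the sample covariance and s_x, s_y are the sample standard deviations, is:

$$ r = \frac{\operatorname{cov}(x,y)}{s_x\, s_y}, \qquad \operatorname{cov}(x,y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} $$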

r is often denoted $r_{xy}$ to emphasize the two variables under consideration. For samples, the correlation coefficient is represented by r, while the correlation coefficient for populations is denoted by the Greek letter rho (which can look like a p). Be aware that the Spearman rho correlation coefficient also uses the Greek letter rho, but generally applies to samples and the data are rankings (ordinal data).

The closer r is to +1, the stronger the positive correlation is. The closer r is to -1, the stronger the negative correlation is. If |r| = 1 exactly, the two variables are perfectly correlated! Temperature in Celsius and Fahrenheit are perfectly correlated.

Formal hypothesis testing can be applied to r to determine how significant a result is; that is the subject of Hinkle chapter 17 and of lesson 12. The Student t distribution with n-2 degrees of freedom is used.
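The test statistic usually used for this purpose (stated here as a standard result rather than one quoted from Hinkle) is:

$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad df = n - 2. $$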

Remember, correlation does not imply causation.

A value of zero for r does not mean that there is no relationship between the variables; there could be a nonlinear correlation. Confounding variables might also be involved. Suppose you discover that miners have a higher than average rate of lung cancer. You might be tempted to conclude immediately that their occupation is the cause, whereas perhaps the region has an abundance of radioactive radon gas leaking from subterranean regions and all people in that area are affected. Or perhaps they are heavy smokers....

r² is frequently used and is called the coefficient of determination. It is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x. This will be discussed further in lesson 6 after least squares is introduced.

Correlation coefficients can be described qualitatively according to their magnitude, with the corresponding r² ranges following directly:

Magnitude of r | Interpretation | Corresponding r²
0.9 – 1.0 | very highly correlated | 0.81 – 1.00
0.7 – 0.9 | highly correlated | 0.49 – 0.81
0.5 – 0.7 | moderately correlated | 0.25 – 0.49
0.3 – 0.5 | low correlation | 0.09 – 0.25
0.0 – 0.3 | little if any (linear) correlation | 0.00 – 0.09
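As a purely illustrative sketch (the function name is my own; the thresholds simply restate the table above):

```python
def describe_correlation(r: float) -> str:
    """Map a correlation coefficient to the qualitative labels in the table above."""
    m = abs(r)
    if m >= 0.9:
        return "very highly correlated"
    if m >= 0.7:
        return "highly correlated"
    if m >= 0.5:
        return "moderately correlated"
    if m >= 0.3:
        return "low correlation"
    return "little if any (linear) correlation"

print(describe_correlation(0.843))   # highly correlated
```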

It is often the case that the data for which we wish to measure the correlation are not at the interval or ratio level of measurement. The Spearman rho correlation coefficient was developed to handle this situation. This is an unfortunate exception to the general rule that Greek letters denote population parameters! There are others.

The formula for calculating the Spearman rho correlation coefficient is as follows.

$$ \rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} $$

n is the number of paired ranks and d is the difference between the paired ranks. If there are no tied scores, the Spearman rho correlation coefficient will be even closer to the Pearson product moment correlation coefficient. Also note that this formula can be understood more easily when you realize that the sum of the squares of the integers from 1 to n can be expressed as n(n + 1)(2n + 1)/6. From this you can see that the least possible sum of d² is zero, while the greatest possible sum of d², obtained when the rankings are completely reversed, is n(n² - 1)/3 (for even n this is twice the sum of the squares of the odd integers less than n); the formula then scales rho between -1 and +1.
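Substituting these two extremes into the formula confirms the scaling (a short check, not spelled out in the original):

$$ \sum d^2 = 0 \;\Rightarrow\; \rho = 1, \qquad \sum d^2 = \frac{n(n^2-1)}{3} \;\Rightarrow\; \rho = 1 - \frac{6\,n(n^2-1)/3}{n(n^2-1)} = 1 - 2 = -1. $$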

Example: Suppose we have test scores of 110, 107, 100, 96, 89, 78, 67, 66, and 49. These correspond with ranks 1 through 9. If there were duplicates, then we would have to find the mean ranking for the duplicates and substitute that value for our ranks. The corresponding first page score totals were: 29, 32, 27, 29, 25, 25, 21, 26, 22. Thus these ranks are as follows: 2.5, 1, 4, 2.5, 6.5, 6.5, 9, 5, 8. (Note that if we reversed the order, assigning the ranks from low to high instead of high to low, the resulting Spearman rho correlation coefficient would reverse sign.)

We have constructed a table below from the information above, adding columns for d and d² for ease in calculating the Spearman rho. Using the Spearman rho formula we get 1 - 6(23)/(9(80)) ≈ 0.81.

Total (x) | Page 1 (y) | x rank | y rank | d | d² | xy | x² | y²
110 | 29 | 1 | 2.5 | -1.5 | 2.25 | 3190 | 12100 | 841
107 | 32 | 2 | 1 | 1 | 1 | 3424 | 11449 | 1024
100 | 27 | 3 | 4 | -1 | 1 | 2700 | 10000 | 729
96 | 29 | 4 | 2.5 | 1.5 | 2.25 | 2784 | 9216 | 841
89 | 25 | 5 | 6.5 | -1.5 | 2.25 | 2225 | 7921 | 625
78 | 25 | 6 | 6.5 | -0.5 | 0.25 | 1950 | 6084 | 625
67 | 21 | 7 | 9 | -2 | 4 | 1407 | 4489 | 441
66 | 26 | 8 | 5 | 3 | 9 | 1716 | 4356 | 676
49 | 22 | 9 | 8 | 1 | 1 | 1078 | 2401 | 484
Sums: 762 | 236 | | | 0 | 23 | 20474 | 68016 | 6286

We have added additional columns of xy, x², and y² to make it easier to calculate the Pearson product moment correlation coefficient. Using the raw score formula for the Pearson product moment correlation coefficient we get (9×20474 - 762×236)/sqrt((9×68016 - 762²)(9×6286 - 236²)) ≈ 0.843. r² = 0.71, which means 71% of the variation in y is explained by the variation in x. It is also true, and perhaps more useful to know, that the same correlation coefficient is obtained when x and y are exchanged. However, a different regression equation will result. Perhaps it makes more sense to use the results of the first page to predict the final test score rather than the other way around! We have now looked at how to calculate r and what various values mean, but it is also important to understand what factors affect it. First, remember that it is only meaningful to calculate the correlation coefficient if the data are paired observations measured on an interval or ratio scale. Next, since we are only concerned here with linear correlation, the Pearson product moment correlation coefficient will underestimate the relationship if there is a curvilinear relationship. It is a good idea to generate a scatterplot before calculating any correlation coefficients and then proceed only if the correlation is reasonably strong.
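As a cross-check, here is a short Python sketch (my own, not part of the lesson) that recomputes both coefficients from the scores in this example:

```python
from math import sqrt

x = [110, 107, 100, 96, 89, 78, 67, 66, 49]   # total test scores
y = [29, 32, 27, 29, 25, 25, 21, 26, 22]      # first page scores

# Ranks taken from the table above (ties replaced by mean ranks).
x_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_rank = [2.5, 1, 4, 2.5, 6.5, 6.5, 9, 5, 8]

n = len(x)

# Spearman rho from the d² formula.
sum_d2 = sum((xr - yr) ** 2 for xr, yr in zip(x_rank, y_rank))
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Pearson r from the raw score formula, using the raw scores.
sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sx2, sy2 = sum(a * a for a in x), sum(b * b for b in y)
r = (n * sxy - sx * sy) / (sqrt(n * sx2 - sx ** 2) * sqrt(n * sy2 - sy ** 2))

print(round(sum_d2, 2), round(rho, 2), round(r, 3))   # 23.0 0.81 0.843
```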

As the homogeneity of a group increases, the variance decreases and the magnitude of the correlation coefficient tends toward zero. It is thus incumbent on the researcher to ensure enough heterogeneity (variation) so that a relationship can manifest itself. In general, the correlation coefficient is not affected by the size of the group.