Variance

What Is Variance?

Variance is a statistical measure of how far data points are spread around their mean (average). It is computed as the average of the squared differences between each data point and the mean, so it summarizes how much individual values deviate from the central tendency of the data set.
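In symbols, one common way to write this definition for a full population is sketched below. The notation is an assumption added here for illustration: x_1, ..., x_N are the data points, N is their count, μ is their mean, and σ² is the variance.

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
```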

Interpretation

  • A high variance indicates that the data points are spread widely from the mean, suggesting greater variability or dispersion within the data set.
  • A low variance indicates that the data points are clustered closely around the mean, suggesting less variability or dispersion within the data set.

Properties

  • Variance is never negative. It equals zero only when every data point has the same value.
  • Variance is sensitive to outliers. Because each deviation from the mean is squared, a data point that lies far from the rest of the data contributes disproportionately and can inflate the variance.
  • Squaring the differences in the variance formula ensures that negative deviations from the mean do not cancel out positive deviations, so the result reflects overall variability.
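The properties above can be checked with a small experiment. This is a minimal sketch using Python's standard `statistics` module; the numbers are hypothetical values chosen only to make the effect of a single outlier visible.

```python
from statistics import pvariance

# Hypothetical data: five tightly clustered values.
clustered = [10, 11, 9, 10, 10]

# The same values plus one outlier far from the rest.
with_outlier = clustered + [50]

# Identical values have no spread at all.
constant = [5, 5, 5]

var_clustered = pvariance(clustered)
var_outlier = pvariance(with_outlier)
var_constant = pvariance(constant)

print("Clustered data:   ", var_clustered)
print("With one outlier: ", var_outlier)
print("Constant data:    ", var_constant)
```

Because the outlier's deviation is squared, one extra value is enough to raise the variance by several orders of magnitude, while the constant data set has a variance of zero.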

Use in Statistics

Variance is a fundamental concept in descriptive statistics. It is often used alongside measures of central tendency (such as the mean, median, and mode) to describe the characteristics of a data set.

It is also the basis of the standard deviation, another widely used measure of dispersion. The standard deviation is the square root of the variance, which puts it back in the same units as the original data.

Sample Variance vs. Population Variance

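When variance is calculated from a sample rather than from the entire population, the sum of squared differences is divided by n-1 instead of n, where n is the sample size. One degree of freedom is lost because the sample mean is itself estimated from the same data. This adjustment, known as Bessel's correction, gives an unbiased estimate of the population variance.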
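The difference between the two divisors can be sketched in a few lines of Python. The function names and the data below are hypothetical and used only for illustration; the results are compared against `pvariance` and `variance` from Python's standard `statistics` module.

```python
from statistics import pvariance, variance


def population_variance(values):
    """Average squared deviation from the mean (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Unbiased estimate of the population variance (divide by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical data set with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
print(sample_variance(data))      # about 4.57

# The standard library gives the same results.
print(pvariance(data), variance(data))
```

The sample variance is always the larger of the two for the same data, and the gap shrinks as n grows.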

Example of Variance

Data

Suppose we have a data set representing a store's daily sales over the past week:

  • Monday: $1000
  • Tuesday: $1200
  • Wednesday: $900
  • Thursday: $1100
  • Friday: $1300
  • Saturday: $1500
  • Sunday: $800

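Find the Mean

To calculate the variance, we first find the mean (average) of these daily sales:

Mean = (1000 + 1200 + 900 + 1100 + 1300 + 1500 + 800) / 7 = 7800 / 7 ≈ $1114.29

Calculate the Squared Differences

Next, we calculate the squared difference between each daily sales value and the mean. The deviations are shown rounded to two decimals; the squares are computed from the unrounded mean, 7800 / 7.

  • (1000 – 1114.29)^2 ≈ (-114.29)^2 ≈ 13061.22
  • (1200 – 1114.29)^2 ≈ 85.71^2 ≈ 7346.94
  • (900 – 1114.29)^2 ≈ (-214.29)^2 ≈ 45918.37
  • (1100 – 1114.29)^2 ≈ (-14.29)^2 ≈ 204.08
  • (1300 – 1114.29)^2 ≈ 185.71^2 ≈ 34489.80
  • (1500 – 1114.29)^2 ≈ 385.71^2 ≈ 148775.51
  • (800 – 1114.29)^2 ≈ (-314.29)^2 ≈ 98775.51

Average of the Squared Differences

Then, we find the average of these squared differences, which gives us the variance:

Variance = (13061.22 + 7346.94 + 45918.37 + 204.08 + 34489.80 + 148775.51 + 98775.51) / 7 ≈ 348571.43 / 7 ≈ 49795.92

Note that the variance is expressed in squared units (here, dollars squared), not in dollars. Taking the square root gives the standard deviation, √49795.92 ≈ $223.15, so a typical day's sales differ from the mean of $1114.29 by roughly $223. Dividing by 7 treats the week as the entire population; if the week were instead treated as a sample, dividing by n-1 = 6 would give a sample variance of about 58095.24. A higher variance would indicate greater variability in the daily sales figures, while a lower variance would suggest a more consistent sales pattern.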
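The steps of this example can be reproduced with a short script. This is a minimal sketch in plain Python; the variable names are chosen here for illustration, and the week is treated as the full population (dividing by n), with the sample version (dividing by n - 1) computed alongside for comparison.

```python
import math

# Daily sales from the example, in dollars.
daily_sales = {
    "Monday": 1000,
    "Tuesday": 1200,
    "Wednesday": 900,
    "Thursday": 1100,
    "Friday": 1300,
    "Saturday": 1500,
    "Sunday": 800,
}

values = list(daily_sales.values())
n = len(values)

# Step 1: find the mean.
mean = sum(values) / n

# Step 2: squared difference between each value and the mean.
squared_diffs = {day: (x - mean) ** 2 for day, x in daily_sales.items()}

# Step 3: average the squared differences.
population_var = sum(squared_diffs.values()) / n        # divide by n
sample_var = sum(squared_diffs.values()) / (n - 1)      # divide by n - 1

# The standard deviation is back in dollars.
std_dev = math.sqrt(population_var)

print(f"Mean: {mean:.2f}")
for day, sq in squared_diffs.items():
    print(f"{day}: {sq:.2f}")
print(f"Population variance: {population_var:.2f}")
print(f"Sample variance: {sample_var:.2f}")
print(f"Standard deviation: {std_dev:.2f}")
```

Running the script is a quick way to check hand calculations, since small arithmetic slips in the mean carry through to every later step.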
