Important Statistics Formulas

 

Parameters

Statistics

Unless otherwise noted, these formulas assume simple random sampling.

Correlation

·     Pearson product-moment correlation = r = Σ (xy) / sqrt [ ( Σ x^2 ) * ( Σ y^2 ) ], where x and y are deviation scores (x = x_i - x̄ and y = y_i - ȳ)

·     Linear correlation (sample data) = r = [ 1 / (n - 1) ] * Σ { [ (x_i - x̄) / s_x ] * [ (y_i - ȳ) / s_y ] }

·     Linear correlation (population data) = ρ = [ 1 / N ] * Σ { [ (X_i - μ_X) / σ_X ] * [ (Y_i - μ_Y) / σ_Y ] }
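The two sample formulas above can be checked against each other numerically. The following is a minimal Python sketch, not a reference implementation; the function names and the data in the usage note are illustrative assumptions.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations use n - 1 in the denominator.
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


def deviation_score_correlation(xs, ys):
    """Pearson r computed from deviation scores: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
```

For example, with xs = [1, 2, 3, 4, 5] and ys = [2, 4, 5, 4, 5], both functions return 6 / sqrt(60), about 0.775.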

Simple Linear Regression

 

 

 

Counting

Probability

Random Variables

In the following formulas, X and Y are random variables, and a and b are constants.

Sampling Distributions

Standard Error

·     Standard error of proportion = SE_p = s_p = sqrt[ p * (1 - p) / n ] = sqrt( pq / n ), where q = 1 - p

·     Standard error of difference for proportions (pooled) = SE_p = s_p = sqrt{ p * ( 1 - p ) * [ (1/n_1) + (1/n_2) ] }, where p is the pooled sample proportion

·     Standard error of the mean = SE_x̄ = s_x̄ = s / sqrt(n)

·     Standard error of difference of sample means = SE_d = s_d = sqrt[ (s_1^2 / n_1) + (s_2^2 / n_2) ]

·     Standard error of difference of paired sample means = SE_d = s_d = { sqrt[ Σ(d_i - d̄)^2 / (n - 1) ] } / sqrt(n)

·     Pooled sample standard error = s_pooled = sqrt{ [ (n_1 - 1) * s_1^2 + (n_2 - 1) * s_2^2 ] / (n_1 + n_2 - 2) }

·     Standard error of difference of sample proportions (unpooled) = s_d = sqrt{ [p_1(1 - p_1) / n_1] + [p_2(1 - p_2) / n_2] }
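Several of the standard errors listed above translate directly into short functions. This is a sketch using only the Python standard library; the function names are hypothetical and chosen for illustration.

```python
import math
import statistics


def se_proportion(p, n):
    """Standard error of a sample proportion p from a sample of size n."""
    return math.sqrt(p * (1 - p) / n)


def se_mean(s, n):
    """Standard error of the mean, given sample standard deviation s."""
    return s / math.sqrt(n)


def se_diff_means(s1, n1, s2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def se_paired(differences):
    """Standard error of the mean difference for paired samples."""
    # statistics.stdev uses n - 1 in the denominator, matching the formula above.
    return statistics.stdev(differences) / math.sqrt(len(differences))


def se_diff_proportions(p1, n1, p2, n2):
    """Unpooled standard error of the difference between two sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
```

As a quick check, se_proportion(0.5, 100) gives 0.05 and se_mean(10, 25) gives 2.0.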

Discrete Probability Distributions

·     Binomial formula: P(X = x) = b(x; n, P) = C(n, x) * P^x * (1 - P)^(n - x) = C(n, x) * P^x * Q^(n - x), where Q = 1 - P

·     Mean of binomial distribution = μ_x = n * P

·     Variance of binomial distribution = σ_x^2 = n * P * ( 1 - P )

·     Negative Binomial formula: P(X = x) = b*(x; r, P) = C(x - 1, r - 1) * P^r * (1 - P)^(x - r)

·     Mean of negative binomial distribution = μ_x = r / P (X counts the trials needed to produce r successes; the expected number of failures is r * Q / P)

·     Variance of negative binomial distribution = σ_x^2 = r * Q / P^2

·     Geometric formula: P(X = x) = g(x; P) = P * Q^(x - 1)

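·     Mean of geometric distribution = μ_x = 1 / P (X counts the trials up to and including the first success; the expected number of failures before it is Q / P)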

·     Variance of geometric distribution = σ_x^2 = Q / P^2

·     Hypergeometric formula: P(X = x) = h(x; N, n, k) = [ C(k, x) ] [ C(N - k, n - x) ] / [ C(N, n) ]

·     Mean of hypergeometric distribution = μ_x = n * k / N

·     Variance of hypergeometric distribution = σ_x^2 = n * k * ( N - k ) * ( N - n ) / [ N^2 * ( N - 1 ) ]

·     Poisson formula: P(x; μ) = (e^(-μ)) (μ^x) / x!

·     Mean of Poisson distribution = μ_x = μ

·     Variance of Poisson distribution = σ_x^2 = μ

·     Multinomial formula: P = [ n! / ( n_1! * n_2! * ... * n_k! ) ] * ( p_1^(n_1) * p_2^(n_2) * ... * p_k^(n_k) )
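Three of the probability formulas in this list can be sketched in Python as follows. The function names are illustrative assumptions, and math.comb requires Python 3.8 or later. The Poisson function uses the standard mass function e^(-μ) * μ^x / x!.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def hypergeometric_pmf(x, N, n, k):
    """P(X = x) when drawing n items without replacement from N items, k of them successes."""
    return math.comb(k, x) * math.comb(N - k, n - x) / math.comb(N, n)


def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)
```

For instance, binomial_pmf(2, 4, 0.5) evaluates to 0.375, and summing x * binomial_pmf(x, n, p) over x = 0..n reproduces the binomial mean n * P.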

Linear Transformations

 For the following formulas, assume that Y is a linear transformation of the random variable X, defined by the equation: Y = aX + b.

·     Mean of a linear transformation = E(Y) = Ȳ = aX̄ + b.

·     Variance of a linear transformation = Var(Y) = a^2 * Var(X).

·     Standardized score = z = (x - μ_x) / σ_x.

·     t-score = t = (x̄ - μ_x) / [ s / sqrt(n) ].
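The mean and variance rules for Y = aX + b can be verified on a small data set. The sketch below treats the data as a complete population (hence pvariance); the helper names and the numbers in the usage note are illustrative assumptions.

```python
import statistics


def linear_transform(xs, a, b):
    """Apply Y = aX + b to every value in xs."""
    return [a * x + b for x in xs]


def z_score(x, mu, sigma):
    """Standardized score of x given mean mu and standard deviation sigma."""
    return (x - mu) / sigma


def check_linear_rules(xs, a, b):
    """Return (mean of Y, a * mean of X + b, variance of Y, a^2 * variance of X)."""
    ys = linear_transform(xs, a, b)
    return (
        statistics.fmean(ys),
        a * statistics.fmean(xs) + b,
        statistics.pvariance(ys),
        a ** 2 * statistics.pvariance(xs),
    )
```

With xs = [2, 4, 4, 4, 5, 5, 7, 9], a = 3 and b = 1, the mean of X is 5 and its variance is 4, so the first pair of values is 16 and the second pair is 36.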

Estimation

·     Confidence interval: Sample statistic ± Critical value * Standard error of statistic

·     Margin of error = (Critical value) * (Standard deviation of statistic)

·     Margin of error = (Critical value) * (Standard error of statistic)
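A confidence interval is the sample statistic plus or minus the margin of error. The following sketch assumes a critical value taken from the standard normal distribution; for small samples a t critical value would usually be used instead. The function names are hypothetical, and NormalDist requires Python 3.8 or later.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(critical_value, standard_error):
    """Margin of error = critical value * standard error of the statistic."""
    return critical_value * standard_error


def confidence_interval(statistic, critical_value, standard_error):
    """Return (lower, upper) bounds: statistic minus and plus the margin of error."""
    margin = margin_of_error(critical_value, standard_error)
    return statistic - margin, statistic + margin
```

For a sample mean of 50 with a standard error of 2, confidence_interval(50, z_critical(0.95), 2) gives roughly (46.08, 53.92).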

 

Hypothesis Testing

·     Standardized test statistic = (Statistic - Parameter) / (Standard deviation of statistic)

·     One-sample z-test for proportions: z-score = z = (p - P_0) / sqrt[ P_0 * (1 - P_0) / n ], where p is the sample proportion and P_0 is the hypothesized population proportion

·     Two-sample z-test for proportions: z-score = z = [ (p_1 - p_2) - d ] / SE

·     One-sample t-test for means: t-score = t = (x̄ - μ) / SE

·     Two-sample t-test for means: t-score = t = [ (x̄_1 - x̄_2) - d ] / SE

·     Matched-sample t-test for means: t-score = t = [ (x̄_1 - x̄_2) - D ] / SE = (d̄ - D) / SE

·     Chi-square test statistic = Χ^2 = Σ[ (Observed - Expected)^2 / Expected ]
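Two of the test statistics above are simple enough to compute directly. This is a minimal sketch with illustrative function names; it returns only the statistic, and finding a P-value would additionally require the appropriate t or chi-square distribution.

```python
import math
import statistics


def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / standard error of the mean."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.fmean(sample) - mu0) / standard_error
```

For example, chi_square_statistic([30, 20], [25, 25]) returns 2.0, and one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], 4) returns sqrt(1.75), about 1.32.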

Degrees of Freedom

The correct formula for degrees of freedom (DF) depends on the situation (the nature of the test statistic, the number of samples, underlying assumptions, etc.).

Sample Size

Below, the first two formulas find the smallest sample sizes required to achieve a fixed margin of error, using simple random sampling. The third formula allocates the sample across strata, based on a proportionate design. The fourth formula, Neyman allocation, uses stratified sampling to minimize variance, given a fixed sample size. The last formula, optimum allocation, uses stratified sampling to minimize variance, given a fixed budget.
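As an illustration of the two allocation schemes for a fixed sample size, the sketch below uses the textbook forms of proportionate allocation (n_h = n * N_h / N) and Neyman allocation (n_h = n * N_h * σ_h / Σ N_i * σ_i). These forms are an assumption made here, not a reproduction of this page's own formulas, and the function names are hypothetical. Results are left unrounded; in practice they would be rounded to whole units.

```python
def proportionate_allocation(n, stratum_sizes):
    """Allocate n sample units in proportion to each stratum's population size."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]


def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Allocate n sample units in proportion to stratum size times stratum standard deviation."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]
```

With n = 100 and stratum sizes [500, 300, 200], proportionate allocation gives 50, 30 and 20 units. Neyman allocation shifts units toward strata with larger standard deviations.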