Important Statistics Formulas

Parameters

Statistics

Unless otherwise noted, these formulas assume simple random sampling.

Correlation

· Pearson product-moment correlation = r = Σ (xy) / sqrt [ ( Σ x² ) * ( Σ y² ) ], where x = x_i - x̄ and y = y_i - ȳ are deviation scores

· Linear correlation (sample data) = r = [ 1 / (n - 1) ] * Σ { [ (x_i - x̄) / s_x ] * [ (y_i - ȳ) / s_y ] }

· Linear correlation (population data) = ρ = [ 1 / N ] * Σ { [ (X_i - μ_X) / σ_X ] * [ (Y_i - μ_Y) / σ_Y ] }
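
The two sample formulas for r describe the same quantity. The following is a minimal Python sketch, not a reference implementation, that computes r both ways on a small made-up data set. The function names and the data are assumptions chosen for illustration.

```python
import math
import statistics

def pearson_r(xs, ys):
    """r from deviation scores: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    dx = [x - mx for x in xs]          # deviations of x from its mean
    dy = [y - my for y in ys]          # deviations of y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def pearson_r_zscores(xs, ys):
    """r as the sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)   # sample SDs (n - 1)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r1 = pearson_r(xs, ys)
r2 = pearson_r_zscores(xs, ys)
print(r1, r2)   # the two values agree up to floating-point rounding
```

The agreement holds because the (n - 1) factors inside s_x and s_y cancel against the leading 1 / (n - 1).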

Simple Linear Regression

Counting

Probability

Random Variables

In the following formulas, X and Y are random variables, and a and b are constants.

Sampling Distributions

Standard Error

· Standard error of proportion = SE_p = s_p = sqrt[ p * (1 - p) / n ] = sqrt( pq / n )

· Standard error of difference for proportions (pooled) = SE_p = s_p = sqrt{ p * ( 1 - p ) * [ (1/n₁) + (1/n₂) ] }, where p is the pooled sample proportion

· Standard error of the mean = SE_x̄ = s_x̄ = s / sqrt(n)

· Standard error of difference of sample means = SE_d = s_d = sqrt[ (s₁² / n₁) + (s₂² / n₂) ]

· Standard error of difference of paired sample means = SE_d = s_d = { sqrt[ Σ(d_i - d̄)² / (n - 1) ] } / sqrt(n)

· Pooled sample standard error = s_pooled = sqrt{ [ (n₁ - 1) * s₁² + (n₂ - 1) * s₂² ] / (n₁ + n₂ - 2) }

· Standard error of difference of sample proportions = s_d = sqrt{ [p₁(1 - p₁) / n₁] + [p₂(1 - p₂) / n₂] }
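
As a rough illustration, several of the standard error formulas above can be written as small Python helpers. The function names are hypothetical, and the inputs are assumed to be summary statistics already computed from simple random samples.

```python
import math

def se_mean(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(s1, n1, s2, n2):
    """Standard error of the difference of two independent sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def s_pooled(s1, n1, s2, n2):
    """Pooled estimate: sqrt of the weighted average of the two sample variances."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Example with made-up numbers: s = 10 from a sample of n = 25
print(se_mean(10, 25))
```

Note that the division by (n₁ + n₂ - 2) in the pooled formula sits inside the square root.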

Discrete Probability Distributions

· Binomial formula: P(X = x) = b(x; n, P) = _nC_x * P^x * (1 - P)^{n - x} = _nC_x * P^x * Q^{n - x}

· Mean of binomial distribution = μ_x = n * P

· Variance of binomial distribution = σ_x² = n * P * ( 1 - P )

· Negative binomial formula: P(X = x) = b*(x; r, P) = _{x-1}C_{r-1} * P^r * (1 - P)^{x - r}

· Mean of negative binomial distribution = μ_x = r / P (when x counts trials, as in the formula above; the mean number of failures is rQ / P)

· Variance of negative binomial distribution = σ_x² = r * Q / P²

· Geometric formula: P(X = x) = g(x; P) = P * Q^{x - 1}

· Mean of geometric distribution = μ_x = 1 / P

· Variance of geometric distribution = σ_x² = Q / P²

· Hypergeometric formula: P(X = x) = h(x; N, n, k) = [ _kC_x ] [ _{N-k}C_{n-x} ] / [ _NC_n ]

· Mean of hypergeometric distribution = μ_x = n * k / N

· Variance of hypergeometric distribution = σ_x² = n * k * ( N - k ) * ( N - n ) / [ N² * ( N - 1 ) ]

· Poisson formula: P(x; μ) = (e^-μ) (μ^x) / x!

· Mean of Poisson distribution = μ_x = μ

· Variance of Poisson distribution = σ_x² = μ

· Multinomial formula: P = [ n! / ( n₁! * n₂! * ... * n_k! ) ] * ( p₁^{n₁} * p₂^{n₂} * ... * p_k^{n_k} )
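
The binomial and Poisson formulas above can be checked numerically. The sketch below is a minimal illustration: the helper names and the parameter values (n = 10, P = 0.3, μ = 4) are assumptions. It evaluates each probability formula directly and confirms that the resulting means and variances match the closed-form expressions.

```python
import math

def binomial_pmf(x, n, p):
    """b(x; n, P) = nCx * P^x * (1 - P)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """P(x; mu) = e^(-mu) * mu^x / x!."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Binomial with assumed parameters n = 10, P = 0.3
n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
binom_mean = sum(x * px for x, px in enumerate(pmf))
binom_var = sum((x - binom_mean) ** 2 * px for x, px in enumerate(pmf))

# Poisson with assumed mean mu = 4, truncated at x = 99 (the tail beyond is negligible)
mu = 4.0
pois = [poisson_pmf(x, mu) for x in range(100)]
pois_mean = sum(x * px for x, px in enumerate(pois))
pois_var = sum((x - pois_mean) ** 2 * px for x, px in enumerate(pois))

print(binom_mean, binom_var)   # compare with n * P and n * P * (1 - P)
print(pois_mean, pois_var)     # compare with mu and mu
```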

Linear Transformations

For the following formulas, assume that Y is a linear transformation of the random variable X, defined by the equation: Y = aX + b.

· Mean of a linear transformation = E(Y) = Ȳ = aX̄ + b.

· Variance of a linear transformation = Var(Y) = a² * Var(X).

· Standardized score = z = (x - μ_x) / σ_x.

· t-score = t = (x̄ - μ_x) / [ s/sqrt(n) ].
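
The mean and variance rules for Y = aX + b can be verified on a small data set. This is a sketch under stated assumptions: the constants a = 1.8 and b = 32 (a Celsius-to-Fahrenheit conversion) and the three data values are made up for illustration, and the data are treated as a complete population.

```python
import statistics

a, b = 1.8, 32.0             # hypothetical constants of the transformation
xs = [10.0, 20.0, 30.0]      # made-up values of X
ys = [a * x + b for x in xs] # corresponding values of Y = aX + b

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
var_x, var_y = statistics.pvariance(xs), statistics.pvariance(ys)

# The mean shifts and scales; the variance only scales (by a squared).
print(mean_y, a * mean_x + b)
print(var_y, a**2 * var_x)
```

The additive constant b moves every value by the same amount, so it changes the mean but leaves the spread, and therefore the variance, untouched.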

Estimation

· Confidence interval: Sample statistic ± Critical value * Standard error of statistic

· Margin of error = (Critical value) * (Standard deviation of statistic)

· Margin of error = (Critical value) * (Standard error of statistic)
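
A confidence interval for a mean can be sketched as the sample statistic plus or minus the margin of error. The example below assumes a z critical value, which is reasonable only for a large sample or a known population standard deviation; a t critical value would be used otherwise. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) using a normal critical value."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sd / math.sqrt(n)
    margin = critical * standard_error
    return sample_mean - margin, sample_mean + margin, margin

# Made-up example: mean 100, standard deviation 15, n = 36
lower, upper, margin = z_confidence_interval(100, 15, 36)
print(lower, upper, margin)
```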

Hypothesis Testing

· Standardized test statistic = (Statistic - Parameter) / (Standard deviation of statistic)

· One-sample z-test for proportions: z-score = z = (p - P0) / sqrt[ P0 * (1 - P0) / n ], where P0 is the hypothesized proportion

· Two-sample z-test for proportions: z-score = z = [ (p₁ - p₂) - d ] / SE

· One-sample t-test for means: t-score = t = (x̄ - μ) / SE

· Two-sample t-test for means: t-score = t = [ (x̄₁ - x̄₂) - d ] / SE

· Matched-sample t-test for means: t-score = t = [ (x̄₁ - x̄₂) - D ] / SE = (d̄ - D) / SE

· Chi-square test statistic = Χ² = Σ[ (Observed - Expected)² / Expected ]
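
Two of the test statistics above are easy to compute by hand or in a few lines of code. The following sketch uses hypothetical function names and made-up data; it computes the statistics only and does not look up p-values.

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(xs)
    standard_error = statistics.stdev(xs) / math.sqrt(n)
    return (statistics.mean(xs) - mu0) / standard_error

def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up data
t = one_sample_t([2, 4, 6], mu0=3)
chi2 = chi_square_statistic([50, 30, 20], [40, 40, 20])
print(t, chi2)
```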

Degrees of Freedom

The correct formula for degrees of freedom (DF) depends on the situation (the nature of the test statistic, the number of samples, underlying assumptions, etc.).

One-sample t-test: DF = n - 1
Two-sample t-test: DF = (s₁²/n₁ + s₂²/n₂)² / { [ (s₁² / n₁)² / (n₁ - 1) ] + [ (s₂² / n₂)² / (n₂ - 1) ] }
Two-sample t-test, pooled standard error: DF = n₁ + n₂ - 2
Simple linear regression, test slope: DF = n - 2
Chi-square goodness of fit test: DF = k - 1
Chi-square test for homogeneity: DF = (r - 1) * (c - 1)
Chi-square test for independence: DF = (r - 1) * (c - 1)
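
The unpooled two-sample DF formula above, commonly called the Welch-Satterthwaite approximation, is the one most prone to arithmetic slips. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def two_sample_df(s1, n1, s2, n2):
    """DF for the unpooled two-sample t-test from sample SDs and sample sizes."""
    v1 = s1**2 / n1      # estimated variance of the first sample mean
    v2 = s2**2 / n2      # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# With equal SDs and equal sample sizes the result matches n1 + n2 - 2
print(two_sample_df(2, 10, 2, 10))
# With unequal SDs the result is smaller and usually not an integer
print(two_sample_df(2, 10, 6, 10))
```

In general the result lies between the smaller of n₁ - 1 and n₂ - 1 and the pooled value n₁ + n₂ - 2.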

Sample Size

Below, the first two formulas find the smallest sample sizes required to achieve a fixed margin of error, using simple random sampling. The third formula allocates the sample to strata, based on a proportionate design. The fourth formula, Neyman allocation, uses stratified sampling to minimize variance, given a fixed sample size. The last formula, optimum allocation, uses stratified sampling to minimize variance, given a fixed budget.

Proportionate stratified sampling: n_h = ( N_h / N ) * n

Neyman allocation (stratified sampling): n_h = n * ( N_h * σ_h ) / [ Σ ( N_i * σ_i ) ]

Optimum allocation (stratified sampling):
n_h = n * [ ( N_h * σ_h ) / sqrt( c_h ) ] / Σ [ ( N_i * σ_i ) / sqrt( c_i ) ]
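
The three stratified allocation rules can be compared side by side. The sketch below is illustrative only: the function names, stratum sizes, standard deviations, and per-unit costs are all made up, and the results are left unrounded, whereas in practice each n_h would be rounded to a whole number.

```python
import math

def proportionate(n, sizes):
    """n_h = (N_h / N) * n."""
    total = sum(sizes)
    return [n * size / total for size in sizes]

def neyman(n, sizes, sds):
    """n_h = n * (N_h * sigma_h) / sum(N_i * sigma_i)."""
    weights = [size * sd for size, sd in zip(sizes, sds)]
    return [n * w / sum(weights) for w in weights]

def optimum(n, sizes, sds, costs):
    """n_h = n * (N_h * sigma_h / sqrt(c_h)) / sum(N_i * sigma_i / sqrt(c_i))."""
    weights = [size * sd / math.sqrt(c) for size, sd, c in zip(sizes, sds, costs)]
    return [n * w / sum(weights) for w in weights]

# Hypothetical strata: sizes N_h, standard deviations sigma_h, unit costs c_h
sizes = [500, 300, 200]
sds = [10, 20, 30]
costs = [1, 4, 9]

print(proportionate(100, sizes))
print(neyman(100, sizes, sds))
print(optimum(100, sizes, sds, costs))
```

When every stratum has the same unit cost, optimum allocation reduces to Neyman allocation; when every stratum also has the same standard deviation, both reduce to proportionate allocation.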

Written on Sunday, 30 Khordad 1389, at 10:22 by Amir Daneshgar