Probability Theory: 4.1, 4.2, 4.3
- 4.1 Probability Density Functions
- 4.2 Cumulative Distribution Functions and Expected Values
- 4.3 The Normal Distribution
- The Standard Normal Distribution
- Percentiles of the Standard Normal Distribution
- $z_{\alpha}$ Notation for z Critical Values
- Nonstandard Normal Distributions
- Percentiles of an Arbitrary Normal Distribution
- The Normal Distribution and Discrete Populations
- Approximating the Binomial Distribution
4.1 Probability Density Functions
A discrete random variable (rv) is one whose possible values either constitute a finite set or else can be listed in an infinite sequence (a list in which there is a first element, a second element, etc.). A random variable whose set of possible values is an entire interval of numbers is not discrete.
Probability Distributions for Continuous Variables
For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
- $f(x) \ge 0$ for all $x$
- $\int_{-\infty}^{\infty} f(x)\,dx = \text{area under the entire graph of } f(x) = 1$
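The two conditions above can be checked numerically for a candidate density. The following is a minimal sketch, assuming a uniform density on [0, 360] as the example; the function names `integrate` and `uniform_pdf` and the interval endpoints are illustrative choices, not part of the original notes.

```python
def integrate(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def uniform_pdf(x, A=0.0, B=360.0):
    """Hypothetical example density: uniform on [A, B], zero elsewhere."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


# Condition 1: f(x) >= 0, checked on a grid of test points.
nonnegative = all(uniform_pdf(x) >= 0 for x in range(-100, 500))

# Condition 2: total area equals 1. The density is zero outside [0, 360],
# so integrating over that interval covers the whole real line.
total_area = integrate(uniform_pdf, 0.0, 360.0)

print(nonnegative, round(total_area, 6))
```

A grid check can only suggest that the first condition holds; it does not prove it for every x.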
When X is a discrete random variable, each possible value is assigned positive probability. This is not true of a continuous random variable: no single value has positive probability, because the area under a density curve that lies above any single value is zero:
$P(X=c)=\int_{c}^{c}f(x)\,dx=\lim_{\epsilon \to 0} \int_{c-\epsilon}^{c+\epsilon}f(x)\,dx=0$
The fact that P(X=c)=0 when X is continuous has an important practical consequence: The probability that X lies in some interval between a and b does not depend on whether the lower limit a or the upper limit b is included in the probability calculation:
$P(a \le X \le b)=P(a < X < b)=P(a < X \le b)=P(a \le X < b)$
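To see why a single point carries no probability, one can watch the probability of a shrinking window around c tend to zero. This is a small sketch assuming an exponential distribution with a hypothetical rate `LAM = 0.5` and a hypothetical point `c = 2.0`; any continuous distribution would behave the same way.

```python
import math

LAM = 0.5  # hypothetical rate parameter, chosen only for illustration


def F(x):
    """cdf of an exponential distribution with rate LAM."""
    return 1.0 - math.exp(-LAM * x) if x > 0 else 0.0


c = 2.0  # hypothetical point of interest

# P(c - eps <= X <= c + eps) = F(c + eps) - F(c - eps) for shrinking eps.
window_probs = [F(c + eps) - F(c - eps) for eps in (1.0, 0.1, 0.01, 0.001, 0.0001)]

for p in window_probs:
    print(f"{p:.8f}")
```

Each printed probability is smaller than the one before it, which is the limit statement for P(X = c) in numerical form.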
4.2 Cumulative Distribution Functions and Expected Values
The Cumulative Distribution Function
Using F(x) to Compute Probabilities
The second part of this proposition, $P(a \le X \le b) = F(b) - F(a)$, can be read as areas: the desired probability is the shaded area under the density curve between a and b, and it equals the difference between the two shaded cumulative areas. This is different from what is appropriate for a discrete integer-valued random variable (e.g., binomial or Poisson): $P(a \le X \le b) = F(b) - F(a - 1)$ when a and b are integers.
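The contrast between the continuous and discrete formulas can be sketched in a few lines. The example below assumes an exponential distribution (rate 0.5) for the continuous case and a binomial distribution (n = 10, p = 0.3) for the discrete case; these parameters and the helper names are illustrative only.

```python
import math


def expon_cdf(x, lam=0.5):
    """cdf of an exponential rv with (hypothetical) rate lam."""
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p); zero when k is negative."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(0, min(k, n) + 1))


a, b = 2, 5

# Continuous case: endpoints carry no probability, so F(b) - F(a) suffices.
p_cont = expon_cdf(b) - expon_cdf(a)

# Discrete integer-valued case: subtract F(a - 1) so that P(X = a) is kept.
n, p = 10, 0.3
p_disc = binom_cdf(b, n, p) - binom_cdf(a - 1, n, p)
p_disc_direct = sum(binom_pmf(k, n, p) for k in range(a, b + 1))

print(p_cont, p_disc, p_disc_direct)
```

Using F(b) - F(a) in the discrete case would silently drop the probability mass at a.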
Obtaining f(x) from F(x)
Percentiles of a Continuous Distribution
A continuous distribution whose pdf is symmetric (the graph of the pdf to the left of some point is a mirror image of the graph to the right of that point) has median $\tilde{\mu}$ equal to the point of symmetry, since half the area under the curve lies to either side of this point.
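A percentile can be found numerically by solving F(x) = p. The sketch below uses bisection, which assumes the cdf is continuous and increasing on the search interval; the normal distribution with mean 10 and standard deviation 2 is a hypothetical symmetric example, and `percentile` is an illustrative helper name.

```python
from statistics import NormalDist


def percentile(cdf, p, lo, hi, tol=1e-10):
    """Solve cdf(x) = p by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid  # the percentile lies to the right of mid
        else:
            hi = mid  # the percentile lies at or to the left of mid
    return (lo + hi) / 2


dist = NormalDist(mu=10.0, sigma=2.0)  # hypothetical symmetric distribution
median = percentile(dist.cdf, 0.5, -100.0, 100.0)

print(median)
```

For this symmetric density the computed median agrees with the point of symmetry, 10, up to the bisection tolerance.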
Expected Values
4.3 The Normal Distribution
The statement that X is normally distributed with parameters $\mu$ and $\sigma^2$ is often abbreviated $X \sim N(\mu, \sigma^2)$.
The Standard Normal Distribution
Percentiles of the Standard Normal Distribution
For any p between 0 and 1, Appendix Table A.3 can be used to obtain the (100p)th percentile of the standard normal distribution.
$z_{\alpha}$ Notation for z Critical Values
In statistical inference, we will need the values on the horizontal z axis that capture certain small tail areas under the standard normal curve.
The $z_{\alpha}$'s are usually referred to as z critical values. Table 4.1 lists the most useful z percentiles and $z_{\alpha}$ values.
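In place of a table lookup, percentiles and critical values can be computed with Python's standard library. The sketch below relies on the fact that $z_{\alpha}$ has upper-tail area $\alpha$, so it is the 100(1 - α)th percentile of the standard normal distribution; the helper name `z_critical` and the list of α values are illustrative choices.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_critical(alpha):
    """Return z_alpha, the value with upper-tail area alpha under the z curve."""
    return Z.inv_cdf(1.0 - alpha)


for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(f"alpha = {alpha:<6} z_alpha = {z_critical(alpha):.3f}")
```

The same `Z.inv_cdf(p)` call gives the (100p)th percentile directly for any p strictly between 0 and 1.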
Nonstandard Normal Distributions
When $X \sim N(\mu, \sigma^2)$, probabilities involving X are computed by “standardizing”. The standardized variable is $(X - \mu)/\sigma$. Subtracting $\mu$ shifts the mean from $\mu$ to zero, and then dividing by $\sigma$ scales the variable so that the standard deviation is 1 rather than $\sigma$.
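Standardizing turns any normal probability into a statement about the standard normal cdf Φ. A minimal sketch, assuming hypothetical values μ = 1.25 and σ = 0.46 and an illustrative helper `normal_interval_prob`:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal; Z.cdf plays the role of Phi


def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing."""
    return Z.cdf((b - mu) / sigma) - Z.cdf((a - mu) / sigma)


mu, sigma = 1.25, 0.46  # hypothetical parameters for illustration
p = normal_interval_prob(1.00, 1.75, mu, sigma)

# Cross-check against the cdf of the nonstandard distribution itself.
X = NormalDist(mu, sigma)
p_direct = X.cdf(1.75) - X.cdf(1.00)

print(p, p_direct)
```

Both routes give the same number, since the standardized endpoints are just the original endpoints measured in standard deviations from the mean.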
Percentiles of an Arbitrary Normal Distribution
The (100p)th percentile of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is easily related to the (100p)th percentile of the standard normal distribution.
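The relation comes from reversing the standardization: if $z_p$ is the (100p)th percentile of the standard normal distribution, then $\mu + z_p \sigma$ is the (100p)th percentile of $N(\mu, \sigma^2)$. A short sketch, with hypothetical parameters μ = 64 and σ = 0.78 and an illustrative helper name:

```python
from statistics import NormalDist


def normal_percentile(p, mu, sigma):
    """(100p)th percentile of N(mu, sigma^2) as mu + z_p * sigma."""
    z_p = NormalDist().inv_cdf(p)  # (100p)th percentile of the standard normal
    return mu + z_p * sigma


mu, sigma = 64.0, 0.78  # hypothetical parameters for illustration
eta = normal_percentile(0.995, mu, sigma)

# Cross-check: the cdf of N(mu, sigma^2) at eta should return p.
check = NormalDist(mu, sigma).cdf(eta)

print(eta, check)
```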
The Normal Distribution and Discrete Populations
The normal distribution is often used as an approximation to the distribution of values in a discrete population. In such situations, extra care should be taken to ensure that probabilities are computed in an accurate manner.
The correction for discreteness of the underlying distribution is often called a continuity correction. It is useful in the following application of the normal distribution to the computation of binomial probabilities.
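The effect of the continuity correction can be checked against exact binomial probabilities. The sketch below approximates P(X ≤ x) for X ~ Bin(n, p) by Φ((x + 0.5 − np)/√(np(1 − p))); the values n = 50, p = 0.25, x = 10 and the function names are illustrative assumptions, not taken from the notes.

```python
import math
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Bin(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def binom_cdf_normal_approx(x, n, p, correction=True):
    """Normal approximation to P(X <= x), optionally with the 0.5 correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    shift = 0.5 if correction else 0.0
    return NormalDist().cdf((x + shift - mu) / sigma)


n, p, x = 50, 0.25, 10  # hypothetical example values

exact = binom_cdf(x, n, p)
with_cc = binom_cdf_normal_approx(x, n, p)
without_cc = binom_cdf_normal_approx(x, n, p, correction=False)

print(f"exact      = {exact:.4f}")
print(f"with cc    = {with_cc:.4f}")
print(f"without cc = {without_cc:.4f}")
```

For these values the corrected approximation lands noticeably closer to the exact probability than the uncorrected one, which is the motivation for adding 0.5 before standardizing.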