Lecture Undergraduate econometrics - Chapter 2: Some basic probability concepts


In this chapter, students will be able to understand: Experiments, outcomes and random variables; the probability distribution of a random variable; expected values involving a single random variable; using joint probability density functions; the expected value of a function of several random variables: covariance and correlation; the normal distribution.

Chapter 2 Some Basic Probability Concepts

2.1 Experiments, Outcomes and Random Variables

• A random variable is a variable whose value is unknown until it is observed. The value of a random variable results from an experiment; it is not perfectly predictable.
• A discrete random variable can take only a finite, or countably infinite, number of values, which can be counted by using the positive integers.
• Discrete variables are also commonly used in economics to record qualitative, or nonnumerical, characteristics. In this role they are sometimes called dummy variables.
• A continuous random variable can take any real value (not just whole numbers) in an interval on the real number line.

Slide 2.1 Undergraduate Econometrics, 2nd Edition – Chapter 2

2.2 The Probability Distribution of a Random Variable

• The values of random variables are not known until an experiment is carried out, and not all possible values are equally likely. We can make probability statements about certain values occurring by specifying a probability distribution for the random variable.
• If event A is an outcome of an experiment, then the probability of A, which we write as P(A), is the relative frequency with which event A occurs in many repeated trials of the experiment. For any event, $0 \le P(A) \le 1$, and the total probability of all possible events is one.

2.2.1 Probability Distributions of Discrete Random Variables

• When the values of a discrete random variable are listed with their chances of occurring, the resulting table of outcomes is called a probability function or a probability density function.
• The probability density function spreads the total of one "unit" of probability over the set of possible values that a random variable can take.
• Consider a discrete random variable, X = the number of heads obtained in a single flip of a coin. The values that X can take are x = 0, 1. If the coin is "fair" then the probability of a head occurring is 0.5. The probability density function, say f(x), for the random variable X is

Coin side    x    f(x)
tail         0    0.5
head         1    0.5

• "The probability that X takes the value 1 is 0.5" means that the two values 0 and 1 have an equal chance of occurring and, if we flipped a fair coin a very large number of times, the value x = 1 would occur 50 percent of the time. We can denote this as P[X = 1] = f(1) = 0.5, where P[X = 1] is the probability of the event that the random variable X = 1.
• For a discrete random variable X the value of the probability density function f(x) is the probability that the random variable X takes the value x, f(x) = P(X = x).
• Therefore, $0 \le f(x) \le 1$ and, if X takes n values $x_1, \ldots, x_n$, then $f(x_1) + f(x_2) + \cdots + f(x_n) = 1$.

2.2.2 The Probability Density Function of a Continuous Random Variable

• For the continuous random variable Y the probability density function f(y) can be represented by an equation, which can be described graphically by a curve. For continuous random variables the area under the probability density function corresponds to probability.
• For example, the probability density function of a continuous random variable Y might be represented as in Figure 2.1. The total area under a probability density function is 1, and the probability that Y takes a value in the interval [a, b], or P[a ≤ Y ≤ b], is the area under the probability density function between the values y = a and y = b. This is shown in Figure 2.1 by the shaded area.
• Since a continuous random variable takes an uncountably infinite number of values, the probability of any one value occurring is zero. That is, P[Y = a] = P[a ≤ Y ≤ a] = 0.
• In calculus, the integral of a function defines the area under it, and therefore

$$P[a \le Y \le b] = \int_{y=a}^{b} f(y)\,dy$$

• For any random variable X, the probability that X is less than or equal to a is denoted F(a). F(x) is the cumulative distribution function (cdf).
• For a discrete random variable,

$$F(x) = \sum_{x_i \le x} f(x_i) = \operatorname{Prob}(X \le x)$$

In view of the definition of f(x),

$$f(x_i) = F(x_i) - F(x_{i-1})$$

• For a continuous random variable,

$$F(x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{and} \quad f(x) = \frac{dF(x)}{dx}$$

• In both the continuous and discrete cases, F(x) must satisfy the following properties:

• $0 \le F(x) \le 1$
• If $x \ge y$, then $F(x) \ge F(y)$
• $F(+\infty) = 1$
• $F(-\infty) = 0$
• $\operatorname{Prob}(a < X \le b) = F(b) - F(a)$

2.3 Expected Values Involving a Single Random Variable

• When working with random variables, it is convenient to summarize their probability characteristics using the concept of mathematical expectation. These expectations will make use of summation notation.

2.3.1 The Rules of Summation

• If X takes n values $x_1, \ldots, x_n$ then their sum is

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$

• If a is a constant, then

$$\sum_{i=1}^{n} a = na$$

• If a is a constant then it can be pulled out in front of a summation:

$$\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$$

• If X and Y are two variables, then

$$\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$$

• If a and b are constants, then

$$\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$$

• If X and Y are two variables, then

$$\sum_{i=1}^{n} (a x_i + b y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$$

• The arithmetic mean (average) of n values of X is

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Also,

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

• We often use an abbreviated form of the summation notation. For example, if f(x) is a function of the values of X,

$$\sum_{i=1}^{n} f(x_i) = f(x_1) + f(x_2) + \cdots + f(x_n)$$
$$= \sum_{i} f(x_i) \quad \text{("Sum over all values of the index } i\text{")}$$
$$= \sum_{x} f(x) \quad \text{("Sum over all possible values of } X\text{")}$$

… for each and every pair of values x and y. The converse is also true.

• If $X_1, \ldots, X_n$ are statistically independent, the joint probability density function can be factored and written as

$$f(x_1, x_2, \ldots, x_n) = f_1(x_1) f_2(x_2) \cdots f_n(x_n) \tag{2.4.4}$$

• If X and Y are independent random variables, then the conditional probability density function of X, given that Y = y, is

$$f(x \mid y) = \frac{f(x, y)}{f(y)} = \frac{f(x) f(y)}{f(y)} = f(x) \tag{2.4.5}$$

for each and every pair of values x and y. The converse is also true.

2.5 The Expected Value of a Function of Several Random Variables: Covariance and Correlation

In economics we are usually interested in exploring relationships between economic variables. The covariance literally indicates the amount of covariation exhibited by the two random variables.

• If X and Y are random variables, then their covariance is

$$\operatorname{cov}(X, Y) = E[(X - E[X])(Y - E[Y])] \tag{2.5.1}$$

• If X and Y are discrete random variables, f(x, y) is their joint probability density function, and g(X, Y) is a function of them, then

$$E[g(X, Y)] = \sum_{x} \sum_{y} g(x, y) f(x, y) \tag{2.5.2}$$

• If X and Y are discrete random variables and f(x, y) is their joint probability density function, then

$$\operatorname{cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = \sum_{x} \sum_{y} [x - E(X)][y - E(Y)] f(x, y) \tag{2.5.3}$$

• If X and Y are continuous random variables, then the definition of covariance is similar, with integrals replacing the summation signs as follows:

$$\operatorname{cov}(X, Y) = \int_{x} \int_{y} [x - E(X)][y - E(Y)] f(x, y)\,dx\,dy$$

• The sign of the covariance between two random variables indicates whether their association is positive (direct) or negative (inverse). The covariance between X and Y is the expected, or average, value of the random product [X – E(X)][Y – E(Y)]. If two random variables have positive covariance then they tend to be positively (or directly) related. See Figure 2.4. The values of two random variables with negative covariance tend to be negatively (or inversely) related. See Figure 2.5. Zero covariance implies that there is neither positive nor negative association between pairs of values. See Figure 2.6.
• The magnitude of covariance is difficult to interpret because it depends on the units of measurement of the random variables. The meaning of covariation is revealed more clearly if we divide the covariance between X and Y by their respective standard deviations. The resulting ratio is defined as the correlation between the random variables X and Y. If X and Y are random variables then their correlation is

$$\rho = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{2.5.4}$$

• If X and Y are independent random variables then the covariance and correlation between them are zero. The converse of this relationship is not true.
• Independent random variables X and Y have zero covariance, indicating that there is no linear association between them. However, just because the covariance or correlation between two random variables is zero does not mean that they are necessarily independent. Zero covariance means only that there is no linear association between the random variables. Even if X and Y have zero covariance, they might have a nonlinear association, like $X^2 + Y^2 = 1$.
• If a, b, c, and d are constants and X and Y are random variables, then

$$\operatorname{cov}(aX + bY, cX + dY) = ac\operatorname{var}(X) + bd\operatorname{var}(Y) + (ad + bc)\operatorname{cov}(X, Y)$$

Proof:

$$\begin{aligned}
\operatorname{cov}(aX + bY, cX + dY) &= E[((aX + bY) - E[aX + bY])((cX + dY) - E[cX + dY])] \\
&= E[(aX + bY - aE[X] - bE[Y])(cX + dY - cE[X] - dE[Y])] \\
&= E[(a(X - E[X]) + b(Y - E[Y]))(c(X - E[X]) + d(Y - E[Y]))] \\
&= E[ac(X - E[X])^2 + bd(Y - E[Y])^2 + (ad + bc)(X - E[X])(Y - E[Y])] \\
&= acE[(X - E[X])^2] + bdE[(Y - E[Y])^2] + (ad + bc)E[(X - E[X])(Y - E[Y])] \\
&= ac\operatorname{var}(X) + bd\operatorname{var}(Y) + (ad + bc)\operatorname{cov}(X, Y)
\end{aligned}$$

2.5.1 The Mean of a Weighted Sum of Random Variables

• Let the function g(X, Y) = aX + bY, where a and b are constants. This is called a weighted sum. Now use Equation (2.5.2) to find the expectation

$$E[aX + bY] = aE[X] + bE[Y] \tag{2.5.5}$$

This rule says that the expected value of a weighted sum of two random variables is the weighted sum of their expected values. This rule works for any number of random variables, whether they are discrete or continuous.
• If X and Y are random variables, then

$$E[X + Y] = E[X] + E[Y] \tag{2.5.6}$$

In general, the expected value of any sum is the sum of the expected values.

2.5.2 The Variance of a Weighted Sum of Random Variables

• If X, Y, and Z are random variables and a, b, and c are constants, then

$$\operatorname{var}[aX + bY + cZ] = a^2\operatorname{var}[X] + b^2\operatorname{var}[Y] + c^2\operatorname{var}[Z] + 2ab\operatorname{cov}[X, Y] + 2ac\operatorname{cov}[X, Z] + 2bc\operatorname{cov}[Y, Z] \tag{2.5.7}$$

Proof:

$$\begin{aligned}
\operatorname{var}[aX + bY + cZ] &= E[((aX + bY + cZ) - E[aX + bY + cZ])^2] \\
&= E[(a(X - E[X]) + b(Y - E[Y]) + c(Z - E[Z]))^2] \\
&= E[a^2(X - E[X])^2 + b^2(Y - E[Y])^2 + c^2(Z - E[Z])^2 \\
&\qquad + 2ab(X - E[X])(Y - E[Y]) + 2ac(X - E[X])(Z - E[Z]) + 2bc(Y - E[Y])(Z - E[Z])] \\
&= a^2E[(X - E[X])^2] + b^2E[(Y - E[Y])^2] + c^2E[(Z - E[Z])^2] \\
&\qquad + 2abE[(X - E[X])(Y - E[Y])] + 2acE[(X - E[X])(Z - E[Z])] + 2bcE[(Y - E[Y])(Z - E[Z])] \\
&= a^2\operatorname{var}(X) + b^2\operatorname{var}(Y) + c^2\operatorname{var}(Z) + 2ab\operatorname{cov}(X, Y) + 2ac\operatorname{cov}(X, Z) + 2bc\operatorname{cov}(Y, Z)
\end{aligned}$$

• If X, Y, and Z are independent, or more generally uncorrelated, random variables, then the covariance terms are zero and:

$$\operatorname{var}[aX + bY + cZ] = a^2\operatorname{var}[X] + b^2\operatorname{var}[Y] + c^2\operatorname{var}[Z] \tag{2.5.8}$$

• If X, Y, and Z are independent, or more generally uncorrelated, random variables, and if a = b = c = 1, then

$$\operatorname{var}[X + Y + Z] = \operatorname{var}[X] + \operatorname{var}[Y] + \operatorname{var}[Z] \tag{2.5.9}$$

• When the "variance of a sum is the sum of the variances," the random variables involved must be independent, or uncorrelated.

2.6 The Normal Distribution

• If X is a normally distributed random variable with mean β and variance σ², symbolized as X ~ N(β, σ²), then its probability density function is expressed mathematically as:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[\frac{-(x - \beta)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$
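As a rough numerical illustration of the normal density in Section 2.6, and of the idea from Section 2.2.2 that probability is the area under a density, the sketch below evaluates f(x) and approximates two areas with a simple trapezoid rule. The parameter values β = 1 and σ² = 4, the helper names `normal_pdf` and `area`, and the integration limits are assumptions chosen for this illustration; they do not come from the lecture.

```python
import math


def normal_pdf(x, beta, sigma2):
    """Density of N(beta, sigma2) evaluated at x (sigma2 is the variance)."""
    return math.exp(-((x - beta) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def area(f, a, b, n=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Illustrative (assumed) parameters: mean beta = 1, variance sigma2 = 4.
beta, sigma2 = 1.0, 4.0
sigma = math.sqrt(sigma2)


def pdf(x):
    return normal_pdf(x, beta, sigma2)


# Total area under the density; limits of +/- 10 standard deviations
# stand in for (-infinity, +infinity).
total_area = area(pdf, beta - 10 * sigma, beta + 10 * sigma)

# P[beta - sigma <= X <= beta + sigma], computed as an area under the density.
within_one_sd = area(pdf, beta - sigma, beta + sigma)

print("total area:", total_area)
print("P[within one standard deviation]:", within_one_sd)
```

The total area should come out very close to 1, as required of any probability density function. The second area can be cross-checked against the closed form `math.erf(1 / math.sqrt(2))`, a standard result for the normal distribution that is not derived in this chapter.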

Date posted: 02/03/2020, 14:03
