Spatial econometrics using microdata (GIS and territorial intelligence)


GIS AND TERRITORIAL INTELLIGENCE

Spatial Econometrics Using Microdata

Jean Dubé and Diègo Legros

Series Editor: Anne Ruas

This book puts special emphasis on spatial data compilation and the structuring of connections between the observations. Descriptive analysis methods of spatial data are presented in order to identify and measure global and local spatial autocorrelation. The authors then move on to incorporate this spatial component into spatial autoregressive models. These models allow us to control for spatial autocorrelation among the residuals of the linear statistical model, a problem that contravenes one of the basic hypotheses of the ordinary least squares approach.

The authors also address the application of spatial analysis methods in the context where spatial data are pooled over time (spatio-temporal data), focusing on recent developments in the field.

This book can be used as a reference by those studying towards a bachelor’s or master’s degree in regional science or economic geography who want to work with geolocalized (micro) data but do not have an advanced background in statistical theory.

Jean Dubé is Professor in regional development at Laval University, Canada. Diègo Legros is a lecturer in economics and management at the University of Burgundy, France.

www.iste.co.uk

To the memory of Gilles Dubé

For Mélanie, Karine, Philippe, Vincent and Mathieu

First published 2014 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in
writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.iste.co.uk
www.wiley.com

© ISTE Ltd 2014

The rights of Jean Dubé and Diègo Legros to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2014945534

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-468-2

Contents

Acknowledgments
Preface

Chapter 1. Econometrics and Spatial Dimensions
  1.1 Introduction
  1.2 The types of data
    1.2.1 Cross-sectional data
    1.2.2 Time series
    1.2.3 Spatio-temporal data
  1.3 Spatial econometrics
    1.3.1 A picture is worth a thousand words
    1.3.2 The structure of the databases of spatial microdata
  1.4 History of spatial econometrics
  1.5 Conclusion

Chapter 2. Structuring Spatial Relations
  2.1 Introduction
  2.2 The spatial representation of data
  2.3 The distance matrix
  2.4 Spatial weights matrices
    2.4.1 Connectivity relations
    2.4.2 Relations of inverse distance
    2.4.3 Relations based on the inverse (or negative) exponential
    2.4.4 Relations based on Gaussian transformation
    2.4.5 The other spatial relation
    2.4.6 One choice in particular?
    2.4.7 To start
  2.5 Standardization of the spatial weights matrix
  2.6 Some examples
  2.7 Advantages/disadvantages of micro-data
  2.8 Conclusion

Chapter 3. Spatial Autocorrelation
  3.1 Introduction
  3.2 Statistics of global spatial autocorrelation
    3.2.1 Moran’s I statistic
    3.2.2 Another way of testing significance
    3.2.3 Advantages of Moran’s I statistic in modeling
    3.2.4 Moran’s I for determining the optimal form of W
  3.3 Local spatial autocorrelation
    3.3.1 The LISA indices
  3.4 Some numerical examples of the detection tests
  3.5 Conclusion

Chapter 4. Spatial Econometric Models
  4.1 Introduction
  4.2 Linear regression models
    4.2.1 The different multiple linear regression model types
  4.3 Link between spatial and temporal models
    4.3.1 Temporal autoregressive models
    4.3.2 Spatial autoregressive models
  4.4 Spatial autocorrelation sources
    4.4.1 Spatial externalities
    4.4.2 Spillover effect
    4.4.3 Omission of variables or spatial heterogeneity
    4.4.4 Mixed effects
  4.5 Statistical tests
    4.5.1 LM tests in spatial econometrics
  4.6 Conclusion

Chapter 5. Spatio-temporal Modeling
  5.1 Introduction
  5.2 The impact of the two dimensions on the structure of the links: structuring of spatio-temporal links
  5.3 Spatial representation of spatio-temporal data
  5.4 Graphic representation of the spatial data generating processes pooled over time
  5.5 Impacts on the shape of the weights matrix
  5.6 The structuring of temporal links: a temporal weights matrix
  5.7 Creation of spatio-temporal weights matrices
  5.8 Applications of autocorrelation tests and of autoregressive models
  5.9 Some spatio-temporal applications
  5.10 Conclusion

Conclusion
Glossary
Appendix
Bibliography
Index

Appendix

For convenience, Nijkamp and Paelinck [NIJ 75] propose the calculation of the “spatial influence coefficient” that is
represented by the $\gamma$ index (equation [A.2]), which is unrelated to the definition seen previously in [3.38]:

$$\gamma = 1 - c \qquad \text{[A.2]}$$

This index behaves like a regular correlation coefficient, with one exception. In the absence of spatial autocorrelation, the value of the index is zero ($\gamma = 0$), while in the presence of positive spatial autocorrelation, the value of the index is positive ($\gamma > 0$) and reaches a maximum value of one ($\gamma = 1$). In the presence of negative spatial autocorrelation, the value of the index is negative, without necessarily having a lower limit. This is the main difference between the $\gamma$ index and the classical correlation coefficient, or even Moran’s $I$ index.

The expectation and the variance of Geary’s $c$ statistic are traditionally calculated under two distinct hypotheses [CLI 81]. The first assumes that the values $y_i$ taken by the random variable $y$ at the different observations come from $N$ independent draws from a normal population. The second presumes that the values $y_i$ are realizations of a random variable $y$ whose distribution is unknown. In this case, we must consider all of the $N!$
possible permutations of the values of the random variable $y$ over the territory considered, as each permutation is equally likely to occur.

Under the hypothesis of normality of the random variable $y$, the expectation and the variance of Geary’s $c$ statistic are respectively given by equations [A.3] and [A.4]:

$$E(c) = 1 \qquad \text{[A.3]}$$

$$Var(c) = \frac{(2S_1 + S_2)(N - 1) - 4S_0^2}{2(N + 1)S_0^2} \qquad \text{[A.4]}$$

Under the hypothesis of random distribution of the variable $y$, the variance of Geary’s $c$ statistic is given by a different expression (equation [A.5]):

$$Var(c) = \frac{A - B}{N(N - 2)(N - 3)S_0^2} + \frac{S_0^2\left[N^2 - 3 - (N - 1)^2 b_2\right]}{N(N - 2)(N - 3)S_0^2} \qquad \text{[A.5]}$$

where $A = (N - 1)S_1\left[N^2 - 3N + 3 - (N - 1)b_2\right]$, $B = \frac{1}{4}(N - 1)S_2\left[N^2 + 3N - 6 - (N^2 - N + 2)b_2\right]$ and $N$ is the total number of observations considered. The expressions $S_0$, $S_1$ and $S_2$ are respectively defined by the identities [3.12], [3.13] and [3.14], and the terms $w_{i\cdot}$, $w_{\cdot i}$ and $b_2$ by the quantities [3.15], [3.16] and [3.26].

A.2.2 Comparison: Geary’s c and Moran’s I

Starting from the previous theoretical presentations of Geary’s $c$ and Moran’s $I$ statistics, it is possible to highlight some commonalities between them. Table A.1 summarizes the values that these two autocorrelation statistics can take.

Table A.1

                                                            Geary (c)    Moran (I)
Positive spatial autocorrelation (concentrated, grouped)    0 < c < 1    I > 0
Negative spatial autocorrelation (dispersed, contrasted)    c > 1        I < 0

...

By applying the rules of matrix differentiation, we get the following expression:

$$\frac{\partial S(\beta)}{\partial \beta} = -2X'y + 2X'X\hat{\beta} = 0$$

Rewriting this expression gives:

$$X'X\hat{\beta} = X'y$$

This system is called the system of normal equations. It has a unique solution if the matrix $X'X$ is invertible, i.e. if it is of full rank, equal to $(K + 1)$. The matrix expression of the OLS estimator of $\beta$ in multiple linear regression is therefore:

$$\hat{\beta} = (X'X)^{-1}X'y \qquad \text{[A.10]}$$

The second-order derivative,

$$\left.\frac{\partial^2 S(\beta)}{\partial \beta\, \partial \beta'}\right|_{\beta = \hat{\beta}} = 2X'X$$

is a positive-definite matrix. The estimator $\hat{\beta}$ therefore does minimize $S(\beta)$ and is indeed the optimal solution.

A.3.3 Estimation by maximum likelihood (ML)

The use of the principle of maximum likelihood (ML) as a method of estimation of spatial models was first outlined in the work of Cliff and Ord [CLI 73] and Ord [ORD 75], who applied it to the spatial autoregressive model (SAR) and to the spatially autocorrelated error model (SEM). When it is applied to spatial models, the method of maximum likelihood requires that certain conditions be met, ensuring the consistency, the efficiency and the asymptotic normality of the estimators. These regularity conditions were defined by Heijmans and Magnus [HEI 86a, HEI 86b, HEI 86c] and Magnus [MAG 78]. They rely mainly on the existence of the log-likelihood function and on the continuity and differentiability of the elements of the score and of the associated Hessian matrix. For the most common spatial models, these conditions translate into restrictions on the spatial weights and on the parameter space of the spatial autoregressive coefficients [ANS 88].

A.3.3.1 Estimation of the SAR by maximum likelihood

The structural form of the spatial autoregressive model (SAR) is written:

$$y = \rho Wy + X\beta + \varepsilon \qquad \text{[A.11]}$$

The SAR model can also be presented in its reduced form:

$$y = (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}\varepsilon \qquad \text{[A.12]}$$

The disturbances $\varepsilon$ are assumed to be independent and normally distributed:

$$\varepsilon \sim N(0, \sigma^2 I) \qquad \text{[A.13]}$$

Expressing $\varepsilon$ as a function of the other quantities gives:

$$\varepsilon = (I - \rho W)y - X\beta \qquad \text{[A.14]}$$

The Jacobian of the transformation from $\varepsilon$ to $y$ is:

$$J = \left|\frac{\partial \varepsilon}{\partial y}\right| = |I - \rho W| \qquad \text{[A.15]}$$

The log-likelihood function of the SAR model is written:

$$\ln L(\beta, \rho, \sigma^2 \mid y, X) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) + \ln|I - \rho W| - \frac{\varepsilon'\varepsilon}{2\sigma^2} \qquad \text{[A.16]}$$

By replacing the vector of disturbances $\varepsilon$ by its expression in equation [A.14], the log-likelihood function becomes:

$$\ln L(\beta, \rho, \sigma^2 \mid y, X) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) + \ln|I - \rho W| - \frac{(y - \rho Wy - X\beta)'(y - \rho Wy - X\beta)}{2\sigma^2} \qquad \text{[A.17]}$$

If the regularity conditions are satisfied ([HEI 86a, HEI 86b, HEI 86c] and [MAG 78]), the usual asymptotic properties of the maximum likelihood estimators hold. For this, the Jacobian $|I - \rho W|$ must be positive, which, for a row-standardized $W$, is guaranteed when the parameter $\rho$ lies in the interval $(-1, 1)$.

To obtain the maximum likelihood estimators of the parameters $\beta$, $\sigma^2$ and $\rho$, the log-likelihood function is maximized with respect to these parameters. The first-order conditions require that the partial derivatives of the log-likelihood with respect to each of the parameters $\beta$, $\sigma^2$ and $\rho$ be equal to zero:

$$\frac{\partial \ln L(\beta, \rho, \sigma^2 \mid y, X)}{\partial \beta} = \frac{X'(y - \rho Wy - X\beta)}{\sigma^2} = 0 \qquad \text{[A.18]}$$

$$\frac{\partial \ln L(\beta, \rho, \sigma^2 \mid y, X)}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{\varepsilon'\varepsilon}{2\sigma^4} = 0 \qquad \text{[A.19]}$$

$$\frac{\partial \ln L(\beta, \rho, \sigma^2 \mid y, X)}{\partial \rho} = \frac{(Wy)'\varepsilon}{\sigma^2} - \mathrm{trace}\left[W(I - \rho W)^{-1}\right] = 0 \qquad \text{[A.20]}$$

Condition [A.18] allows us to express $\beta$ as a function of $y$, $X$, $W$ and $\rho$. The maximum likelihood estimator of the parameter $\beta$, written $\hat{\beta}_{ML}$, is:

$$\hat{\beta}_{ML} = (X'X)^{-1}X'(I - \rho W)y \qquad \text{[A.21]}$$

In the same manner, the maximum likelihood estimator of $\sigma^2$, written $\hat{\sigma}^2_{ML}$, is:

$$\hat{\sigma}^2_{ML} = \frac{\hat{\varepsilon}_{ML}'\hat{\varepsilon}_{ML}}{N} \qquad \text{[A.22]}$$

with $\hat{\varepsilon}_{ML} = y - \rho Wy - X\hat{\beta}_{ML}$ the residuals obtained by replacing the parameter $\beta$ by its maximum likelihood estimator $\hat{\beta}_{ML}$. These estimators are the ones obtained by applying least squares to the filtered model $\tilde{y} = X\beta + \varepsilon$ with $\tilde{y} = (I - \rho W)y$.

Let $\hat{\beta}_0 = (X'X)^{-1}X'y$ be the ordinary least squares estimator from the regression of $y$ on $X$, with residuals $\hat{\varepsilon}_0 = y - X\hat{\beta}_0$, and let $\hat{\beta}_1 = (X'X)^{-1}X'Wy$ be the ordinary least squares estimator from the regression of $Wy$ on $X$, with residuals $\hat{\varepsilon}_1 = Wy - X\hat{\beta}_1$. The maximum likelihood estimator of $\beta$ is then given by:

$$\hat{\beta}_{ML} = \hat{\beta}_0 - \rho\hat{\beta}_1 \qquad \text{[A.23]}$$

The vector of residuals obtained by the method of maximum likelihood, written $\hat{\varepsilon}_{ML}$, can be decomposed into two parts:

$$\hat{\varepsilon}_{ML} = y - \rho Wy - X\hat{\beta}_{ML} = \hat{\varepsilon}_0 - \rho\hat{\varepsilon}_1$$

with $\hat{\varepsilon}_0 = y - X\hat{\beta}_0$ and $\hat{\varepsilon}_1 = Wy - X\hat{\beta}_1$. Thus, according to equation [A.22], the variance of
the disturbances, $\sigma^2$, can be written:

$$\hat{\sigma}^2_{ML} = \frac{(\hat{\varepsilon}_0 - \rho\hat{\varepsilon}_1)'(\hat{\varepsilon}_0 - \rho\hat{\varepsilon}_1)}{N} \qquad \text{[A.24]}$$

By substituting $\beta$ and $\sigma^2$ by their estimators given in equations [A.23] and [A.24] in the log-likelihood function [A.16], we obtain the log-likelihood function concentrated with respect to the parameters $\beta$ and $\sigma^2$, which is a function of the parameter $\rho$ only:

$$\ln L_c(\rho) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\left[\frac{(\hat{\varepsilon}_0 - \rho\hat{\varepsilon}_1)'(\hat{\varepsilon}_0 - \rho\hat{\varepsilon}_1)}{N}\right] + \ln|I - \rho W| \qquad \text{[A.25]}$$

Additive constants that do not depend on $\rho$ play no role in the maximization.

In the end, the estimation process can be summarized in four steps:

– regression of $y$ on $X$ to obtain $\hat{\beta}_0 = (X'X)^{-1}X'y$ and the residuals $\hat{\varepsilon}_0 = y - X\hat{\beta}_0$;

– regression of $Wy$ on $X$ to obtain $\hat{\beta}_1 = (X'X)^{-1}X'Wy$ and the residuals $\hat{\varepsilon}_1 = Wy - X\hat{\beta}_1$;

– maximization of the concentrated log-likelihood function $\ln L_c(\rho)$, given $\hat{\varepsilon}_0$ and $\hat{\varepsilon}_1$, to obtain the maximum likelihood estimator, written $\hat{\rho}_{ML}$, of the parameter $\rho$ (because the concentrated likelihood is nonlinear in $\rho$, numerical optimization procedures are needed to estimate this parameter);

– estimation by maximum likelihood of the parameters $\beta$ and $\sigma^2$:

$$\hat{\beta}_{ML} = \hat{\beta}_0 - \hat{\rho}_{ML}\hat{\beta}_1 \qquad \text{[A.26]}$$

$$\hat{\sigma}^2_{ML} = \frac{1}{N}(\hat{\varepsilon}_0 - \hat{\rho}_{ML}\hat{\varepsilon}_1)'(\hat{\varepsilon}_0 - \hat{\rho}_{ML}\hat{\varepsilon}_1) \qquad \text{[A.27]}$$

A.3.3.2 First order conditions for the estimation of the SEM by maximum likelihood

The spatial error model (SEM) is given by the following system of equations:

$$y = X\beta + \eta \qquad \text{[A.28]}$$

$$\eta = \lambda W\eta + \varepsilon \qquad \text{[A.29]}$$

The disturbances $\varepsilon$ are assumed to be independent and normally distributed:

$$\varepsilon \sim N(0, \sigma^2 I) \qquad \text{[A.30]}$$

The reduced form of the SEM model is:

$$y = X\beta + (I - \lambda W)^{-1}\varepsilon \qquad \text{[A.31]}$$

From equation [A.31], we obtain the expression of $\varepsilon$:

$$\varepsilon = (I - \lambda W)(y - X\beta) \qquad \text{[A.32]}$$

The Jacobian of the transformation from $\varepsilon$ to $y$ is:

$$J = \left|\frac{\partial \varepsilon}{\partial y}\right| = |I - \lambda W| \qquad \text{[A.33]}$$

The log-likelihood function is written:

$$\ln L(\beta, \lambda, \sigma^2 \mid y, X) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\sigma^2 + \ln|I - \lambda W| - \frac{\varepsilon'\varepsilon}{2\sigma^2} \qquad \text{[A.34]}$$

By replacing the vector of disturbances $\varepsilon$ by its expression in equation [A.32], the log-likelihood function is written:

$$\ln L(\beta, \lambda, \sigma^2 \mid y, X) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\sigma^2 + \ln|I - \lambda W| - \frac{1}{2\sigma^2}\left[(I - \lambda W)(y - X\beta)\right]'\left[(I - \lambda W)(y - X\beta)\right] \qquad \text{[A.35]}$$

The presence of the spatial filter $(I - \lambda W)$ in the likelihood function implies that the maximum likelihood estimator of the parameter $\beta$ is not equal to the ordinary least squares estimator. The two estimators coincide when $\lambda$ tends towards zero. As in the SAR model, the asymptotic properties hold if some regularity conditions are met, such as the positivity of the Jacobian:

$$J = |I - \lambda W| > 0 \qquad \text{[A.36]}$$

For a row-standardized weights matrix $W$, this holds when the autoregressive parameter lies in the interval $(-1, 1)$.

The first-order conditions for the maximum likelihood estimators of the parameters $\beta$, $\lambda$ and $\sigma^2$ require that each partial derivative of the log-likelihood be zero:

$$\frac{\partial \ln L(\beta, \lambda, \sigma^2)}{\partial \beta} = 0 \qquad \text{[A.37]}$$

$$\frac{\partial \ln L(\beta, \lambda, \sigma^2)}{\partial \sigma^2} = 0 \qquad \text{[A.38]}$$

$$\frac{\partial \ln L(\beta, \lambda, \sigma^2)}{\partial \lambda} = 0 \qquad \text{[A.39]}$$

Since the parameter $\beta$ only appears in the term $\varepsilon'\varepsilon = \left[(I - \lambda W)(y - X\beta)\right]'\left[(I - \lambda W)(y - X\beta)\right]$, the partial derivative of the log-likelihood function with respect to $\beta$ is proportional to the derivative of the quantity $\varepsilon'\varepsilon$ with respect to $\beta$. This quantity can be written as follows:

$$\varepsilon'\varepsilon = y'(I - \lambda W)'(I - \lambda W)y - 2y'(I - \lambda W)'(I - \lambda W)X\beta + \beta'X'(I - \lambda W)'(I - \lambda W)X\beta$$

The first-order condition for the parameter $\beta$ is:

$$\frac{\partial \ln L(\beta, \lambda, \sigma^2)}{\partial \beta} = \frac{1}{\sigma^2}\left[X'(I - \lambda W)'(I - \lambda W)y - X'(I - \lambda W)'(I - \lambda W)X\beta\right] = 0 \qquad \text{[A.40]}$$

The first-order condition [A.37] therefore implies:

$$X'(I - \lambda W)'(I - \lambda W)X\beta = X'(I - \lambda W)'(I - \lambda W)y \qquad \text{[A.41]}$$

By solving equation [A.41], we obtain the maximum likelihood estimator:

$$\hat{\beta}_{ML} = \left[X'(I - \lambda W)'(I - \lambda W)X\right]^{-1}X'(I - \lambda W)'(I - \lambda W)y \qquad \text{[A.42]}$$

which is equivalent to the generalized least squares estimator ($\hat{\beta}_{ML} = \hat{\beta}_{GLS}$). We can see this as the least squares estimator from a regression of $\tilde{y}$ on $\tilde{X}$ where:

$$\tilde{y} = (I - \lambda W)y \qquad \text{[A.43]}$$

and:

$$\tilde{X} = (I - \lambda W)X \qquad \text{[A.44]}$$

The maximum likelihood estimator $\hat{\beta}_{ML}$ can thus be obtained by carrying out a regression by ordinary least squares after having transformed the variables according to equations [A.43] and [A.44].

The partial derivative of the log-likelihood function with respect to the parameter $\sigma^2$ is written:

$$\frac{\partial \ln L(\beta, \lambda, \sigma^2)}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{\varepsilon'\varepsilon}{2\sigma^4} \qquad \text{[A.45]}$$

This derivative has to be equal to zero to determine the maximum likelihood estimator of the parameter $\sigma^2$:

$$-\frac{N}{2\sigma^2} + \frac{\varepsilon'\varepsilon}{2\sigma^4} = 0 \qquad \text{[A.46]}$$

By simplifying equation [A.46], we have:

$$\sigma^2 = \frac{\varepsilon'\varepsilon}{N}$$

By using the expression of $\varepsilon$ in equation [A.32], the maximum likelihood estimator of $\sigma^2$ can be calculated from the maximum likelihood estimator $\hat{\beta}_{ML}$ of $\beta$ in equation [A.42]:

$$\hat{\sigma}^2_{ML} = \frac{(y - X\hat{\beta}_{ML})'(I - \lambda W)'(I - \lambda W)(y - X\hat{\beta}_{ML})}{N} \qquad \text{[A.47]}$$

where the vector of residuals is defined by:

$$\hat{\eta} = y - X\hat{\beta}_{ML} \qquad \text{[A.48]}$$

$$\hat{\sigma}^2_{ML} = \frac{\hat{\eta}'(I - \lambda W)'(I - \lambda W)\hat{\eta}}{N} \qquad \text{[A.49]}$$

Note that the two maximum likelihood estimators, $\hat{\beta}_{ML}$ and $\hat{\sigma}^2_{ML}$, depend on the parameter $\lambda$. A maximum likelihood estimator for the parameter $\lambda$ can be obtained by maximizing, with respect to $\lambda$, the concentrated log-likelihood function, which is obtained by inserting the maximum likelihood estimators $\hat{\beta}_{ML}$ and $\hat{\sigma}^2_{ML}$ in the log-likelihood function:

$$\ln L_c(\lambda) = -\frac{N}{2} - \frac{N}{2}\ln\left[\frac{\hat{\eta}'(I - \lambda W)'(I - \lambda W)\hat{\eta}}{N}\right] + \ln|I - \lambda W| \qquad \text{[A.50]}$$

Additive constants that do not depend on $\lambda$ play no role in the maximization. The concentrated log-likelihood function is a nonlinear function of $\lambda$ for which there is no analytical solution. Thus an estimator of the ...
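The maximization of the concentrated log-likelihood of the SEM over $\lambda$ can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, simulated data on a hypothetical ring-shaped neighbourhood structure, and a plain grid search standing in for a proper numerical optimizer. All function names (`ring_weights`, `simulate_sem`, `sem_concentrated_loglik`, `fit_sem_grid`) are illustrative, and the constant $-\frac{N}{2}\ln(2\pi)$ is kept so that the returned value is the full log-likelihood.

```python
import numpy as np


def ring_weights(n):
    """Row-standardized W where each unit has two neighbours on a circle (assumption)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W


def simulate_sem(n, beta, lam, seed=0):
    """Simulate y = X beta + eta with eta = lam * W eta + eps and eps ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    W = ring_weights(n)
    X = np.column_stack([np.ones(n), rng.standard_normal(n)])
    eps = rng.standard_normal(n)
    eta = np.linalg.solve(np.eye(n) - lam * W, eps)  # reduced form, as in [A.31]
    return X @ np.asarray(beta) + eta, X, W


def sem_concentrated_loglik(lam, y, X, W):
    """Return (log-likelihood, beta_hat, sigma2_hat) for a fixed value of lambda."""
    n = y.shape[0]
    A = np.eye(n) - lam * W                          # spatial filter (I - lambda W)
    y_f, X_f = A @ y, A @ X                          # transformed variables [A.43]-[A.44]
    beta, *_ = np.linalg.lstsq(X_f, y_f, rcond=None)  # OLS on filtered data, i.e. [A.42]
    resid = y_f - X_f @ beta                         # (I - lambda W)(y - X beta_hat)
    sigma2 = resid @ resid / n                       # variance estimator [A.49]
    sign, logdet = np.linalg.slogdet(A)              # stable ln|I - lambda W|
    if sign <= 0:                                    # Jacobian must be positive [A.36]
        return -np.inf, beta, sigma2
    loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1.0) + logdet
    return loglik, beta, sigma2


def fit_sem_grid(y, X, W, grid=None):
    """Pick the lambda on a grid inside (-1, 1) that maximizes the concentrated log-likelihood."""
    if grid is None:
        grid = np.linspace(-0.98, 0.98, 99)
    best = None
    for lam in grid:
        loglik, beta, sigma2 = sem_concentrated_loglik(lam, y, X, W)
        if best is None or loglik > best["loglik"]:
            best = {"lambda": float(lam), "beta": beta, "sigma2": sigma2, "loglik": loglik}
    return best


if __name__ == "__main__":
    y, X, W = simulate_sem(300, [1.0, 2.0], 0.6)
    print(fit_sem_grid(y, X, W))
```

A grid search is used here only because it is easy to read. In practice a one-dimensional optimizer on the same objective would be the more usual choice, and for large $N$ the dense determinant would typically be replaced by an eigenvalue-based or sparse computation.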

Date posted: 03/01/2020, 15:45


Table of contents

  • Cover Page

  • Half-Title Page

  • Title Page

  • Copyright Page

  • Contents

  • Acknowledgements

  • Preface

    • P.1. Introduction

    • P.2. Who is this work aimed at?

    • P.3. Structure of the book

  • 1: Econometrics and Spatial Dimensions

    • 1.1. Introduction

    • 1.2. The types of data

      • 1.2.1. Cross-sectional data

      • 1.2.2. Time series

      • 1.2.3. Spatio-temporal data

    • 1.3. Spatial econometrics

      • 1.3.1. A picture is worth a thousand words

      • 1.3.2. The structure of the databases of spatial microdata

    • 1.4. History of spatial econometrics

    • 1.5. Conclusion

  • 2: Structuring Spatial Relations

    • 2.1. Introduction

    • 2.2. The spatial representation of data

    • 2.3. The distance matrix
