Introduction to Python for Econometrics, Statistics and Data Analysis



Document information

Introduction to Python for Econometrics, Statistics and Data Analysis
Kevin Sheppard
University of Oxford
Tuesday 5th August, 2014
©2012, 2013, 2014 Kevin Sheppard

Changes since the Second Edition

Version 2.2.1 (August 2014)
• Fixed typos reported by a reader (thanks to Ilya Sorvachev).

Version 2.2 (July 2014)
• Code verified against Anaconda 2.0.1.
• Added diagnostic tools and a simple method to use external code in the Cython section.
• Updated the Numba section to reflect recent changes.
• Fixed some typos in the chapter on Performance and Optimization.
• Added examples of joblib and IPython's cluster to the chapter on running code in parallel.

Version 2.1 (February 2014)
• New chapter introducing object oriented programming as a method to provide structure and organization to related code.
• Added seaborn to the recommended package list, and have included it by default in the graphics chapter.
• Based on experience teaching Python to economics students, the recommended installation has been simplified by removing the suggestion to use virtual environments. The discussion of virtual environments has been moved to the appendix.
• Rewrote parts of the pandas chapter.
• Code verified against Anaconda 1.9.1.

Version 2.02 (November 2013)
• Changed the Anaconda install to use both create and install, which shows how to install additional packages.
• Fixed some missing packages in the direct install.
• Changed the configuration of IPython to reflect best practices.
• Added subsection covering IPython profiles.

Version 2.01 (October 2013)
• Updated Anaconda to 1.8 and added some additional packages to the installation for Spyder.
• Small section about Spyder as a good starting IDE.

Notes to the 2nd Edition

This edition includes the following changes from the first edition (March 2012):
• The preferred installation method is now Continuum Analytics' Anaconda. Anaconda is a complete scientific stack and is available for all major platforms.
• New chapter on pandas. pandas provides a simple but powerful tool to manage data and perform basic analysis. It also greatly simplifies importing and exporting data.
• New chapter on advanced selection of elements from an array.
• Numba provides just-in-time compilation for numeric Python code, which often produces large performance gains when pure NumPy solutions are not available (e.g. looping code).
• Dictionary, set and tuple comprehensions.
• Numerous typos fixed.
• All code has been verified working against Anaconda 1.7.0.

Contents

1 Introduction
  1.1 Background
  1.2 Conventions
  1.3 Important Components of the Python Scientific Stack
  1.4 Setup
  1.5 Using Python
  1.6 Exercises
  1.A Frequently Encountered Problems
  1.B register_python.py
  1.C Advanced Setup
2 Python 2.7 vs. 3 (and the rest)
  2.1 Python 2.7 vs. 3
  2.2 Intel Math Kernel Library and AMD Core Math Library
  2.3 Other Variants
  2.A Relevant Differences between Python 2.7 and 3
3 Built-in Data Types
  3.1 Variable Names
  3.2 Core Native Data Types
  3.3 Python and Memory Management
  3.4 Exercises
4 Arrays and Matrices
  4.1 Array
  4.2 Matrix
  4.3 1-dimensional Arrays
  4.4 2-dimensional Arrays
  4.5 Multidimensional Arrays
  4.6 Concatenation
  4.7 Accessing Elements of an Array
  4.8 Slicing and Memory Management
  4.9 import and Modules
  4.10 Calling Functions
  4.11 Exercises
5 Basic Math
  5.1 Operators
  5.2 Broadcasting
  5.3 Array and Matrix Addition (+) and Subtraction (-)
  5.4 Array Multiplication (*)
  5.5 Matrix Multiplication (*)
  5.6 Array and Matrix Division (/)
  5.7 Array Exponentiation (**)
  5.8 Matrix Exponentiation (**)
  5.9 Parentheses
  5.10 Transpose
  5.11 Operator Precedence
  5.12 Exercises
6 Basic Functions and Numerical Indexing
  6.1 Generating Arrays and Matrices
  6.2 Rounding
  6.3 Mathematics
  6.4 Complex Values
  6.5 Set Functions
  6.6 Sorting and Extreme Values
  6.7 Nan Functions
  6.8 Functions and Methods/Properties
  6.9 Exercises
7 Special Arrays
  7.1 Exercises
8 Array and Matrix Functions
  8.1 Views
  8.2 Shape Information and Transformation
  8.3 Linear Algebra Functions
  8.4 Exercises
9 Importing and Exporting Data
  9.1 Importing Data using pandas
  9.2 Importing Data without pandas
  9.3 Saving or Exporting Data using pandas
  9.4 Saving or Exporting Data without pandas
  9.5 Exercises
10 Inf, NaN and Numeric Limits
  10.1 inf and NaN
  10.2 Floating point precision
  10.3 Exercises
11 Logical Operators and Find
  11.1 >, >=, …

[…]

%timeit

    >>> %timeit x=randn(100,100);dot(x.T,x)
    1000 loops, best of 3: 758 us per loop

%who
%who lists all variables in memory.

%who_ls
%who_ls returns a sorted list containing the names of all variables in memory.

%whos
%whos provides a detailed list of all variables in memory.

%xdel
%xdel variable deletes the variable from memory.
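
The %timeit example in the excerpt only runs inside IPython. A rough equivalent in a plain Python script can be sketched with the standard-library timeit module, as below. This is an illustration under stated assumptions rather than the book's own code: it assumes NumPy is installed, the loop count of 100 is an arbitrary choice (%timeit selects its own), and the names setup, stmt, loops, times and best_us are introduced here only for the example.

```python
import timeit

# Import the two names used in the excerpt's statement (assumes NumPy is installed).
setup = "from numpy import dot; from numpy.random import randn"

# Same statement as the %timeit example: draw a 100x100 normal matrix, form x'x.
stmt = "x = randn(100, 100); dot(x.T, x)"

# Arbitrary loop count chosen for this sketch; %timeit picks one automatically.
loops = 100

# Run three independent timings of `loops` executions each.
times = timeit.repeat(stmt, setup=setup, repeat=3, number=loops)

# Report the fastest run per loop, in microseconds, in a %timeit-like format.
best_us = min(times) / loops * 1e6
print(f"{loops} loops, best of 3: {best_us:.0f} us per loop")
```

The printed figure depends on the machine and the NumPy build, so it will generally differ from the 758 us shown in the excerpt.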

Date posted: 01/06/2018, 15:07


