# Economics, statistics, R, Stata

This question asks you to estimate the causal effect of tourism on local household incomes in Mexico. To answer it, we will use Stata/R and the dataset “[login to view URL]”, which you can download from BCourses. This dataset contains 1153 Mexican municipalities that reported some amount of local tourism activity (measured by local hotel sales) in the year 2000. Write up the answers to a)-g) below in the same document you used for Questions 1 and 2 above. In addition, attach the do file that you used to answer the following questions.

a) Open the dataset in Stata/R. Display and export a table that lists the number of observations, the mean, the standard deviation, the minimum value and the maximum value for each of the variables in the dataset (edit the table and include it in your written-up answer). Briefly describe what we learn from the table about the sample of Mexican municipalities.
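
A minimal sketch in R of such a summary table. The course dataset and its variable names are not given here, so the block uses synthetic data and hypothetical names (`log_hotel_sales`, `log_income`, `dist_border_km`); only the sample size of 1153 comes from the text above.

```r
# Synthetic stand-in for the course dataset; all variable names are hypothetical.
set.seed(1)
n <- 1153  # number of municipalities stated in the assignment
dat <- data.frame(
  log_hotel_sales = rnorm(n, mean = 8, sd = 2),
  log_income      = rnorm(n, mean = 7, sd = 0.5),
  dist_border_km  = runif(n, min = 0, max = 1500)
)

# Return N, mean, SD, min and max for one variable, ignoring missing values.
summarise_var <- function(x) {
  c(N    = sum(!is.na(x)),
    Mean = mean(x, na.rm = TRUE),
    SD   = sd(x, na.rm = TRUE),
    Min  = min(x, na.rm = TRUE),
    Max  = max(x, na.rm = TRUE))
}

# sapply() gives one column per variable; t() turns that into one row per variable.
summary_table <- t(sapply(dat, summarise_var))
print(round(summary_table, 3))

# Export to CSV so the table can be edited before it goes into the write-up.
out_file <- file.path(tempdir(), "summary_table.csv")
write.csv(summary_table, out_file)
```

With the real data, `dat` would instead be read from the downloaded file, and the same function would apply to every numeric column.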

b) Use the data to obtain an OLS point estimate of the effect of local tourism activity (measured by the logarithm of local hotel sales) on the logarithm of local average monthly household incomes. Export your result in a regression table (that you can edit and include in your written-up answer), and comment on the interpretation and statistical significance of your result.
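
The OLS step could be sketched in R as follows. The data are simulated and the variable names are hypothetical, so the coefficients say nothing about the actual Mexican sample; the true slope of 0.1 is simply a value chosen for the simulation.

```r
# Synthetic data with hypothetical variable names; not the course dataset.
set.seed(2)
n <- 1153
dist_border_km  <- runif(n, min = 0, max = 1500)
log_hotel_sales <- 10 - 0.002 * dist_border_km + rnorm(n)
log_income      <- 6 + 0.1 * log_hotel_sales + rnorm(n, sd = 0.5)
dat <- data.frame(log_income, log_hotel_sales, dist_border_km)

# OLS of log income on log hotel sales.
ols <- lm(log_income ~ log_hotel_sales, data = dat)

# Coefficient table: estimate, standard error, t value and p-value.
ols_table <- coef(summary(ols))
print(ols_table)

# Export the table so it can be edited for the write-up.
ols_file <- file.path(tempdir(), "ols_table.csv")
write.csv(ols_table, ols_file)
```

Because both variables are in logs, the slope reads as an elasticity: a 1% increase in hotel sales is associated with a change of roughly (slope)% in average monthly household income.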

c) List three plausible arguments why the point estimate in b) could be biased upwards or downwards relative to the true causal effect of local tourism activity on monthly household incomes.

d) Now your GSI suggests that the distance in kilometers between the center of the municipality and the nearest segment of the US-Mexico border could be a valid instrumental variable for your measure of local tourism activity. List the assumptions that need to hold true for this to be correct.
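
In a linear model with a constant effect, these conditions are often written as below. The notation is introduced here for illustration and is not from the assignment: $Y_i$ is log average monthly household income, $T_i$ is log hotel sales, $Z_i$ is the distance to the border in kilometers, and $\varepsilon_i$ collects all other determinants of income.

$$
Y_i = \alpha + \beta T_i + \varepsilon_i, \qquad
\underbrace{\operatorname{Cov}(Z_i, T_i) \neq 0}_{\text{relevance}}, \qquad
\underbrace{\operatorname{Cov}(Z_i, \varepsilon_i) = 0}_{\text{exogeneity / exclusion}}
$$

Taking the covariance of both sides of the first equation with $Z_i$ and applying the two conditions gives the IV estimand:

$$
\operatorname{Cov}(Z_i, Y_i) = \beta \operatorname{Cov}(Z_i, T_i) + \operatorname{Cov}(Z_i, \varepsilon_i)
\quad\Longrightarrow\quad
\beta_{IV} = \frac{\operatorname{Cov}(Z_i, Y_i)}{\operatorname{Cov}(Z_i, T_i)}
$$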

e) Verify whether the assumption of instrument relevance is satisfied, and export the results into the same regression table that you used before. Comment on the interpretation and statistical significance of your result.
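
One common way to check relevance is a first-stage regression of the tourism measure on the instrument. The sketch below uses simulated data and hypothetical variable names; the threshold of 10 for the first-stage F statistic is a widely used rule of thumb, not something the assignment specifies.

```r
# Synthetic data with hypothetical variable names; not the course dataset.
set.seed(3)
n <- 1153
dist_border_km  <- runif(n, min = 0, max = 1500)
log_hotel_sales <- 10 - 0.002 * dist_border_km + rnorm(n)
dat <- data.frame(log_hotel_sales, dist_border_km)

# First stage: regress the endogenous regressor on the instrument.
first_stage <- lm(log_hotel_sales ~ dist_border_km, data = dat)
fs_table <- coef(summary(first_stage))
print(fs_table)

# With a single instrument and no other regressors, the regression F statistic
# equals the squared t statistic on the instrument.
f_stat <- unname(summary(first_stage)$fstatistic["value"])
cat("First-stage F statistic:", round(f_stat, 1), "\n")
cat("Above the rule-of-thumb value of 10:", f_stat > 10, "\n")
```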

f) Now use Stata/R to estimate the second-stage IV point estimate as suggested by the GSI, and export your result in the same regression table you used before. Comment on the interpretation and statistical significance of your result. Is the difference between the OLS and IV point estimates as you expected, or not?
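
The IV point estimate can be sketched in base R by running the two stages by hand. Everything here is an assumption for illustration: the data are simulated with an unobserved confounder `u`, the variable names are hypothetical, and the true effect of 0.1 is chosen for the simulation.

```r
# Synthetic data with hypothetical variable names; not the course dataset.
set.seed(4)
n <- 1153
dist_border_km  <- runif(n, min = 0, max = 1500)
u               <- rnorm(n)  # unobserved confounder that raises tourism and income
log_hotel_sales <- 10 - 0.002 * dist_border_km + u + rnorm(n)
log_income      <- 6 + 0.1 * log_hotel_sales + 0.5 * u + rnorm(n, sd = 0.5)
dat <- data.frame(log_income, log_hotel_sales, dist_border_km)

# OLS for comparison; it is biased upwards here because u is omitted.
ols <- lm(log_income ~ log_hotel_sales, data = dat)

# Stage 1: predict the endogenous regressor from the instrument.
first_stage <- lm(log_hotel_sales ~ dist_border_km, data = dat)
dat$hotel_hat <- fitted(first_stage)

# Stage 2: regress the outcome on the predicted values.
second_stage <- lm(log_income ~ hotel_hat, data = dat)
iv_2sls <- unname(coef(second_stage)["hotel_hat"])

# With one instrument, the same estimate is a ratio of covariances.
iv_ratio <- cov(dat$dist_border_km, dat$log_income) /
  cov(dat$dist_border_km, dat$log_hotel_sales)

print(c(OLS = unname(coef(ols)["log_hotel_sales"]),
        IV_2SLS = iv_2sls,
        IV_ratio = iv_ratio))
```

The manual second stage reproduces the 2SLS point estimate, but its reported standard errors are not valid because they ignore that `hotel_hat` is itself estimated. For inference, a dedicated routine such as `ivreg()` from the AER package in R or `ivregress 2sls` in Stata is the usual choice.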

g) Now one of your friends suggests that the distance to the US border is likely correlated with other local characteristics that affect local incomes, such as the logarithm of the average temperature, the logarithm of the average precipitation, the average years of education and the proportion of indigenous population. Propose a way to verify whether these concerns are relevant and export your regression results in the same table as before. Comment on the results and what they imply about the validity of the instrumental variable strategy.
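
One possible check, among several, is to regress each of these characteristics on the instrument and see whether distance to the border predicts them. The sketch below uses simulated data in which only the temperature variable is built to correlate with distance; all variable names are hypothetical.

```r
# Synthetic data with hypothetical variable names; not the course dataset.
set.seed(5)
n <- 1153
dist_border_km <- runif(n, min = 0, max = 1500)
dat <- data.frame(
  dist_border_km   = dist_border_km,
  # Built to correlate with the instrument, purely for illustration.
  log_temp         = 3 + 0.0003 * dist_border_km + rnorm(n, sd = 0.2),
  log_precip       = rnorm(n, mean = 6.5, sd = 0.5),
  years_educ       = rnorm(n, mean = 6, sd = 1.5),
  share_indigenous = runif(n)
)

covariates <- c("log_temp", "log_precip", "years_educ", "share_indigenous")

# For each covariate, regress it on the instrument and keep the slope,
# its standard error and its p-value.
balance_table <- t(sapply(covariates, function(v) {
  fit <- lm(reformulate("dist_border_km", response = v), data = dat)
  coef(summary(fit))["dist_border_km", c("Estimate", "Std. Error", "Pr(>|t|)")]
}))
print(signif(balance_table, 3))

# Export alongside the other regression results.
balance_file <- file.path(tempdir(), "balance_table.csv")
write.csv(balance_table, balance_file)
```

A complementary approach is to add the same characteristics as controls in both stages and see how much the first-stage coefficient and the IV estimate move.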

