4 Problem Sets

4.1 Introduction

You are a team assigned to conduct research on one of the following topics. Each group will be randomly assigned a specific topic from the list below. Your tasks are divided into two Problem Sets:

  1. Problem Set 1:
    • Data Management

    • Visualization

    • Descriptive Statistics

    • Conceptual Endogeneity

  2. Problem Set 2:
    • Formulate Testable Hypotheses

    • OLS Estimation

    • Diagnostics

    • Sensitivity/Robustness Checks

Important: All datasets must come from PSA OpenSTAT or World Bank Open Data.

If you want to use other datasets, you must inform the lecturer by Week 2.

Kaggle datasets and other pre-cleaned datasets are not allowed.

4.2 Assigned Topics List (General and Specific)

  1. Trade and Economic Outcomes

    a. Export volume and regional economic growth

    b. Imports, domestic production, and household income

    c. Trade openness and employment in key sectors

  2. Agriculture and Productivity

    a. Crop yield differences by farm size

    b. Fertilizer input and productivity

    c. Regional specialization in agriculture

  3. Money, Banking and Household Finance

    a. Household access to banking and income

    b. Regional credit availability and small business activity

    c. Income, savings, and household financial stability

  4. Stocks and Capital Markets

    a. Stock market index movements and GDP

    b. Stock volatility and investment or savings

    c. Economic shocks and market performance

  5. Education and Human Capital

    a. Educational attainment and earnings

    b. Regional schooling differences and income disparities

    c. Education spending and enrollment/completion rates

  6. Health and Economic Outcomes

    a. Health expenditure and labor productivity

    b. Regional health access and income

    c. Health outcomes and employment

4.3 Problem Set 1 - Data Management to Descriptive Statistics and Conceptual Endogeneity

Deadline: Week 6

Note: HW means handwritten.

  1. Data Management and Cleaning
    • Acquire dataset(s) for your assigned topic

    • Clean data (handle missing values, recode variables, reshape, etc.)

    • Report observations before and after cleaning (HW)

  2. Data Visualization
    • Create at least 3 visualizations

    • For each visualization:

      • Explain why you chose it (HW)

      • Discuss what it shows in economic terms (HW)

      • Interpret patterns, trends, or anomalies (HW)

  3. Descriptive Statistics
    • Compute summary statistics (mean, median, SD)

    • Compare across groups, regions, or categories

    • Discuss findings in economic terms (HW)

  4. Conceptual Endogeneity / Confounding Variables
    • Identify at least one variable that might confound relationships (HW)

    • Discuss how it could bias interpretation (HW)

  5. Economic Discussion
    • Summarize your findings clearly (HW)

    • Link patterns to economic reasoning or policy (HW)
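
The data-handling steps in this problem set can be sketched in base R as follows. This is only a minimal illustration under stated assumptions: the data frame `raw`, its columns (`region`, `island`, `income_2019`, `income_2020`), and all values are invented toy data, not taken from PSA OpenSTAT or the World Bank. Your own variables, cleaning rules, and choice of plots will differ.

```r
# Toy data in a wide layout (one column per year). All names and values
# are hypothetical and serve only to illustrate the workflow.
raw <- data.frame(
  region      = c("A", "B", "C", "D"),
  island      = c("North", "North", "South", "South"),
  income_2019 = c(100, 120, NA, 90),
  income_2020 = c(105, 118, 80, NA)
)

# Reshape from wide to long: one row per region-year.
long <- reshape(
  raw,
  direction = "long",
  varying   = c("income_2019", "income_2020"),
  v.names   = "income",
  timevar   = "year",
  times     = c(2019, 2020),
  idvar     = "region"
)

# Count observations before and after dropping missing values.
n_before <- nrow(long)
clean    <- long[!is.na(long$income), ]
n_after  <- nrow(clean)
cat("Observations before:", n_before, "after:", n_after, "\n")

# Summary statistics (mean, median, SD) compared across groups.
stats_by_group <- aggregate(
  income ~ island,
  data = clean,
  FUN  = function(x) c(mean = mean(x), median = median(x), sd = sd(x))
)
print(stats_by_group)

# One possible visualization: the distribution of income by group.
boxplot(income ~ island, data = clean,
        main = "Income by island group (toy data)",
        xlab = "Island group", ylab = "Income")
```

Dropping rows with missing values is only one way to handle them. Whatever rule you choose, the before and after counts are what the handwritten report asks for.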

4.4 Problem Set 2 - Testable Hypotheses, OLS, Diagnostics, Robustness

Deadline: Week 12

Note: HW means handwritten.

  1. Formulate Testable Hypotheses
    • Clearly define dependent and independent variables
  2. OLS Estimation
    • Run bivariate OLS

    • Run multivariate OLS with controls

    • Report coefficients, standard errors, and R² (Note: This can be printed and pasted)

    • Discuss (HW):

      • Economic interpretation of coefficients

      • Can the OLS be interpreted causally? Why or why not?

      • Potential sources of bias

  3. Sensitivity/Robustness Checks
    • Test whether results hold with different sets of control variables

    • Perform subsample analysis (e.g., gender, region, time period)

    • Discuss which results are robust, which change and why (HW)

    • Emphasize economic interpretation in your discussion (HW)

  4. Diagnostics or Formal Tests
    • Multicollinearity (VIF)

    • Heteroskedasticity (BP or White)

    • Functional form (RESET)

    • Discuss the implications of the results (HW)

  5. Economic Discussion (HW)
    • Compare bivariate vs multivariate results and the sensitivity checks

    • Summarize findings in relation to Problem Set 1, policy implications, and limitations
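
The estimation, robustness, and diagnostic steps in this problem set can be sketched in base R as follows. Everything here is an assumption made for illustration: the data are simulated, the variable names (`earnings`, `schooling`, `ability`, `region`) are hypothetical, and the confounder `ability` is observed only because it is simulated. The diagnostics are computed by hand so that the sketch needs no extra packages.

```r
set.seed(123)  # reproducible simulated data

# Simulated data, invented for illustration: "ability" raises both
# schooling and earnings, so it confounds the bivariate regression.
n <- 500
d <- data.frame(
  ability = rnorm(n),
  region  = sample(c("North", "South"), n, replace = TRUE)
)
d$schooling <- 10 + 2 * d$ability + rnorm(n)
d$earnings  <- 5 + 1 * d$schooling + 3 * d$ability + rnorm(n)

# Bivariate OLS, then multivariate OLS with the control.
m1 <- lm(earnings ~ schooling, data = d)
m2 <- lm(earnings ~ schooling + ability, data = d)
print(summary(m1)$coefficients)  # estimates and standard errors
print(summary(m2)$coefficients)
cat("R-squared:", summary(m1)$r.squared, "vs", summary(m2)$r.squared, "\n")

# Robustness: re-estimate the schooling coefficient on each subsample.
by_region <- sapply(split(d, d$region), function(s) {
  coef(lm(earnings ~ schooling + ability, data = s))[["schooling"]]
})
print(by_region)

# Multicollinearity: VIF for schooling is 1 / (1 - R^2) from regressing
# schooling on the other regressor(s).
r2_aux        <- summary(lm(schooling ~ ability, data = d))$r.squared
vif_schooling <- 1 / (1 - r2_aux)

# Heteroskedasticity: studentized Breusch-Pagan LM statistic, n * R^2
# from regressing squared residuals on the regressors.
d$u2    <- resid(m2)^2
bp_stat <- n * summary(lm(u2 ~ schooling + ability, data = d))$r.squared
bp_p    <- pchisq(bp_stat, df = 2, lower.tail = FALSE)

# Functional form: RESET, an F-test on added powers of the fitted values.
d$yhat     <- fitted(m2)
m2_reset   <- lm(earnings ~ schooling + ability + I(yhat^2) + I(yhat^3),
                 data = d)
reset_test <- anova(m2, m2_reset)
reset_p    <- reset_test[["Pr(>F)"]][2]

cat("VIF:", vif_schooling, " BP p-value:", bp_p,
    " RESET p-value:", reset_p, "\n")
```

In this simulation the true schooling coefficient is 1, so the gap between the bivariate and multivariate estimates illustrates omitted variable bias. If the `car` and `lmtest` packages are installed, `car::vif()`, `lmtest::bptest()`, and `lmtest::resettest()` provide packaged versions of these diagnostics.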

4.5 Reminders

  1. Hard copy submission is the only submission accepted.

  2. All items marked (HW) must be handwritten. Please write legibly.

  3. Typed R code must be printed and included.

  4. Plots are printed and pasted together with the handwritten discussion.

  5. Include only essential outputs; there is no need to print the dataset contents.

  6. The first pages should contain the answers to all questions together; the following pages should show the step-by-step process with R code chunks.

    When printing the Quarto Markdown file for submission, change the header at the top of each code chunk from {r} to {r, eval=FALSE}. This shows the code without running it, so the results will not appear in the HTML. You can then print the Quarto Markdown clearly.

    Only do this when you are printing the R code.
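
    As a sketch, a chunk in the .qmd file with this option set might look like the following. The chunk body is a hypothetical placeholder (`my_data`, `y`, and `x` are not real objects); only the header line matters.

````markdown
```{r, eval=FALSE}
# Hypothetical chunk body: the code is displayed but not run.
model <- lm(y ~ x, data = my_data)
summary(model)
```
````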