1 Overview

Bertrand and Mullainathan (AER 2004)

求人広告に履歴書を送るときに，黒人に典型的な名前と白人に典型的な名前をランダムに割り当て，返信率を比較する．

1.1 Data

LICENSE.txt ファイルに記載されている通り，データセットは Creative Commons Attribution 4.0 International Public License で公開されている． Moodle には Bertrand2004.dta というファイル名でアップロードしている．

注：分析コードは公開されていない．

setwd("c://ws_stat")
library(tidyverse)
bm <- haven::read_dta("Bertrand2004.dta")
nrow(bm)  # sample size

## [1] 4870

attr(bm$call, "label")  # definition of "call"

## [1] "1=applicant was called back"

主な変数

race: w = White-sounding name, b = Black-sounding name
call: 1 = applicant was called back
h: 1 = high quality resume
adid: employment ad identifier

人種 race (b: black, w: white) とcall back call (1: yes, 0: no) のクロス表．

table(bm$race, bm$call)

##    
##        0    1
##   b 2278  157
##   w 2200  235

2 Is There a Racial Gap in Callback? (Section III.A.)

比率 (callback rate) が白人（に典型的な名前が割り当てられた履歴書．以下同様）と黒人でどのように異なるかを計算する．

whole_table <- bm %>% select(race, call) %>% table  # whole sample 
whole_table

##     call
## race    0    1
##    b 2278  157
##    w 2200  235

c(157, 235) / c(157+2278, 235+2200)

## [1] 0.06447639 0.09650924

whole_table[, 2] / rowSums(whole_table)  # same as above

##          b          w 
## 0.06447639 0.09650924

つまり，白人で9.65%，黒人で6.45%，その差は3.2%ポイント．

2.1 Proportion test

この比率の差は統計的に有意なのか？

検討の対象は比率なので，t-検定ではなく母比率の差の検定を行う．

関数は prop.test． 1つ目の引数に call back の数，2つ目の引数に resume を送った数を（それぞれベクトルで）わたす．

prop.test(whole_table[, 2], rowSums(whole_table))

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  whole_table[, 2] out of rowSums(whole_table)
## X-squared = 16.449, df = 1, p-value = 4.998e-05
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.04769866 -0.01636705
## sample estimates:
##     prop 1     prop 2 
## 0.06447639 0.09650924

同様にして，女性のサブサンプル，男性のサブサンプルについて検定．

female_table <- bm %>% filter(sex == "f") %>% select(race, call) %>% table  # female sample 
prop.test(female_table[, 2], rowSums(female_table))

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  female_table[, 2] out of rowSums(female_table)
## X-squared = 12.76, df = 1, p-value = 0.0003541
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.05079182 -0.01450196
## sample estimates:
##     prop 1     prop 2 
## 0.06627784 0.09892473

male_table <- bm %>% filter(sex == "m") %>% select(race, call) %>% table  # male sample 
prop.test(male_table[, 2], rowSums(male_table))

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  male_table[, 2] out of rowSums(male_table)
## X-squared = 3.3655, df = 1, p-value = 0.06658
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.062586830  0.001771118
## sample estimates:
##     prop 1     prop 2 
## 0.05828780 0.08869565

2.2 Distribution of callbacks at the employment-ad level (Table 2)

Employment ad レベルで集計して，白人へのcallback数と黒人へのcallback数の組み合わせをカウントする．

length(unique(bm$adid))  # employment ad identifier

## [1] 1323

bm %>% 
  group_by(adid) %>% 
  summarise(callback_w = sum(call & race == "w"),
            callback_b = sum(call & race == "b")) %>%
  mutate(w0b0 = ifelse(callback_w == 0 & callback_b == 0, 1, 0),
         w1b1 = ifelse(callback_w == 1 & callback_b == 1, 1, 0),
         w2b2 = ifelse(callback_w == 2 & callback_b == 2, 1, 0),
         w1b0 = ifelse(callback_w == 1 & callback_b == 0, 1, 0),
         w2b0 = ifelse(callback_w == 2 & callback_b == 0, 1, 0),
         w2b1 = ifelse(callback_w == 2 & callback_b == 1, 1, 0),
         w0b1 = ifelse(callback_w == 0 & callback_b == 1, 1, 0),
         w0b2 = ifelse(callback_w == 0 & callback_b == 2, 1, 0),
         w1b2 = ifelse(callback_w == 1 & callback_b == 2, 1, 0)) %>%
  select(starts_with("w")) %>%  # select variables that starts with "w"
  apply(2, sum)

## w0b0 w1b1 w2b2 w1b0 w2b0 w2b1 w0b1 w0b2 w1b2 
## 1103   46   17   74   19   18   33    6    7

(74 + 19 + 18) / length(unique(bm$adid))  # % of White favored ad

## [1] 0.08390023

(33 + 6 + 7) / length(unique(bm$adid))  # $ of African-American favored ad

## [1] 0.03476946

White favored な ad は8.4%，African-American favored は3.5%．

2.3 Binary regression model (original)

人種，性別，その他応募者の経験などを説明変数としてプロビットモデルを推定する．

formula_binary <- call ~ race + yearsexp + honors + volunteer + military + empholes + workinschool + email + computerskills + specialskills + sex + city
probit <- glm(formula = formula_binary, family = binomial("probit"), data = bm)
summary(probit)

## 
## Call:
## glm(formula = formula_binary, family = binomial("probit"), data = bm)
## 
## Coefficients:
##                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -1.742625   0.109376 -15.932  < 2e-16 ***
## racew           0.214511   0.054077   3.967 7.29e-05 ***
## yearsexp        0.014021   0.005309   2.641 0.008268 ** 
## honors          0.268572   0.104762   2.564 0.010358 *  
## volunteer      -0.058182   0.075185  -0.774 0.439017    
## military       -0.045029   0.106294  -0.424 0.671839    
## empholes        0.182748   0.064908   2.816 0.004870 ** 
## workinschool    0.073023   0.066009   1.106 0.268612    
## email           0.135979   0.077447   1.756 0.079128 .  
## computerskills -0.137753   0.073428  -1.876 0.060650 .  
## specialskills   0.366493   0.056598   6.475 9.46e-11 ***
## sexm           -0.020341   0.069872  -0.291 0.770962    
## cityc          -0.201019   0.057963  -3.468 0.000524 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2726.9  on 4869  degrees of freedom
## Residual deviance: 2597.9  on 4857  degrees of freedom
## AIC: 2623.9
## 
## Number of Fisher Scoring iterations: 5

mfx パッケージを用いて限界効果を計算．

library(mfx)
probitmfx(formula = formula_binary, data = bm)

## Call:
## probitmfx(formula = formula_binary, data = bm)
## 
## Marginal Effects:
##                      dF/dx   Std. Err.       z     P>|z|    
## racew           0.02951999  0.00742523  3.9756 7.019e-05 ***
## yearsexp        0.00192532  0.00072858  2.6425 0.0082285 ** 
## honors          0.04373952  0.01987213  2.2010 0.0277326 *  
## volunteer      -0.00793072  0.01017241 -0.7796 0.4356084    
## military       -0.00602122  0.01383369 -0.4353 0.6633753    
## empholes        0.02548277  0.00918091  2.7756 0.0055095 ** 
## workinschool    0.00996636  0.00895698  1.1127 0.2658407    
## email           0.01876549  0.01073667  1.7478 0.0804996 .  
## computerskills -0.02017238  0.01141930 -1.7665 0.0773094 .  
## specialskills   0.05533983  0.00925282  5.9809 2.220e-09 ***
## sexm           -0.00277094  0.00944338 -0.2934 0.7691960    
## cityc          -0.02810494  0.00823587 -3.4125 0.0006437 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## dF/dx is for discrete change for the following variables:
## 
##  [1] "racew"          "honors"         "volunteer"      "military"      
##  [5] "empholes"       "workinschool"   "email"          "computerskills"
##  [9] "specialskills"  "sexm"           "cityc"

白人 race = w は黒人と比較して3%ポイント程度 callback rate が高い．

Employment holes (empholes) の係数は正に推定されている．別の効果を拾っているのかもしれない．

ここで，人種はランダムに割り当てられていることを思い出せば，人種のみを説明変数として推定しても限界効果は概ね同じになると考えられる（除外変数バイアスが生じないため）．

probitmfx(formula = call ~ race, data = bm)

## Call:
## probitmfx(formula = call ~ race, data = bm)
## 
## Marginal Effects:
##           dF/dx Std. Err.      z     P>|z|    
## racew 0.0320329 0.0077834 4.1156 3.863e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## dF/dx is for discrete change for the following variables:
## 
## [1] "racew"

3 Do African-Americans Receive Different Returns to Resume Quality? (Section III.B.)

履歴書が high quality かどうかによって，人種と callback rate の関係がどのように異なるか．

3.1 Subjective measure of quality

bm %>% 
  group_by(h, race) %>% 
  summarise(call_back_rate = mean(call))

## # A tibble: 4 × 3
## # Groups:   h [2]
##       h race  call_back_rate
##   <dbl> <chr>          <dbl>
## 1     0 b             0.0619
## 2     0 w             0.0850
## 3     1 b             0.0670
## 4     1 w             0.108

母比率の差の検定．

white_table <- bm %>%  # White name 
  filter(race == "w") %>%
  dplyr::select(h, call) %>% 
  table()
prop.test(white_table[, 2], rowSums(white_table))

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  white_table[, 2] out of rowSums(white_table)
## X-squared = 3.4179, df = 1, p-value = 0.06449
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.047197581  0.001301945
## sample estimates:
##    prop 1    prop 2 
## 0.0849835 0.1079313

black_table <- bm %>%  # Black name 
  filter(race == "b") %>%
  dplyr::select(h, call) %>% 
  table()
prop.test(black_table[, 2], rowSums(black_table))

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  black_table[, 2] out of rowSums(black_table)
## X-squared = 0.19059, df = 1, p-value = 0.6624
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.02549421  0.01516011
## sample estimates:
##     prop 1     prop 2 
## 0.06188119 0.06704824

注：論文の Table 4 とは p 値が異なる． Yates’ continuity correction を行わない設定にしても（引数 correct = FALSE）再現できない．

3.2 Predicted measure of quality

履歴書の predicted measure of quality を用いると，黒人の履歴書が callback を受ける確率は履歴書の quality によって有意に異なる（Table 4, Panel B）．

授業内で説明する予定はないが，興味がある履修者は以下を参照．

サンプルの分割によって結果が大きく異なる．プロビットモデルの特定化が間違っているのか，あるいは論文でも不安定な推定をしているのかは不明．

set.seed(1)
bm2 <- bm %>%  # サンプルを「パラメタ推定用」と「分析用」に分割
  mutate(runif_for_sampling = runif(nrow(bm), 0, 1),
         for_train = ifelse(runif_for_sampling <= 1/3, 1, 0),
         college_degree = ifelse(education == 4, 1, 0))
table(bm2$college_degree)

## 
##    0    1 
## 1366 3504

bm_train <- bm2 %>%  # パラメタ推定用（元のサンプルの 1/3）
  filter(for_train == 1)
bm_test <- bm2 %>%  # 分析用（元のサンプルの 2/3）
  filter(for_train == 0)
# nrow(bm_train); nrow(bm_test)
# probit model of callback dummy 
probit_call <- glm(formula = call ~ college_degree + yearsexp + honors + volunteer + military + empholes + workinschool + email + computerskills + specialskills + sex + city + manuf + transcom + bankreal + trade + busservice + expreq + comreq + educreq + compreq + orgreq, family = binomial("probit"), data = bm_train)
# summary(probit_call)
# predicted callback rate (= predicted quality of resume) 
bm_test$predicted_call <- predict(object = probit_call, newdata = bm_test, type = "response")
summary(bm_test$predicted_call)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.001049 0.028394 0.052470 0.068933 0.095213 0.440886

bm_test %>% 
  mutate(h_predicted = ifelse(predicted_call >= median(predicted_call), 1, 0)) %>%
  group_by(h_predicted, race) %>% 
  summarise(call_back_rate = mean(call))

## # A tibble: 4 × 3
## # Groups:   h_predicted [2]
##   h_predicted race  call_back_rate
##         <dbl> <chr>          <dbl>
## 1           0 b             0.0429
## 2           0 w             0.0746
## 3           1 b             0.0823
## 4           1 w             0.140

4 Whether and how an applicant’s residential address, all else equal, affects the callback (Section III.C.)

履歴書は名前だけではなく住所もランダムに割り当てられている．

住所によって callback がどの程度異なるかを知ることで，黒人の応募者がより裕福な地域に居住することによる影響を知ることができる（Table 6）．

残念ながら論文の結果を全く再現できないためコードは表示のみ．

bm3 <- bm %>% 
  mutate(black = ifelse(race == "b", 1, 0))
probit_zip_call <- glm(formula = call ~ black*fracwhite + city, family = binomial("probit"), data = bm3)
summary(probit_zip_call)

Data Analysis Using Statistical Packages: Bertrand and Mullainathan (AER 2004)

Sho Kuroda / 黒田翔

2024年3月 (Last update: 2024年3月)