Socio-economic impact of COVID-19 on refugees - Round 4, 2021
Demographic and Health Survey [hh/dhs]
This dataset contains information from the first three waves of the COVID-19 RRPS Household Survey which is part of a five-wave bi-monthly panel survey that targets Kenyan nationals. The same households are interviewed every two months, between May 2020 and May 2021.
The participants of this phone interview were identified using mixed methods. Stratified random sampling was adopted for Persons of Concern (PoC) to UNHCR based in Kakuma, Kalobeyei, Dadaab and urban areas, while a census was used for all PoCs aged 18 years and above in the Shona community; this cohort forms 48.6% of the enumerated population of the Shona people. The survey was conducted at two levels: household and individual.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Individual and Household
v2.1: Edited, anonymous dataset for public distribution. Fourth wave.
Households: Demographics, Employment, Food security, Income loss, Transfers, Subjective welfare, Health and COVID Knowledge
Livelihood & Social cohesion
All persons of concern for UNHCR
Producers and sponsors
Kenya National Bureau of Statistics
University of California
Individuals (18 years and above) with active phone numbers were randomly selected from the UNHCR database for each of the four sites - Kakuma, Kalobeyei, Dadaab and Urban. For the Shona, the sample was taken from the Socioeconomic Assessment survey. Because of the smaller size of the Shona sample (782), everybody in that sample was included. The selected individuals from each site were sent an SMS stating that they had been randomly selected to participate in a survey on the socio-economic impact of COVID-19.
Already computed; see the database.
Weighting: Cross-Sectional weights
For the KNBS and RDD samples, to make the sample nationally representative of the current population of households with mobile phone access, we create weights in two steps.
Step 1: Construct raw weights combining the two national samples: The current population consists of
(I) households that existed in 2015/16, and did not change phone numbers,
(II) households that existed in 2015/16, but changed phone number,
(III) households that did not exist in 2015/16.
Abstracting from differential attrition, the weights from the 2015/16 KIHBS CAPI pilot make the KIHBS sample representative of type (I) households. For RDD households, we ask whether they existed in 2015/16, when they had acquired their phone number, and where they lived in 2015/16, allowing us to classify them into type (I), (II) and (III) households and assign them to KIHBS strata. We adjust weights of each RDD household to be inversely proportional to the number of mobile phone numbers used by the household, and scale them relative to the average number of mobile phone numbers used in the KIHBS within each stratum. RDD therefore gives us a representative sample of type (II) and (III) households. We then combine RDD and KIHBS type (I) households by ex-post adding RDD households into the 2015/16 sampling frame and adjusting weights accordingly. Last, we combine our representative samples of type (I), type (II) and type (III), using the share of each type within each stratum from RDD (inversely weighted by number of mobile phone numbers). Variable: weight_raw
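The phone-number adjustment for RDD households described above can be illustrated with a minimal sketch. It assumes one possible reading of the text: the weight is divided by the number of mobile phone numbers the household uses and multiplied by the average number of phone numbers per KIHBS household in the same stratum. The function name, argument names and figures are hypothetical and are not taken from the survey's own code.

```python
def adjust_rdd_weight(base_weight, n_phone_numbers, kihbs_avg_numbers):
    """Scale an RDD household's weight by its selection chances.

    Assumption: a household with more phone numbers is more likely to be
    dialled, so its weight is divided by its number of phone numbers and
    expressed relative to the stratum average from KIHBS.
    """
    if n_phone_numbers < 1:
        raise ValueError("an RDD household must use at least one phone number")
    return base_weight * kihbs_avg_numbers / n_phone_numbers


# Hypothetical stratum in which KIHBS households use 1.5 numbers on average.
one_number = adjust_rdd_weight(1.0, 1, 1.5)
two_numbers = adjust_rdd_weight(1.0, 2, 1.5)
```

Under this reading, a household reachable through two numbers receives half the weight of an otherwise identical household reachable through one.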
Step 2: Scale the weights to population proportions in each county and urban/rural stratum: We use post stratification to adjust for differential attrition and response rates across counties and rural/urban strata. We scale the raw weights from step 1 to reflect the population size in each county and rural/urban stratum as recorded in the 2019 Kenya Population and Housing Census conducted by the KNBS (2019 Kenya Population and Housing Census, Volume II: Distribution of Population by Administrative Units, December 2019, Kenya National Bureau of Statistics, https://www.knbs.or.ke/?wpdmpro=2019-kenya-population-and-housing-census-volume-ii-distribution-of-population-by-administrative-units). Variable: weight
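The scaling in Step 2 can be sketched as follows. This is an illustrative example only: the record layout, the stratum labels and the census totals are made up, and whether the census benchmark counts households or persons is an assumption left open here.

```python
from collections import defaultdict


def post_stratify(households, census_totals):
    """Rescale raw weights so they sum to the census total in each stratum.

    households: list of dicts with keys 'stratum' and 'weight_raw'
    census_totals: dict mapping stratum -> benchmark total (hypothetical)
    """
    # Sum of raw weights per county x urban/rural stratum.
    raw_sums = defaultdict(float)
    for hh in households:
        raw_sums[hh["stratum"]] += hh["weight_raw"]

    # Multiply every weight by (census total / sum of raw weights).
    result = []
    for hh in households:
        factor = census_totals[hh["stratum"]] / raw_sums[hh["stratum"]]
        result.append({**hh, "weight": hh["weight_raw"] * factor})
    return result


sample = [
    {"stratum": ("Nairobi", "urban"), "weight_raw": 2.0},
    {"stratum": ("Nairobi", "urban"), "weight_raw": 6.0},
    {"stratum": ("Turkana", "rural"), "weight_raw": 5.0},
]
totals = {("Nairobi", "urban"): 400.0, ("Turkana", "rural"): 100.0}
weighted = post_stratify(sample, totals)
```

Within a stratum the relative sizes of the raw weights are preserved; only their sum changes.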
To construct panel weights, we follow the approach outlined in Himelein (2014): “Weight Calculations for Panel Surveys with Subsampling and Split-off Tracking”. In each household we follow one target respondent. Wherever households split, only the current household of the target respondent was interviewed. The weights for the wave 1 and 2 balanced panel are constructed by applying the following steps to the full sample of Kenyan nationals:
0. Wave 1 cross-sectional weights after post-stratification adjustment are used as a base. W_1 = W_wave1
1. Attrition adjustment through propensity score-based method: The predicted probability that a sample household was successfully re-interviewed in the second survey wave is estimated through a propensity score estimation. The propensity score (PS) is modeled with a linear logistic model at the level of the household. The dependent variable is a dummy indicating whether a household that completed the survey in wave 1 also did so in wave 2. The following covariates were used in the linear logistic model: Urban/rural dummy, County dummies, Household head gender, Household head age, Household size, Dependency ratio, Dummy: Is anyone in the household working, Asset ownership: Radio, Asset ownership: Mattress, Asset ownership: Charcoal Jiko, Asset ownership: Fridge, Wall material: 3 dummies, Floor material: 3 dummies, Connection to electricity grid, Number of mobile phone numbers the household uses, Number of phone numbers recorded for follow-up, Sample dummy for estimation with national samples
2. Rank households by PS and split into 10 equal groups
3. Calculate attrition adjustment factor: ac (attrition correction) = the reciprocal of the mean empirical response rate for the propensity score decile
4. Adjust base weights for attrition: W_2 = W_1 * ac
5. Trim the top 1 percent of the weights distribution by replacing the weights among the top 1 percent of the distribution with the highest value of a weight below the cutoff. W_3 = trim(W_2)
6. Apply post-stratification in the same way as for cross-sectional weights (step 2). Variable: weight_panel_w1_2
The balanced panel weights including waves 3 and 4 were constructed using the same procedure. Variables: weight_panel_w1_2_3 and weight_panel_w1_2_3_4
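Steps 2 to 5 of the panel weight construction can be sketched as below. This is a minimal illustration under stated assumptions: the propensity scores are taken as already estimated, the response rate within each group is computed unweighted, and the function names, field names ('w1', 'ps', 'responded', 'w2') and data are hypothetical.

```python
def attrition_adjust(records, n_groups=10):
    """Return re-interviewed households with attrition-adjusted weight 'w2'.

    records: list of dicts with 'w1' (base weight), 'ps' (propensity score)
    and 'responded' (True if re-interviewed in the later wave).
    """
    ranked = sorted(records, key=lambda r: r["ps"])
    n = len(ranked)
    adjusted = []
    for g in range(n_groups):
        # Split the ranked households into n_groups near-equal groups.
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        rate = sum(r["responded"] for r in group) / len(group)
        if rate == 0:
            continue  # nobody left in this group to carry the weight
        ac = 1.0 / rate  # attrition correction factor
        adjusted.extend({**r, "w2": r["w1"] * ac} for r in group if r["responded"])
    return adjusted


def trim_top(weights, share=0.01):
    """Cap the top `share` of weights at the highest weight below the cutoff."""
    ordered = sorted(weights)
    n_top = int(len(ordered) * share)
    if n_top == 0:
        return list(weights)
    cutoff = ordered[-n_top]  # smallest weight inside the top share
    below = [w for w in ordered if w < cutoff]
    if not below:
        return list(weights)
    cap = below[-1]
    return [min(w, cap) for w in weights]


# 20 hypothetical households; only the one with the lowest PS drops out.
records = [{"w1": 10.0, "ps": (i + 1) / 21, "responded": i != 0} for i in range(20)]
panel = attrition_adjust(records)
trimmed = trim_top(list(range(1, 101)))
```

In the example, the remaining household in the lowest group has its weight doubled, so the adjusted weights of the 19 respondents still sum to the base total of all 20 households. Post-stratification (step 6) would then be applied to the trimmed weights.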
Dates of Data Collection
Data Collection Mode
Computer Assisted Telephone Interview [cati]
Data Collection Notes
PRE-LOADED INFORMATION: Basic household information was pre-loaded in the CATI assignments for each enumerator. This information (for example, the household's location, the household head's name and phone numbers) was used to help enumerators call and identify the target households. The list of individuals from the KIHBS CAPI pilot and their basic characteristics was also uploaded, which helped maintain the panel of individuals and ensured the status of each individual in the wave 2 survey.
RESPONDENTS: The COVID-19 RRPS had ONE RESPONDENT per household. The target respondent was defined as the primary male or female from the 2015/16 KIHBS CAPI Pilot. Where both existed, one was chosen at random to maintain gender balance. If the target respondent was not available for a call, the field team spoke to any adult currently living in the household of the target respondent. If the target respondent was deceased, the field team spoke to any adult who had lived with the target respondent in 2015/16. Finally, if the household from 2015/16 had split up, we targeted anyone in the current household of the target respondent but did not survey household members who no longer lived with the target respondent. For the sample based on Random Digit Dialing, the target respondent was the owner of the phone number that was randomly selected.
Vyxer Research Management and Information Technology Consultancy Limited
The questionnaire included 12 sections:
Section 1: Introduction
Section 2: Household background
Section 3: Travel patterns and interactions
Section 4: Employment
Section 5: Food security
Section 6: Income Loss
Section 7: Transfers
Section 8: Subjective welfare (50% of sample)
Section 9: Health
Section 10: COVID Knowledge
Section 11: Household and Social Relations (50% of sample)
Section 12: Conclusion
Variable names were kept constant across survey waves. For questions that remained exactly the same across survey waves, data points for all waves can be found under one variable name. For questions where the phrasing changed (even in a minimal way) across waves, variable names were also changed to reflect the change in phrasing. To address potential inconsistencies in the employment data, some data points had to be dropped for waves 2, 3, and 4.
Despite the random allocation of households to enumerators, high variability was observed in reported employment across enumerators. To reduce inconsistencies, data on employment collected by some enumerators were set to missing. For each enumerator, the mean proportion of households without any employment was calculated. For waves 2 and 3, the 95 percent confidence interval of this mean proportion was established across all enumerators, and the employment data of enumerators whose proportion of households with no employment lay above the upper bound of the confidence interval were dropped. For wave 4, the employment data of enumerators whose mean proportion of households without any employment was more than 1 standard deviation above the mean proportion across all enumerators were dropped. This resulted in dropping the data on employment for 596 households in wave 2, 1,109 households in wave 3, and 380 households in wave 4. To account for the dropped observations in the survey weights, the variable ‘weight_labor’ should be used for weighting when analyzing data on employment. It is constructed in the same way as the cross-sectional ‘weight’ variable, but only considering observations for which the data on employment are kept.
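The two screening rules can be illustrated with a short sketch. It assumes that the 95 percent confidence interval uses the normal approximation (mean plus 1.96 standard errors, with the sample standard deviation); the record does not state the exact formula. The function name, rule labels, enumerator IDs and shares are hypothetical.

```python
import math
import statistics


def flag_enumerators(no_employment_share, rule="ci95"):
    """Return IDs of enumerators whose share of households without any
    employment exceeds the threshold for the chosen rule.

    rule="ci95": upper bound of the 95% CI of the mean (waves 2 and 3)
    rule="sd1":  mean plus one standard deviation (wave 4)
    """
    shares = list(no_employment_share.values())
    mean = statistics.mean(shares)
    sd = statistics.stdev(shares)
    if rule == "ci95":
        # Assumption: normal approximation for the confidence interval.
        threshold = mean + 1.96 * sd / math.sqrt(len(shares))
    elif rule == "sd1":
        threshold = mean + sd
    else:
        raise ValueError("rule must be 'ci95' or 'sd1'")
    return sorted(e for e, s in no_employment_share.items() if s > threshold)


shares = {"E1": 0.10, "E2": 0.12, "E3": 0.11, "E4": 0.13, "E5": 0.40, "E6": 0.30}
flagged_w2_w3 = flag_enumerators(shares, rule="ci95")
flagged_w4 = flag_enumerators(shares, rule="sd1")
```

With these made-up shares the confidence-interval rule flags two enumerators while the one-standard-deviation rule flags only the most extreme one, because the standard error shrinks with the number of enumerators and the standard deviation does not.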
More detailed data on children were collected in waves 3 and 4 than in waves 1 and 2. In waves 1 and 2, data on children, e.g. on their learning activities, were collected for all children in a household with one question. Therefore, variables related to children are part of the ‘hh’ data for waves 1 and 2. From wave 3 onwards, questions on children in the household were asked for specific children. Some questions covered all children, while others were only administered to one randomly selected child in the household. This approach allows the data to be disaggregated at the level of the child household members, and the data can be found in the ‘child’ data set. The household level weights can be used for analysis of the children’s data.
UNHCR (2021). Socio-economic impact of COVID-19 on refugees in Kenya - Panel - Anonymized for Licensed Use. Dataset downloaded from https://microdata.unhcr.org on [date].