Socio-economic impact of COVID-19 on refugees - Panel Study
Demographic and Health Survey [hh/dhs]
This dataset contains information from the first four waves of the COVID-19 RRPS Household Survey, which is part of a five-wave bi-monthly panel survey that targets Kenyan nationals and refugees. The same households are interviewed every two months between May 2020 and May 2021.
This dataset contains information from four waves of the COVID-19 RRPS, which is part of a bi-monthly panel survey that targets Kenyan nationals and refugees and started in May 2020. The same households are interviewed every two months, with interviews conducted using Computer Assisted Telephone Interviewing (CATI) techniques. In each wave, sampled households that were not reached in earlier waves were contacted again, along with households that had been interviewed before. The “wave” variable indicates the wave in which a household was interviewed.
All waves of this survey include information on household background, service access, employment, food security, income loss, transfers, health, and COVID-19 knowledge.
The dataset contains three files. The first is the hh file, which contains household-level information; ‘hhid’ uniquely identifies each household. The second is the adult-level file, which contains data at the level of adult household members; each adult in a household is uniquely identified by ‘adult_ID’. The third is the child-level file, which contains information for every child in the household; each child in a household is uniquely identified by ‘child_id’.
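As an illustration of how ‘hhid’ and ‘wave’ can be combined, the sketch below lists the households observed in every wave of interest (a balanced panel). It assumes the hh file is stacked in long format, with one row per household per wave; the function name and the toy records are hypothetical and are not part of the dataset.

```python
from collections import defaultdict


def balanced_panel_ids(rows, waves=(1, 2, 3, 4)):
    """Return the hhids that appear in every wave listed in `waves`.

    `rows` is any iterable of dict-like records with 'hhid' and 'wave' keys
    (assumed long format: one record per household per wave).
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row["hhid"]].add(row["wave"])
    required = set(waves)
    return sorted(hhid for hhid, observed in seen.items() if required <= observed)


# Toy records standing in for rows of the hh file (values are made up).
hh_rows = [
    {"hhid": "A", "wave": 1}, {"hhid": "A", "wave": 2},
    {"hhid": "A", "wave": 3}, {"hhid": "A", "wave": 4},
    {"hhid": "B", "wave": 1}, {"hhid": "B", "wave": 3},
    {"hhid": "C", "wave": 2}, {"hhid": "C", "wave": 3},
]

print(balanced_panel_ids(hh_rows))          # ['A']
print(balanced_panel_ids(hh_rows, (1, 3)))  # ['A', 'B']
```

The same grouping can be done with any data frame library once the file is loaded; the plain-Python version is used here only to keep the example self-contained.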
The duration of data collection for each wave was:
Wave 1: May 14 to July 7, 2020
Wave 2: July 16 to September 18, 2020
Wave 3: September 18 to November 28, 2020
Wave 4: January 15 to March 25, 2021
The participants in this phone survey were identified using mixed methods. Stratified random sampling was adopted for Persons of Concern (PoCs) to UNHCR based in Kakuma, Kalobeyei, Dadaab and urban areas, while a census was used for all PoCs aged 18 years and above in the Shona community; this cohort forms 48.6% of the enumerated population of the Shona people. The survey was conducted at two levels: household and individual.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Individual and Household
v2.1: Edited, anonymous dataset for public distribution. Fourth wave.
The Kenya COVID-19 RRPS survey covers the following topics: Household Roster, Travel Patterns & Interactions, Employment, Food security, Income Loss, Transfers, Subjective welfare (50% of sample), Health, COVID-19 Knowledge, Household and Social Relations (50% of sample).
Livelihood & Social cohesion
All persons of concern for UNHCR
Producers and sponsors
Kenya National Bureau of Statistics
University of California
Individuals (18 years and above) with active phone numbers were randomly selected from the UNHCR database for each of the four sites: Kakuma, Kalobeyei, Dadaab and Urban. For Shona, the sample was taken from the Socioeconomic Assessment survey. Because of the small size of the Shona sample (782), everyone in it was included. The selected individuals from each site were sent an SMS stating that they had been randomly selected to participate in a survey on the socio-economic impact of COVID-19.
To construct panel weights, we follow the approach outlined in Himelein (2014): “Weight Calculations for Panel Surveys with Subsampling and Split-off Tracking”. In each household we follow one target respondent. Wherever households split, only the current household of the target respondent was interviewed. The weights for the wave 1 and 2 balanced panel are constructed by applying the following steps to the full sample of Kenyan nationals:
0. Wave 1 cross-sectional weights after post-stratification adjustment are used as a base. W_1 = W_wave1
1. Attrition adjustment through a propensity score-based method: The predicted probability that a sample household was successfully re-interviewed in the second survey wave is estimated through a propensity score estimation. The propensity score (PS) is modeled with a linear logistic model at the level of the household. The dependent variable is a dummy indicating whether a household that completed the survey in wave 1 also did so in wave 2. The following covariates were used in the linear logistic model: Urban/rural dummy, County dummies, Household head gender, Household head age, Household size, Dependency ratio, Dummy: Is anyone in the household working, Asset ownership: Radio, Asset ownership: Mattress, Asset ownership: Charcoal Jiko, Asset ownership: Fridge, Wall material: 3 dummies, Floor materials: 3 dummies, Connection to electricity grid, Number of mobile phone numbers the household uses, Number of phone numbers recorded for follow-up, Sample dummy for estimation with national samples
2. Rank households by PS and split into 10 equal groups
3. Calculate attrition adjustment factor: ac (attrition correction) = the reciprocal of the mean empirical response rate for the propensity score decile
4. Adjust base weights for attrition: W_2 = W_1 * ac
5. Trim the top 1 percent of the weights distribution by replacing the weights among the top 1 percent of the distribution with the highest value of a weight below the cutoff. W_3 = trim(W_2)
6. Apply post-stratification in the same way as for cross-sectional weights (step 2). Variable: weight_panel_w1_2
The balanced panel weights including waves 3 and 4 were constructed using the same procedure. Variables: weight_panel_w1_2_3 and weight_panel_w1_2_3_4
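Steps 2 to 5 of the procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the released weights: the propensity scores are taken as given (step 1 is not re-estimated), the response rate within each group is computed unweighted, and post-stratification (step 6) is left out because it needs population totals that are not part of this description. The function name, the record keys 'w1', 'ps' and 'responded', and the toy data are all hypothetical.

```python
def attrition_adjusted_weights(records, n_groups=10, trim_share=0.01):
    """Return {record index: adjusted weight} for re-interviewed households.

    Each record is a dict with 'w1' (base weight), 'ps' (propensity score,
    assumed already estimated) and 'responded' (True if re-interviewed).
    """
    n = len(records)

    # Step 2: rank households by PS and split them into equal-sized groups.
    order = sorted(range(n), key=lambda i: records[i]["ps"])
    group_of = {i: rank * n_groups // n for rank, i in enumerate(order)}

    # Step 3: count households and respondents per group; the attrition
    # correction is the reciprocal of the (unweighted) response rate.
    counts = [0] * n_groups
    responders = [0] * n_groups
    for i, rec in enumerate(records):
        counts[group_of[i]] += 1
        if rec["responded"]:
            responders[group_of[i]] += 1

    # Step 4: W_2 = W_1 * ac, defined only for re-interviewed households.
    w2 = {}
    for i, rec in enumerate(records):
        if rec["responded"]:
            g = group_of[i]
            w2[i] = rec["w1"] * counts[g] / responders[g]

    # Step 5: replace the top `trim_share` of weights with the largest
    # weight that lies below the cutoff.
    ranked = sorted(w2.values())
    k = int(len(ranked) * trim_share)
    if k > 0:
        cap = ranked[-k - 1]
        w2 = {i: min(w, cap) for i, w in w2.items()}
    return w2


# Toy data: 20 households with equal base weights; only household 0 attrits.
records = [{"w1": 1.0, "ps": i / 20, "responded": i != 0} for i in range(20)]
weights = attrition_adjusted_weights(records)
print(weights[1], sum(weights.values()))  # 2.0 20.0
```

In the toy data, household 1 shares a propensity score group with the household that dropped out, so its weight doubles and the total weight of the original 20 households is preserved.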
Dates of Data Collection
Data Collection Mode
Computer Assisted Telephone Interview [cati]
Data Collection Notes
PRE-LOADED INFORMATION: Basic household information was pre-loaded in the CATI assignments for each enumerator. The information, such as the household's location, household head name and phone numbers, was used to help enumerators call and identify the target households. The list of individuals from the KIHBS CAPI pilot and their basic characteristics were uploaded, which helped maintain the panel of individuals and ensured the status of each individual in the wave 2 survey.
RESPONDENTS: The COVID-19 RRPS had ONE RESPONDENT per household. The target respondent was defined as the primary male or female from the 2015/16 KIHBS CAPI Pilot; where both existed, one was chosen at random to maintain gender balance. If the target respondent was not available for a call, the field team spoke to any adult currently living in the household of the target respondent. If the target respondent was deceased, the field team spoke to any adult who lived with the target respondent in 2015/16. Finally, if the household from 2015/16 split up, we targeted anyone in the current household of the target respondent but did not survey household members who no longer lived with the target respondent. For the sample based on Random Digit Dialing, the target respondent was the owner of the phone number that was randomly selected.
The questionnaire included 12 sections:
Section 1: Introduction
Section 2: Household background
Section 3: Travel patterns and interactions
Section 4: Employment
Section 5: Food security
Section 6: Income Loss
Section 7: Transfers
Section 8: Subjective welfare (50% of sample)
Section 9: Health
Section 10: COVID Knowledge
Section 11: Household and Social Relations (50% of sample)
Section 12: Conclusion
Variable names were kept constant across survey waves. For questions that remained exactly the same across survey waves, data points for all waves can be found under one variable name. For questions where the phrasing changed (even in a minimal way) across waves, variable names were also changed to reflect the change in phrasing. To address potential inconsistencies in the employment data, some data points had to be dropped for waves 2, 3, and 4.
Despite the random allocation of households to enumerators, high variability was observed in reported employment across enumerators. To reduce inconsistencies, data on employment collected by some enumerators were set to missing. For each enumerator, the mean proportion of households without any employment was calculated. For waves 2 and 3, the 95 percent confidence interval of this mean proportion was established across all enumerators, and the employment data of enumerators whose proportion of households with no employment was above the upper bound of the confidence interval were dropped. For wave 4, the employment data of enumerators whose proportion of households without any employment was more than 1 standard deviation above the mean proportion across all enumerators were dropped. This resulted in dropping the data on employment for 596 households in wave 2, 1,109 households in wave 3, and 380 households in wave 4. To account for the dropped observations in the survey weights, the variable ‘weight_labor’ should be used for weighting when analyzing data on employment. It is constructed in the same way as the cross-sectional ‘weight’ variable, but considers only observations for which the data on employment are kept.
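The two screening rules described above can be sketched as follows. The exact formulas used by the survey team are not given here, so the sketch assumes a normal-approximation confidence interval (mean plus 1.96 standard errors) and the sample standard deviation across enumerators. The function name, the enumerator IDs and the proportions are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def flag_enumerators(share_no_employment, rule):
    """Return the enumerator IDs whose share of households without employment
    exceeds the threshold implied by `rule`.

    rule="ci": upper bound of a 95 percent confidence interval of the mean
               across enumerators (described for waves 2 and 3).
    rule="sd": mean plus one standard deviation (described for wave 4).
    """
    values = list(share_no_employment.values())
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    if rule == "ci":
        threshold = centre + 1.96 * spread / sqrt(len(values))
    elif rule == "sd":
        threshold = centre + spread
    else:
        raise ValueError("rule must be 'ci' or 'sd'")
    return sorted(e for e, v in share_no_employment.items() if v > threshold)


# Made-up proportions of households reporting no employment, per enumerator.
shares = {"e1": 0.10, "e2": 0.12, "e3": 0.11, "e4": 0.13, "e5": 0.40, "e6": 0.31}

print(flag_enumerators(shares, "ci"))  # ['e5', 'e6']
print(flag_enumerators(shares, "sd"))  # ['e5']
```

With these toy values the confidence-interval rule is the stricter of the two, because the standard error shrinks as the number of enumerators grows while the standard deviation does not.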
More detailed data on children were collected in waves 3 and 4 than in waves 1 and 2. In waves 1 and 2, data on children, e.g. on their learning activities, were collected for all children in a household with one question. Therefore, variables related to children are part of the ‘hh’ data for waves 1 and 2. From wave 3 onwards, questions on children in the household were asked for specific children. Some questions covered all children, while others were administered to only one randomly selected child in the household. This approach allows the data to be disaggregated at the level of the child household members, and these data can be found in the ‘child’ data set. The household-level weights can be used for analysis of the children’s data.
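A minimal sketch of using household-level weights with child-level records is shown below. It assumes that child records carry the ‘hhid’ and ‘wave’ of their household and that the household weight is looked up by that pair; the indicator name 'learning' and all values are hypothetical.

```python
def weighted_share(child_rows, hh_weights, flag):
    """Weighted share of children for whom `flag` is true.

    `hh_weights` maps (hhid, wave) to the household-level weight, which is
    applied to every child record from that household and wave.
    """
    numerator = 0.0
    denominator = 0.0
    for child in child_rows:
        weight = hh_weights[(child["hhid"], child["wave"])]
        denominator += weight
        if child[flag]:
            numerator += weight
    return numerator / denominator


# Toy household weights and child records (made up for illustration).
hh_weights = {("A", 3): 2.0, ("B", 3): 1.0}
child_rows = [
    {"hhid": "A", "wave": 3, "child_id": 1, "learning": True},
    {"hhid": "A", "wave": 3, "child_id": 2, "learning": False},
    {"hhid": "B", "wave": 3, "child_id": 1, "learning": True},
]

share = weighted_share(child_rows, hh_weights, "learning")
print(round(share, 3))  # 0.6
```

For questions administered to only one randomly selected child per household, an analyst may want to consider whether an additional within-household adjustment is appropriate; the sketch does not attempt one.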
UNHCR (2021). Socio-economic impact of COVID-19 on refugees in Kenya - Panel - Anonymized for Licensed Use. Dataset downloaded from https://microdata.unhcr.org on [date].
DDI Document ID
Date of Metadata Production
DDI Document version
1.0, November 2020
2.0, December 2020
3.0, February 2021
4.0, May 2021