Data Analysis

EPAR conducts data analyses when we have access to publicly-available datasets with information relevant to our research questions. This page includes links to data sources EPAR has frequently used, a series of data analysis tips and considerations drawing from our experience with data analysis projects, and links to resources for research and statistical analysis.

Browse EPAR Data Analyses on our research page, including our recent and ongoing work on curating a series of Agricultural Development indicators using data from the World Bank’s Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA) from Ethiopia, Nigeria, and Tanzania. View publicly available code repositories from past EPAR projects on EPAR’s GitHub page. This code was the basis for our online data tool. The code for the data tool is also available on GitHub.

Common EPAR Data Sources:

Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA): The LSMS-ISA is a comprehensive household survey supported by the World Bank and administered in eight countries in Africa (Burkina Faso, Ethiopia, Malawi, Mali, Niger, Nigeria, Tanzania, and Uganda) in partnership with national government statistical offices. The LSMS-ISA datasets are nationally-representative panel surveys, and include a rich set of agricultural data as well as a household module with a variety of questions on socio-economic status, household welfare, and non-farm income activities.
Food and Agriculture Organization of the United Nations (FAO): The FAO hosts FAOSTAT, a data platform housing global agricultural statistics.
Agricultural Science and Technology Indicators (ASTI): The International Food Policy Research Institute (IFPRI) collects and hosts data on government, higher education, nonprofit, and (where possible) private sector agricultural R&D investment in low- and middle-income countries.
Financial Inclusion Insights (FII): Collected and hosted by Intermedia, the FII surveys capture data on consumer financial behaviors, including trends in mobile money and other digital financial services. The FII includes nationally-representative cross-sections for multiple years for Bangladesh, India, Indonesia, Kenya, Nigeria, Pakistan, Tanzania and Uganda, and is currently expanding its scope to include Benin, Ghana, Rwanda, and Senegal.

Data Analysis Tips and Considerations:

We have compiled a brief with a set of helpful tips and considerations for conducting data analyses, drawing from our research experience. These tips include notes on getting to know your dataset, practices for data cleaning, and preparing and organizing your code and the documentation for your analysis.

Resources for Research and Statistical Analysis:

University of Washington Center for Studies in Demography and Ecology (CSDE): CSDE provides training and resources for data analyses, including occasional courses and workshops on Stata, R, and GIS.
University of Washington Center for Statistics and the Social Sciences (CSSS): CSSS provides free statistical consulting to current UW faculty, staff, and students working on social science problems, at any stage in the research process. They also offer courses in various aspects of statistical analysis at the undergraduate and graduate levels.
Institute for Digital Research and Education (IDRE), UCLA: IDRE provides useful guides on learning and using Stata, the primary statistical software used by EPAR.
UNC Carolina Population Center Stata Tutorial: Includes function-oriented guides for using Stata, focusing on the data-management tasks most needed by data analysts working with sample survey data. It works up from basic tasks, such as how to drop variables, to the tasks needed for complex file organization, such as how to reshape and merge data files. The examples in the guides can be followed by downloading the provided sample data files.
The World Bank provides presentation slides from workshops introducing users to analysis in Stata and to panel analysis in Stata, using household survey data.
StataCorp provides a variety of community-contributed and official resources for learning Stata.
A variety of websites provide free tutorials for using R: Cyclismo, R-tutor, R-bloggers, TryR, etc.
Some paid websites offer a set number of free tutorials for different software packages, including interactive assignments and coding examples: DataCamp, Code School, etc.
The GitHub Guide provides a helpful overview of how to use GitHub for version control. A StataCorp slideshow presents a guide to using GitHub for version control with Stata code.
Abdul Latif Jameel Poverty Action Lab (J-PAL): J-PAL and Innovations for Poverty Action (IPA) curate a research resources page that includes best practices for data and code management and other resources covering research design, measurement and data collection, working with data, transparency, randomization, and software and tools.