Frequently Asked Questions

Due to the high volume of queries we receive, we can only answer questions directly related to the Enterprise Surveys website and online portal. Please include the name of your organization and the country in which you reside.

These are the most frequently asked questions:

ABOUT THE ENTERPRISE SURVEYS

How many countries and firms have been surveyed?

Currently, over 158,000 firms in 144 countries have been surveyed following the Enterprise Surveys Global Methodology. For details about the countries surveyed, sample characteristics, and year of data collection, see the Sample Description document. In addition, older surveys that were not conducted using a standard methodology are hosted on the Enterprise Surveys website. Emerging economies are the primary focus, and a few developed economies have been surveyed for comparative purposes. Survey data from 2006 onwards is available in the Enterprise Surveys Data Portal. Data from older surveys (with the exception of BEEPS data) may be available from the Archive, which is located inside the Data Portal.

How often are the surveys performed? Are you currently collecting data?

Survey data from 2006 onwards is available in the Enterprise Surveys Data Portal. Data from older surveys (with the exception of BEEPS data) may be available from the Archive, which is located inside the Data Portal.

Where can I find the most recent survey documentation?

Questionnaires, sampling notes, and other survey documentation are available on the Methodology page. Information about the implementation of specific surveys can be found in the Enterprise Surveys Data Portal along with the raw datasets. Each survey is accompanied by an implementation report detailing some aspects of fieldwork.

Does the Enterprise Survey unit maintain a bibliography of research papers that make use of the surveys?

Yes, the list is compiled on an ongoing basis but may not be comprehensive. If you have research that is not in the list, please send an email to enterprisesurveys@worldbank.org and it will be added to the list.

How often is the website updated?

The website is updated approximately every month with new survey data. Research products such as Enterprise Notes and Working Papers are posted periodically.

In what year did the Enterprise Surveys begin?

The World Bank's Enterprise Surveys began in 2002 and were conducted by different units within the World Bank (previous names include "PICS" or "Investment Climate Surveys"). Since 2005-06, most data collection efforts have been centralized within the Enterprise Analysis Unit. Centralization of the survey implementation has resulted in a unified set of core survey questions and a consistent application of survey methodology across countries.

What is the best way to become familiar with the data and the website?

The Questionnaire Codebook is a good place to start. The Excel tables contain most of the core questions that are currently asked of firms. Another way to get a sense of the overall survey themes and types of questions asked is to explore the different topic pages on the website (upper right-hand corner of the homepage). Users with knowledge of Stata or other statistical software packages are encouraged to become registered users of the Data Portal.

What types of establishments do the Enterprise Surveys cover?

The manufacturing and services sectors are the primary business sectors of interest. This corresponds to firms classified with ISIC codes 15-37, 45, 50-52, 55, 60-64, and 72 (ISIC Rev.3.1). Formal (registered) companies with 5 or more employees are targeted for interview. Services firms include construction, retail, wholesale, hotels, restaurants, transport, storage, communications, and IT. Firms with 100% government/state ownership are not eligible for interview. The Methodology page contains more detailed information.
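
As a rough illustration of the eligibility criteria above, the following Python sketch checks a single firm against them. The function names and arguments are hypothetical and are not part of any Enterprise Surveys tool; labeling divisions 15-37 as manufacturing follows ISIC Rev. 3.1, and the Methodology page remains the reference for the actual screening rules.

```python
# Illustrative sketch only: all names and arguments here are hypothetical.
# Two-digit ISIC Rev. 3.1 divisions listed in the answer above.
MANUFACTURING = set(range(15, 38))  # divisions 15-37
SERVICES = {45, 55, 72} | set(range(50, 53)) | set(range(60, 65))


def isic_division(isic_code):
    """Return the two-digit division of a 4-digit ISIC code (int or str)."""
    return int(str(isic_code).zfill(4)[:2])


def sector(isic_code):
    """Return 'manufacturing', 'services', or None if the code is not covered."""
    division = isic_division(isic_code)
    if division in MANUFACTURING:
        return "manufacturing"
    if division in SERVICES:
        return "services"
    return None


def is_eligible(isic_code, employees, registered, state_ownership_pct):
    """Covered sector, formally registered, 5+ employees, not 100% state-owned."""
    return (
        sector(isic_code) is not None
        and registered
        and employees >= 5
        and state_ownership_pct < 100
    )


print(is_eligible(1511, 20, True, 0))    # a registered manufacturing firm
print(is_eligible("0111", 50, True, 0))  # an agricultural firm (division 01)
```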

Where can I find the Investment Climate Assessments (ICAs)?

The Enterprise Analysis Unit is not involved in preparing ICAs even though the data from our surveys are used in the reports. To access the reports, we suggest browsing the publications on the World Bank website as well as contacting the regional units (e.g. Africa or South Asia).

How does Enterprise Surveys differ from Doing Business?

Enterprise Surveys and Doing Business are complementary but different approaches to benchmarking the quality of the business environment across countries.

What is BEEPS?

Enterprise Surveys implemented in Eastern European and Central Asian countries are also known as Business Environment and Enterprise Performance Surveys (BEEPS) and are jointly conducted by the World Bank and the European Bank for Reconstruction and Development.

COUNTRY COVERAGE

What years of data are available for different countries?

The list of countries surveyed (and the years of data collection) can be found in this Excel document.

Where do I locate the raw datasets from older surveys (surveys fielded prior to 2005)?

Unfortunately, the Enterprise Analysis team cannot provide access to, or support for, many of the surveys conducted prior to 2005 in their raw form. There is, however, a standardized dataset spanning 2002-2005 on the Data Portal that contains a core set of matched variables. Please inquire at enterprisesurveys@worldbank.org to see if a specific survey is available.

What is panel data and for which countries is it available?

Panel data is survey data for the same firm over multiple years. For example, a business in Brazil was interviewed in an Enterprise Survey conducted in 2003 and it was also interviewed in the Enterprise Survey conducted in 2009. Panel datasets can be downloaded in the Enterprise Surveys Data Portal. Panel datasets consist of the multiple waves of survey data where the firm-level records are "stacked on top of each other" across the multiple years. Panel datasets exist for most ECA countries, some African countries, and a few South and East Asian countries. The Sample Description Excel document lists which countries have panel datasets.
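
To make the "stacked" layout concrete, here is a small Python sketch with made-up records. The field names (panel_id, year, sales) and all values are hypothetical and do not correspond to actual variables in the panel datasets.

```python
from collections import defaultdict

# Hypothetical firm-level records from two survey waves in one country.
wave_2003 = [
    {"panel_id": 1, "year": 2003, "sales": 100.0},
    {"panel_id": 2, "year": 2003, "sales": 250.0},
    {"panel_id": 3, "year": 2003, "sales": 80.0},
]
wave_2009 = [
    {"panel_id": 1, "year": 2009, "sales": 140.0},
    {"panel_id": 3, "year": 2009, "sales": 95.0},
]

# "Stacking" simply appends the records of one wave below the other.
stacked = wave_2003 + wave_2009

# A firm belongs to the panel if it appears in more than one wave.
years_by_firm = defaultdict(set)
for record in stacked:
    years_by_firm[record["panel_id"]].add(record["year"])

panel_firms = sorted(
    firm for firm, years in years_by_firm.items() if len(years) > 1
)
print(panel_firms)
```

In this made-up example, firm 2 was interviewed only in 2003, so it stays in the stacked file but is not a panel firm.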

Do you have data on any industrialized nations such as the UK and the US?

The Enterprise Analysis Unit's research focuses on how the business environment affects firm performance in emerging economies. Data from a limited number of high-income countries are available (e.g., Germany, Ireland, Hungary and Spain).

How many total firms exist or are eligible for an Enterprise Survey in each country (a.k.a. Universe figures/estimates)?

Universe figures (or estimates) are available in the Implementation Reports in the Enterprise Surveys Data Portal along with each raw dataset. Usually the Universe figures are obtained from a country's national statistics agency (directly or from the agency's website). When Universe figures are unavailable from the government, the Enterprise Surveys team usually estimates the number of eligible firms for survey participation (especially when undertaking block enumeration to build a sample frame).

Is the survey done in China in 2003 available on your website?

We are unable to support many of the older datasets in their raw form. There is, however, a standardized dataset spanning 2002-2005 on the Data Portal that contains China 2003 and other countries with a core set of matched variables. If you wish to use the original raw dataset, we suggest you contact the Task Manager for the China 2003 survey directly.

Why aren't countries surveyed more frequently, and why aren't as many countries surveyed as in Doing Business?

The Doing Business data are derived from laws/regulations and the experiences of experts in each country for a narrow set of questions, whereas the Enterprise Survey data are based on answers to several questions by hundreds of firms in each country. The extensive resources required to conduct an Enterprise Survey and the danger of survey fatigue on the part of survey respondents make it prohibitive to conduct surveys more often than every 3 to 4 years. Resource constraints often prohibit the implementation of Enterprise Surveys in developed economies.

How extensive is the geographical coverage within a country?

In each survey project, the sample design aims to include the main cities/regions of economic activity. The actual number of cities depends on the size of the economy. Major cities include the city itself as well as the surrounding areas.

DATA AND VARIABLES

How can I access the questionnaires used for specific country surveys? Can I access the raw data?

Registered users can download the data and corresponding survey questionnaire for each country in the Enterprise Surveys Data Portal. Both registration and downloading data are free.

Which aggregation level do you use for the indicators on your website?

Indicators are created using weighted (weight=w_median) data. For each country for a particular survey question, the indicator is created using the weighted average (or the weighted percentage of firms that responded 'Yes') across firms. Some website indicators are based on the combination of two survey questions. The website also allows users to view indicators by strata variables (firm size, geographic location within a country) and also by a few ex-post variables (exporter status, foreign ownership, gender of top manager). Some indicators are based on questions only asked of manufacturing firms. The Methodology page has a listing of all website indicators. Note that when indicators are presented on the website for a broad geographic region or income group (e.g. Africa region or "Upper Middle Income"), a simple average is computed using the relevant, available country-level indicators.
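
The aggregation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, with made-up answers, weights, and country values; the function names are hypothetical and the website indicators themselves are computed by the Enterprise Surveys team.

```python
def weighted_mean(values, weights):
    """Weighted average of a numeric survey answer across firms."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def weighted_pct_yes(answers, weights):
    """Weighted percentage of firms that answered 'Yes' (True)."""
    yes_weight = sum(w for a, w in zip(answers, weights) if a)
    return 100.0 * yes_weight / sum(weights)


def simple_average(country_indicators):
    """Unweighted mean of country-level indicators for a region or income group."""
    return sum(country_indicators.values()) / len(country_indicators)


# Hypothetical firms in one country: yes/no answers and sampling weights.
answers = [True, False, True]
weights = [10.0, 30.0, 60.0]
country_value = weighted_pct_yes(answers, weights)

# Hypothetical country-level indicators for one region.
region_value = simple_average(
    {"Country A": country_value, "Country B": 40.0, "Country C": 55.0}
)
print(country_value, region_value)
```

Note the two different steps: firms are weighted within a country, but countries are averaged with equal weight within a region or income group.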

What types of questions are asked in the surveys?

The Methodology page has the most recent versions of the global questionnaires. The Enterprise Survey covers a wide range of business environment topics including general business characteristics, infrastructure and services, sales and supplies, access to finance, degree of competition, land, crime, business-government relations, investment climate constraints, labor, and productivity. There are manufacturing-specific questions as well as a few retail-specific questions. In collaboration with economists in the regional departments of the World Bank, every Enterprise Survey is customized to include country-specific questions (or region-specific questions). The questions are mostly objective questions aimed at measuring the quality of the business environment and the experience of firms. Fewer than 10% of the questions are subjective, that is, they ask the respondent for his/her opinion. Answers are mostly of the following types: yes/no, a percentage or monetary amount, days required to obtain a service, number of times a particular event has occurred, or a 5-point Likert scale.

What is the difference between the country data and the standardized data?

Country data includes all questions that were asked in a survey but may lack comparability across countries and years. Standardized data is country data that has been matched to a standard set of questions. This format allows cross-country comparisons and analysis but sacrifices those country-specific survey questions which cannot be matched. The standardization process requires that certain compromises are made in order to match some of the variables. Thus, we encourage our users to pay close attention to the actual wording of the survey questions and to use the raw country datasets for their analysis.

What monetary values and units are used in the surveys?

For most countries, monetary values are reported in local currency units. When downloading raw data from the portal, data users are encouraged to download the accompanying survey questionnaires and documentation which provide the exact wording of each survey question. No adjustments are made to inflate/deflate the reported monetary values for costs and sales figures.

I am interested in occupational safety and health-related questions. Why are they not included in your surveys?

Unfortunately, some questions are not included in our questionnaire because it is already quite lengthy. The current set of questions takes about an hour to administer, and adding new questions may increase both item and unit non-response. If you have suggestions for improving existing questions or adding new questions to the survey instrument, please send us an email with your suggestions and we will consider them for potential inclusion in future surveys.

The indicator "% of Women in Senior Positions" was in the old indicator list but is not included in the most recent one. Has this indicator been replaced?

The indicator "% of Women in Senior Positions" has been replaced by the indicator "% of Female Permanent Full-Time Non-Production Workers". We have also added another new indicator: "% of Firms with Female Top Manager" (question B.7a).

What is the correct way to use the weights in the full data?

The weights in the more recent Enterprise Surveys data are probability weights. Using these weights allows inference on the population of non-agricultural private firms (that meet the Enterprise Surveys eligibility criteria) in a country. In Stata, a survey design should be declared before performing any analysis. Specifically, this command should be used:
svyset idstd [pweight=wt], strata(strata) singleunit(scaled)
The survey commands using 'svy' should be used in calculating any statistics to be interpreted for the population of non-agricultural private firms. For statistics related to specific types of firms, analysts should use the subpopulation option in Stata.
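
For intuition about what a probability weight represents, the Python sketch below assumes each weight is the inverse of a firm's selection probability, so a firm with weight 40 stands in for roughly 40 firms in the population. The records and function names are hypothetical. The sketch reproduces point estimates only; design-based standard errors still require survey tools such as Stata's svy commands.

```python
# Hypothetical sample records: (is_exporter, probability_weight).
# Assumption: weight = 1 / probability that the firm was selected.
sample = [(True, 12.5), (False, 40.0), (False, 40.0), (True, 7.5)]


def estimated_population(records):
    """Sum of weights: estimated number of eligible firms in the population."""
    return sum(weight for _, weight in records)


def estimated_subgroup_share(records):
    """Estimated share of population firms in the flagged subgroup."""
    subgroup_weight = sum(weight for flag, weight in records if flag)
    return subgroup_weight / estimated_population(records)


print(estimated_population(sample))
print(estimated_subgroup_share(sample))
```

Half of the sampled firms here are exporters, yet the weighted share is much smaller, which is why unweighted tabulations should not be interpreted as population figures.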

Do you provide a correspondence table between the question numbers in some countries' questionnaires (e.g. A0, A1, B1, B2…) and the variables in Stata data files (e.g. idstd, a0, a1, a2…)?

We do not provide such correspondence tables, but they can be made by hand by matching the questionnaires available on the Portal to the variables in the datasets. There are, however, standardized datasets spanning 2002-2005 and 2005-Present on the Data Portal that contain a core set of matched variables.

How is a firm's business activity classified? Are standard industry/sector codes used?

Most of the new surveys conducted after 2006 contain an ISIC Rev. 3.1 industry code, a 4-digit code used to describe a firm's business activity (question D.1a2). Question D.1a1 contains the text description of the business' main product. Please see the raw data available on the Enterprise Surveys Data Portal for details.

Where can I find the labels for region variables?

When the sampling methodology calls for stratification by geographic location (within a country), the region labels can be found in the dataset itself or the questionnaire instrument will list the geographic locations along with the coding scheme. The implementation report will describe how many interviews were conducted in the various geographic locations.

How is "principal ownership" defined in the Enterprise Surveys?

The term principal owner is only used in the older surveys when asking about female participation in ownership. Newer surveys use a slightly different wording to avoid ambiguity: "Amongst the owners of the firm, are there any females?"

I am having trouble opening the data even though I have Stata. Why might this be?

You may be using an older version of Stata (we are currently using Stata/SE v.13). Also, Stata needs sufficient memory to open some of the large datasets. Note that Stata cannot open a zipped file unless it is first extracted to another location. If you still have problems, try re-downloading the data in case it was corrupted during the download.

What is the format of the data and can it be converted into other formats?

The data are in Stata 13 format and may be converted into other formats such as SPSS, SAS or Access using a translation program (e.g., Stat/Transfer or DBMS). However, some of the data attributes (e.g., labels) might be lost in some of the non-native formats.

What is the unit of measurement for the indicator "Average total time of power outages per month"? What is its relationship with "Average duration of power outages (hours)"?

The indicator "average total time of power outages per month" is the average total hours of being without power during a month. It is calculated by multiplying (at the firm level) the number of outages by the duration of the average outage. Then, an average is computed at the country level using this new measure. The "average duration of power outages (hours)" is the average duration of one power outage.
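
The order of operations matters here: the multiplication happens at the firm level before averaging. The Python sketch below uses made-up firms and, for simplicity, unweighted averages (the website indicators use sampling weights); the variable names are hypothetical.

```python
# Hypothetical firms: (number of outages per month, average outage duration in hours).
firms = [(2, 1.0), (10, 3.0), (4, 2.0)]

# Step 1: total hours without power per month, computed for each firm.
firm_total_hours = [count * duration for count, duration in firms]

# Step 2: country-level average of the firm-level totals.
avg_total_time = sum(firm_total_hours) / len(firms)

# The separate indicator: average duration of a single outage.
avg_duration = sum(duration for _, duration in firms) / len(firms)

# For comparison only: multiplying the two country-level averages
# generally gives a different number than averaging firm-level totals.
avg_count = sum(count for count, _ in firms) / len(firms)
product_of_averages = avg_count * avg_duration

print(avg_total_time, avg_duration, product_of_averages)
```

Because firms with many outages may also have longer outages, the published "total time" indicator cannot in general be recovered by multiplying the country-level averages of the two components.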

Why are some variables not available in the standardized data sets?

Due to the natural evolution of questionnaire design and survey methodology over time, some variables may not be available in certain countries. Datasets that contain more than one country (across different regions) will likely lack the variables from country-specific questions.

Why are the summary statistics results I calculated different from what you presented on your website?

The indicators shown on our website are computed using sampling weights. The sampling weights are available in most datasets. Some datasets have more than one weight calculated based on different assumptions about the data. Whenever this is the case, we use the median weights. To use the sampling weights, use the survey commands in Stata as demonstrated below:
svyset idstd [pweight=wmedian], strata(strata) singleunit(scaled)
svy: tab k8 if a1==79
Note that the weight variable may be called 'wt', 'weight', 'w_median', etc., depending on the country and dataset. Another way to get the same result is shown below. Note that many commands do not support p-weights, but you can often 'cheat' by using the i-weights specification instead:
tab k8 if a1==79 [iw=wmedian]
Note that for cross-country comparability, we remove "outliers" (values more than 3 standard deviations from the mean) for some of the continuous, unbounded variables. As a result, website summary statistics may differ from what you compute on your own. For more information, see the Methodology page.
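
As a rough guide to reproducing this step, the Python sketch below drops values more than 3 standard deviations from the mean. It is a sketch under stated assumptions, not the team's actual procedure: it assumes an unweighted mean and sample standard deviation computed over a single list of values, and the function name and data are hypothetical. The Methodology page describes which variables are affected.

```python
import statistics


def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean.

    Assumption: unweighted mean and sample SD over the supplied values.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Hypothetical answers to a continuous, unbounded question, with one extreme value.
reported = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11] * 2 + [1000]
trimmed = drop_outliers(reported)

print(len(reported), len(trimmed))
print(statistics.mean(reported), statistics.mean(trimmed))
```

A single extreme value can move an average substantially, so applying (or not applying) this filter is a common reason for differences between your own statistics and the website indicators.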