
Map of Ukraine Wheat Exports in 2021

Import and Export Data for Countries: Grain from Ukraine

I’ve been receiving more questions about geospatial data sources as the semester draws to a close. In the next couple of posts I’ll describe some sources that I haven’t used extensively before, beginning with data on bilateral trade: imports and exports between countries. We’ll look at the IMF’s Direction of Trade Statistics (DOTS) and the UN’s COMTRADE database. Both sources provide web-based portals, APIs, and bulk downloading. I’ll focus on the portals.

IMF Direction of Trade Statistics

IMF DOTS provides monthly, quarterly, and annual import and export data, represented as total dollar values for all goods exchanged. The annual data goes back to 1947, while the monthly / quarterly data goes back to 1960. All countries that are part of the IMF are included, plus a few others. Data for exports are published on a Free on Board (FOB) price basis, while imports are published on a Cost, Insurance, Freight (CIF) price basis. Here are definitions for each term, quoted directly from the OECD’s Statistical Glossary:

The f.o.b. price (free on board price) of exports and imports of goods is the market value of the goods at the point of uniform valuation, (the customs frontier of the economy from which they are exported). It is equal to the c.i.f. price less the costs of transportation and insurance charges, between the customs frontier of the exporting (importing) country and that of the importing (exporting) country.

The c.i.f. price (i.e. cost, insurance and freight price) is the price of a good delivered at the frontier of the importing country, including any insurance and freight charges incurred to that point, or the price of a service delivered to a resident, before the payment of any import duties or other taxes on imports or trade and transport margins within the country.

OECD Glossary of Statistical Terms
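
To make the relationship between the two price bases concrete, here is a minimal sketch in Python. The function names and the shipment figures are my own placeholders for illustration; they are not values from DOTS.

```python
def cif_from_fob(fob_value, freight, insurance):
    """CIF value at the importer's frontier: FOB value plus freight and insurance."""
    return fob_value + freight + insurance


def fob_from_cif(cif_value, freight, insurance):
    """FOB value at the exporter's frontier: CIF value less freight and insurance."""
    return cif_value - freight - insurance


# Hypothetical shipment, all amounts in US dollars (illustrative numbers only).
fob = 1_000_000
freight = 45_000
insurance = 5_000

cif = cif_from_fob(fob, freight, insurance)
print(cif)                                    # 1050000
print(fob_from_cif(cif, freight, insurance))  # 1000000
```

This is one reason why the exports that one country reports (FOB) and the imports that its partner reports for the same flow (CIF) generally won't match.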

There are a few different ways to browse and search for data. Start with the Data Tables tab at the top, and Exports and Imports by Areas and Countries. The default table displays monthly exports by region and country for the entire world (you could switch to imports by selecting the Imports CIF tab beside the Exports FOB tab). Hitting the Calendar dropdown allows you to change the date range and frequency. Hitting the Country dropdown lets you select a specific region or country. In the example below, I’ve changed the calendar from months to years, and the country to Ukraine. By doing so, the table now depicts the total US dollar value of exports and imports between Ukraine and all other countries. The Export button at the top allows you to save the report in a number of formats, Excel being the most data friendly option.

IMF DOTS Basic Report – Total Value of Exports from Ukraine, Last Five Years

While this is the quickest option, it comes with some downsides. The biggest one is that the table has no unique identifiers for the countries, which you need if you want to join it to a GIS vector file for mapping, or to another country-level table in a database.

A better approach is to return to the home page and use the Query tab, which allows you to get a unique identifier and filter out countries and regions that are not of interest.

DOTS Query Tab
  1. Under Columns, select the time frame and interval. For example, check Years for Frequency at the top, and change the dropdowns at the bottom from Months to Years. From -5 to 0 would give you the last five years in ascending order.
  2. Rows allows you to filter out countries or regions that you don’t want to see in the results. You can also change the attribute that is displayed. Once the menu is open, right click in an empty area and choose Attribute. Here you can choose a variant country name, or an ISO country code. ISO codes are commonly used for uniquely identifying countries.
  3. Indicator lets you choose Exports (FOB), Imports (CIF or FOB), or Trade Balance, all in US dollars.
  4. Counterpart country is the country or region that you want to show trade for, such as Ukraine in our previous example.
  5. The tabs along the top allow you to produce graphs instead of a table (View – Table), to pivot the table (Adjust), and calculate summaries like sums or averages (Advanced).
  6. Export to produce an Excel file. By choosing the ISO codes you’ll lose the country names, but you can join the result to another country data table or shapefile and grab the names from there.
Modify Time
Modify Rows – Country – Change Attribute
DOTS Modified Table to Export: Total Value of Exports from Ukraine Last Five Years

UN COMTRADE

If you want data on the exchange of specific goods and services, quantities in addition to dollar values, and exchanges beyond simple imports and exports, then the UN’s COMTRADE database will be your source. You need to register to download data, but you can generate previews without having to log in. There is an extensive wiki that describes how to use the different database tools, and summaries of technical terms that you need to know for extracting and interpreting the data. You’ll need some understanding of the different systems for classifying commodities and goods. Your options (the links that follow lead to documentation and code lists) are: the Harmonized System (HS), the Standard International Trade Classification (SITC), and the Broad Economic Categories (BEC). What’s the difference? Here are some summaries, quoted directly from a UN report on the BEC:

The HS classification is maintained by the World Customs Organization. Its main purpose is to classify goods crossing the border for import tariffs or for application of some non-tariff measures for safety or health reasons. The HS classification is revised on a five-year cycle (p. 18)

The original SITC was designed in the 1950s as a tool for collection and dissemination of international merchandise trade statistics that would help in establishing internationally comparable trade statistics. By its introduction in 1988, the HS took over as collection and dissemination tool, and SITC was thereon used mostly as an analytical tool. (p. 19)

The Classification by Broad Economic Categories (BEC) is an international product classification. Its main purpose is to provide a set of broad product categories for the analysis of trade statistics. Since its adoption in 1971, statistical offices around the world have used BEC to report trade statistics in a concise and meaningful way (p. iii). The broad economic categories of BEC include all subheadings of the HS classification. Therefore, the total trade in terms of HS equals the total trade of the goods side of BEC. (p. 18)

Classification of Broad Economic Categories Rev 5 (2018)

In short, go with the BEC if you’re interested in high-level groupings, or the HS if you need detailed subdivisions. The SITC would be useful if you need to go further back in time, or if it facilitates looking at certain subdivisions or groupings that the other systems don’t capture.

From COMTRADE’s homepage, I suggest leaving the defaults in place and doing a basic, preliminary search for all global exports for the most recent year, so you can see basic output on the next screen. Then you can apply filters for a narrower search.

For example, let’s look at annual exports of wheat from Ukraine to other countries. Under the HS filter, remove the TOTAL code. Start typing wheat, and you’ll see various product categories: 6-digit codes are the most specific, while 4-digit codes are broader groups that encapsulate the 6-digit categories. We’ll choose wheat and meslin 1001. We’ll select Ukraine as the Reporter (the country that supplied the statistics and represents the origin point), and for the 1st partner we’ll choose All to get a list of all countries that Ukraine exported wheat to. The 2nd partner country we’ll leave as World (alternatively, you would add specific countries here if you wanted to know if there were intermediary nations between the origin and destination).

UN COMTRADE Refine Search with Filters
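
The nesting of HS codes described above (6-digit codes sit inside 4-digit groups) can be sketched in a few lines of Python. The helper names are my own, and the 6-digit codes in the usage lines are only there to illustrate the prefix logic; look up the actual code lists before relying on any specific code.

```python
def hs_levels(code):
    """Split an HS code into its 2-digit chapter, 4-digit heading and,
    when present, 6-digit subheading."""
    digits = code.replace(".", "").strip()
    if len(digits) not in (2, 4, 6) or not digits.isdigit():
        raise ValueError(f"not a 2-, 4- or 6-digit HS code: {code!r}")
    levels = {"chapter": digits[:2]}
    if len(digits) >= 4:
        levels["heading"] = digits[:4]
    if len(digits) == 6:
        levels["subheading"] = digits
    return levels


def belongs_to(code, parent):
    """True if `code` falls under the broader HS code `parent`."""
    return code.replace(".", "").startswith(parent.replace(".", ""))


print(hs_levels("1001"))             # {'chapter': '10', 'heading': '1001'}
print(belongs_to("100119", "1001"))  # True
print(belongs_to("100610", "1001"))  # False
```

Because the hierarchy is a simple prefix relationship, you can roll detailed 6-digit records up to the 4-digit group by truncating the code.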

Hit Preview to see the results. You can click on a heading to sort by dollar value, weight, or country name. Like IMF DOTS, UN COMTRADE measures dollar amounts of exports as FOB and imports as CIF. At this point, you would need to log in to download the data as a CSV (creating an account is free). You also need to be logged in to generate an extract that has more than 500 records; otherwise the results will be truncated. You could always copy and paste data for shorter extracts directly from the screen to a spreadsheet, but you wouldn’t get any of the extra metadata fields that come with the download, like ISO country codes and the classification codes for goods and merchandise.

COMTRADE Filtered Results – Exports of Wheat and Meslin from Ukraine 2021
Data Exported from COMTRADE to CSV with Identifiers
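
Once you have the CSV, ranking the destination countries takes only a few lines. This is a rough sketch using Python’s standard library; the column names, the WLD aggregate code, and every value in the sample are placeholders I made up, so check the header row of your own download and adjust accordingly.

```python
import csv
import io

# Stand-in for a downloaded extract. Column names and all values are
# placeholders for illustration, not real COMTRADE fields or figures.
SAMPLE = """reporter_iso,partner_iso,partner,hs_code,net_weight_kg,trade_value_usd
UKR,WLD,World,1001,600,300
UKR,AAA,Country A,1001,100,50
UKR,BBB,Country B,1001,300,150
UKR,CCC,Country C,1001,200,100
"""


def top_partners(csv_text, aggregate_codes=("WLD",)):
    """Return (partner_iso, trade_value_usd) pairs sorted from largest to
    smallest value, skipping aggregate rows such as the world total."""
    rows = csv.DictReader(io.StringIO(csv_text))
    partners = [
        (row["partner_iso"], float(row["trade_value_usd"]))
        for row in rows
        if row["partner_iso"] not in aggregate_codes
    ]
    return sorted(partners, key=lambda pair: pair[1], reverse=True)


for iso, value in top_partners(SAMPLE):
    print(iso, value)
```

For a real file you would replace the `io.StringIO` wrapper with `open("your_file.csv", newline="")`.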

Mapping

For data from either source, if you wanted to map it you’d need to have a data table where there is one row for each country with columns of attributes, and with one column that has the ISO country code to serve as a unique identifier. Save the data table in an Excel file or as a table in a database. Download a country shapefile from Natural Earth. Add the shapefile and data table to a project and join them using the ISO code. Natural Earth shapefiles have several different ISO code columns that represent nations, sovereigns, and parent – child relationships; be sure you select the right one. Data table records that represent regions or groupings of countries (e.g. the EU, ASEAN, or the sum of smaller countries per continent not enumerated) will fall out of the dataset, as they won’t have a matching feature in the country shapefile. The map at the top of this post was created in QGIS, using COMTRADE and Natural Earth.
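
I did the join in QGIS, but the logic is easy to sketch in plain Python if you want to check which records will fall out before you map. The field names, codes, and values below are hypothetical stand-ins for the shapefile’s attribute table and the trade table.

```python
def join_by_iso(features, table, key="iso_a3"):
    """Attach table attributes to the features that share an ISO code.

    Returns (joined_features, unmatched_codes), where unmatched_codes lists
    the table codes that have no matching feature.
    """
    by_code = {row[key]: row for row in table}
    joined = []
    matched = set()
    for feature in features:
        row = by_code.get(feature[key])
        if row is not None:
            joined.append({**feature, **row})
            matched.add(feature[key])
    unmatched = sorted(set(by_code) - matched)
    return joined, unmatched


# Placeholder records: two country features and a table that also
# contains one regional grouping ("R01") with no matching feature.
features = [
    {"iso_a3": "AAA", "name": "Country A"},
    {"iso_a3": "BBB", "name": "Country B"},
]
table = [
    {"iso_a3": "AAA", "value_usd": 50},
    {"iso_a3": "BBB", "value_usd": 150},
    {"iso_a3": "R01", "value_usd": 400},
]

joined, unmatched = join_by_iso(features, table)
print(len(joined))  # 2
print(unmatched)    # ['R01']
```

Listing the unmatched codes is a quick way to confirm that only regions and groupings were dropped, and not a country whose code differs between the two sources.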

UN ICSC Retail Price Index Map

UN Retail Price Index Time Series

We recently launched our fledgling geodata portal on GitHub for the open datasets we’ll create in our new lab. In the spring we carved out a space on the 11th floor of the Sciences Library at Brown which we’ve christened GeoData@SciLi, a GIS and data consultation and work space. We’ll be doing renovations on both the webspace and workspace over the summer.

Our inaugural dataset was created by Ethan McIntosh, a senior (now graduate) who began working with me this spring. The dataset is the United Nations International Civil Service Commission’s (UN ICSC) Retail Price Indices with Details (RPID). The index measures the cost of living based on several categories of goods and services in duty stations around the world. It’s used to adjust the salaries of the UN’s international staff relative to UN headquarters in New York City (index value of 100 = cost of living in New York). The data is updated six times a year, published in an Excel spreadsheet that contains a macro that allows you to look up the value of each duty station via a dropdown menu. The UN ICSC makes the data public by request; you register and are granted access to download the data in PDF and Excel format, in files that are packaged one month / year at a time.

We were working with a PhD student in economics who wanted to construct a time-series of this data. Ethan wrote a Python script to aggregate all of the files from 2004 to present into a single CSV; the actual values for each country / duty station were stored in hidden cells that the macro pulled from, and he was able to extract them directly from these cells. He parsed the data into logical divisions, and added the standard 3-letter ISO 3166 country code to each duty station so that each record now has a unique place identifier. His script generates three outputs: a basic CSV of the data in separate month / year files, a “long” (aka flat) time series file where each record represents a specific duty station and retail index category or weight for a given month and year, and a “wide” time series file where the category / weight has been pivoted to a column, so each record represents all values for a duty station for a given month / year. He’s written the program to process and incorporate additional files as they’re published.
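
To illustrate the difference between the long and wide layouts, here is a minimal sketch of the pivot in plain Python. It is not Ethan’s script; the field names and values are placeholders I chose for the example.

```python
from collections import defaultdict

# Placeholder long-format records: one row per station, period and category.
long_rows = [
    {"iso": "AAA", "station": "City A", "period": "2022-02", "category": "total", "value": 90.0},
    {"iso": "AAA", "station": "City A", "period": "2022-02", "category": "housing", "value": 85.0},
    {"iso": "BBB", "station": "City B", "period": "2022-02", "category": "total", "value": 70.0},
    {"iso": "BBB", "station": "City B", "period": "2022-02", "category": "housing", "value": 60.0},
]


def long_to_wide(rows, id_fields=("iso", "station", "period"),
                 name_field="category", value_field="value"):
    """Pivot long records so each category becomes its own column, giving
    one record per unique combination of the id fields."""
    wide = defaultdict(dict)
    for row in rows:
        key = tuple(row[field] for field in id_fields)
        wide[key][row[name_field]] = row[value_field]
    return [dict(zip(id_fields, key), **values) for key, values in wide.items()]


for record in long_to_wide(long_rows):
    print(record)
```

The four long rows collapse into two wide records, one per duty station and month, each with a `total` and a `housing` column.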

While the primary intention was to study this data as a time series in a statistical analysis, it can also be used for geospatial analysis and mapping. Using the wide file, I created the map in the header of this post, which depicts the total retail index for February 2022 for each country, where the value represents the duty station within the country (usually the capital city). I grabbed some boundaries from Natural Earth and joined the data to it using the ISO code. I classified the data using natural breaks, but manually adjusted the top level category to include all countries with a value greater than or equal to the base value of 100.
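
The classification step can be sketched like this. The first three break values are hypothetical stand-ins for whatever the natural breaks algorithm returns (I let QGIS compute them); only the last one, the base value of 100, reflects the manual adjustment described above.

```python
from bisect import bisect_right


def classify(value, breaks):
    """Return the 0-based class index for `value`, where `breaks` holds the
    ascending lower bounds of every class after the first."""
    return bisect_right(breaks, value)


# Hypothetical lower bounds: the first three stand in for natural-breaks
# output, the last is set by hand to the base value of 100.
breaks = [70, 80, 90, 100]

print(classify(52, breaks))   # 0
print(classify(89, breaks))   # 2
print(classify(100, breaks))  # 4
print(classify(124, breaks))  # 4
```

Because `bisect_right` treats each break as an inclusive lower bound, a station at exactly 100 lands in the top class along with everything more expensive than New York.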

There were only five duty stations that were more expensive than New York, with values between 102 and 124: Tokyo, Ashkhabad (Turkmenistan), Singapore, Beirut, and Hong Kong. Beijing and Geneva were equivalent in price at 100. The least expensive stations with values between 52 and 69 were: Caracas (Venezuela), Tripoli, Damascus, Ankara (Turkey), Bucharest (Romania), Mbabane (Eswatini – formerly Swaziland), and Sofia (Bulgaria). There appears to be regional clustering of like values, although I didn’t run any tests. The station in the US that’s measured relative to NYC is Washington DC (index value of 89).

The final datasets and the code used to generate them are available on GitHub, and we’ll update them at least once, if not a couple of times, a year. We are not providing the original month / year macro spreadsheets; if you want those you should register with the UN ICSC and access them there. If you’re using our data files, you should still register with them, as they would like to be aware of how their data is being used.

We will post additional projects, datasets, and code in individual repos as we create them, linked to from our main page. I’m working on creating a basic metadata profile for our lab, so we’ll provide structured metadata for each of our datasets in the near future.