Cluster Mapping Project  
Overview
In advanced nations such as the United States, many of the most important influences on competitiveness are found at the regional level. Regional economies are specialized, with each region exhibiting competitiveness in a different mix of industry clusters. A cluster is a geographically proximate group of interconnected companies and associated institutions in a particular field, including product producers, service providers, suppliers, universities, and trade associations.

Measuring the performance and competitive strength of regional economies has been difficult because clusters have not been systematically identified and mapped across all U.S. regions. To address this challenge, Professor Porter and his team have launched the Cluster Mapping Project to define clusters statistically and create objective, detailed profiles of regional economies across the United States. Economies are analyzed at various geographic levels, including states, economic areas, and metropolitan areas. The data presented here are divided into three broad categories: overall regional economic performance, composition of the regional economy, and cluster competitiveness.

The data can be used to identify the most important clusters in the region's economy, compare the region's cluster position with that of other regions, understand the drivers of the region's relative wages, employment growth, and formation of new establishments, and assess the region's patenting performance.


Methodology
The purpose of the Cluster Mapping Project is to assemble a detailed picture of the location and performance of industries in the United States, with a special focus on the linkages or externalities across industries that give rise to clusters.

The raw data for the project are County Business Patterns data (excluding agriculture and government) on employment, establishments, and wages by four-digit Standard Industrial Classification (SIC) code and U.S. county. In addition, U.S. patents, located by the inventor's place of residence, are allocated to industries and clusters using a concordance of technology classifications with SIC codes. Confidentiality limitations mean that the actual data are not disclosed for every county and economic area in every industry. Various techniques are used to compensate for missing data.

Economies are analyzed at various geographic levels, including states, economic areas, metropolitan areas, and counties.

All the industries in the economy are separated into "traded" and "local" based on the degree of industry dispersion across geographic areas. Local industries are those that are present in most if not all geographic areas, are evenly distributed, and hence primarily sell locally. Traded industries are those that are concentrated in a subset of geographic areas and sell to other regions and nations.
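
The project does not state which dispersion measure it uses to separate the two groups, so the following is only a minimal sketch under stated assumptions. It uses the location quotient (an industry's share of regional employment divided by its share of national employment) and labels an industry "local" when its location quotients vary little across regions. The function names, the 0.5 cutoff, and the employment figures are all hypothetical.

```python
from statistics import mean, pstdev

def location_quotients(emp, industry):
    """Return each region's location quotient (LQ) for one industry.

    emp maps region -> {industry: employment}. The LQ is the industry's
    share of regional employment divided by its share of national employment.
    """
    national_total = sum(sum(by_ind.values()) for by_ind in emp.values())
    national_ind = sum(by_ind.get(industry, 0) for by_ind in emp.values())
    national_share = national_ind / national_total
    lqs = {}
    for region, by_ind in emp.items():
        regional_share = by_ind.get(industry, 0) / sum(by_ind.values())
        lqs[region] = regional_share / national_share
    return lqs

def classify_industry(emp, industry, max_cv=0.5):
    """Label an industry 'local' if its LQs are evenly spread, else 'traded'.

    max_cv is a hypothetical cutoff on the coefficient of variation of the
    LQs; it is not a threshold taken from the project.
    """
    lqs = list(location_quotients(emp, industry).values())
    cv = pstdev(lqs) / mean(lqs)
    return "local" if cv <= max_cv else "traded"

# Made-up employment counts for three regions and two industries.
employment = {
    "Region A": {"retail": 100, "aircraft": 90},
    "Region B": {"retail": 110, "aircraft": 5},
    "Region C": {"retail": 95, "aircraft": 0},
}
labels = {ind: classify_industry(employment, ind) for ind in ("retail", "aircraft")}
print(labels)
```

In this toy data, retail employment is spread fairly evenly and is labeled local, while aircraft employment sits almost entirely in one region and is labeled traded.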

Among traded industries, clusters are identified using the correlation of industry employment across geographic areas. The principle is that industries normally located together are those that are linked by some external economies. These industries, then, constitute a cluster.

Clusters are defined initially using state-level data (n=50). The robustness of cluster composition is verified using Economic Area as the geographical unit.

Clusters are constructed using two approaches, which are then reconciled. First, select a prominent "core" industry in a field or part of the economy. Calculate the locational correlations of all other industries with the core. Those industries with statistically significant correlations with the core define the extent of the cluster. Second, calculate locational correlations between all pairs of industries in a general field and potentially related fields. The set of industries with statistically significant and substantial correlations with one another defines the cluster.
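
The first, core-industry approach can be illustrated roughly as follows. This is a sketch under assumptions rather than the project's actual procedure: it uses the Pearson correlation of employment across regions, tests significance with the usual t-statistic for a correlation coefficient, and keeps only positive correlations. The function names and the employment figures are hypothetical, and the critical value is supplied by the caller (about 2.01 for a two-sided 5% test with 50 states, 2.776 for the six made-up regions below).

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (each must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cluster_around_core(emp_by_industry, core, t_critical):
    """Return the core plus industries significantly co-located with it.

    emp_by_industry maps industry -> list of employment by region, with
    regions in the same order for every industry.
    """
    core_series = emp_by_industry[core]
    n = len(core_series)
    members = [core]
    for industry, series in emp_by_industry.items():
        if industry == core:
            continue
        r = pearson(core_series, series)
        if abs(r) >= 1.0:
            # Perfect correlation: the t-statistic is unbounded.
            t_stat = math.copysign(math.inf, r)
        else:
            t_stat = r * math.sqrt((n - 2) / (1 - r * r))
        # Assumption: only positive co-location counts toward a cluster.
        if t_stat > t_critical:
            members.append(industry)
    return members

# Made-up employment in six regions (same region order in every list).
employment = {
    "aircraft": [90, 5, 0, 40, 10, 60],
    "aircraft_engines": [45, 4, 1, 22, 6, 28],
    "footwear": [5, 30, 12, 8, 25, 3],
}
members = cluster_around_core(employment, "aircraft", t_critical=2.776)
print(members)
```

The second approach would apply the same correlation to every pair of industries in a field and look for groups whose members are all strongly correlated with one another.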

In both cases some industries may have spurious correlations to a cluster because of the co-location of several strong clusters in the same geographical area. Spurious correlation is eliminated using Input-Output tables, industry definitions, and industry knowledge.
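
One way to picture the Input-Output check is a filter that keeps a correlated industry only if it has a buyer-supplier link with the core industry. This is a hypothetical sketch: the table layout, the function name, the 1% cutoff, and the shares are all assumptions, and it covers only the Input-Output part of the screening, not the review of industry definitions or the use of industry knowledge.

```python
def drop_unlinked_candidates(candidates, core, io_share, min_share=0.01):
    """Keep candidates that buy from or sell to the core industry.

    io_share[(seller, buyer)] is the fraction of the buyer's inputs that it
    purchases from the seller (a hypothetical layout for an I-O table).
    min_share is an illustrative cutoff, not a value from the project.
    """
    kept = []
    for industry in candidates:
        sells_to_core = io_share.get((industry, core), 0.0)
        buys_from_core = io_share.get((core, industry), 0.0)
        if max(sells_to_core, buys_from_core) >= min_share:
            kept.append(industry)
    return kept

# Made-up input shares: engines are a real input to aircraft, software barely is.
io_share = {
    ("aircraft_engines", "aircraft"): 0.12,
    ("software", "aircraft"): 0.002,
}
kept = drop_unlinked_candidates(["aircraft_engines", "software"], "aircraft", io_share)
print(kept)
```

In practice an industry dropped by such a filter would be a candidate for manual review rather than automatic exclusion, since the text also relies on industry definitions and industry knowledge.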

Note that a given industry can be part of more than one cluster. This sometimes reflects overly broad industry definitions. However, there are also multiple forms of externalities, and some industries are suppliers or customers of many other industries. Thus, overlapping clusters are expected, and their overlaps are economically important.
For inquiries about this project, email iscdata@hbs.edu.