USDA Forest Service
 Northeastern Forest Inventory & Analysis
Forest Inventory & Analysis Program
11 Campus Blvd.
Suite 200
Newtown Square, PA 19073-3294

(610)557-4075
(610)557-4250 FAX
(610)557-4132 TTY/TDD


GIS/ Spatial Statistics

Species Distribution Maps - Methodology

Andrew Lister and Rachel Riemann
1/17/02

How were the maps made?

Rationale

The Forest Inventory and Analysis (FIA) Units of the USDA Forest Service seek to improve the understanding of the nation's forests with respect to their quantity, distribution, use and health. There has recently been a great deal of interest in producing high-quality, high-resolution maps of forest inventory information, such as species distributions (Riemann Hershey et al. 1997, Moeur and Riemann Hershey 1999, Iverson et al. 1999), pockets of high-value commercial trees (King 2000), and forest distribution (Zhu 1994).

Basics of Geostatistics

In order to create these maps, we implemented a technique called Sequential Gaussian Conditional Simulation (SGCS).  SGCS is a method from the field of geostatistics, a branch of statistics that is used to characterize spatial distributions and to produce estimates of variables at unsampled locations.  The idea behind geostatistics is quite simple:  samples taken closer together are more similar on average than samples taken farther apart.  For example, if you have a grid of points over an area, and measure some variable at each point:

[Figure: Area Distribution]

you might find that, on average, points that are right next to each other have values that are more similar than points that are 2 or 3 units apart. Many environmental variables, such as soils, climate, tree growth, species distribution, and topography, show this characteristic.

 

Using this principle, we can make a mathematical model (variogram) of how dissimilarity changes with distance: 

[Figure: Variogram]

As the separation distance increases, the dissimilarity, shown on the vertical axis, increases and then levels off. This mathematical model is then used in a procedure known as kriging (named after Danie Krige, a South African mining engineer) to help estimate values for variables at unsampled locations. SGCS uses kriging in a simulation framework to produce distributions of possible estimates at a given location.
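To make the idea of a variogram concrete, here is a minimal sketch in Python with NumPy. It is an illustration only and not the software used to produce the maps. The function names, the distance bins, and the choice of a spherical model are all assumptions made for this example.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Mean semivariance 0.5 * (z_i - z_j)**2 of point pairs, grouped by distance."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)        # each unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)     # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = (dist > bin_edges[b]) & (dist <= bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = semivar[in_bin].mean()
    return gamma

def spherical_model(h, nugget, sill, range_):
    """Spherical variogram model: rises from the nugget, levels off at the sill."""
    h = np.asarray(h, dtype=float)
    rising = nugget + (sill - nugget) * (1.5 * h / range_ - 0.5 * (h / range_) ** 3)
    return np.where(h >= range_, sill, np.where(h > 0, rising, 0.0))
```

The first function gives the points of the curve (average dissimilarity per distance class). The second is one common model shape that could be fitted to those points. In practice the nugget, sill, and range would be chosen by fitting the model to the empirical values.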

 

One advantage of kriging is that the mathematical model (variogram), which is really a regression equation, can be used to help estimate the value of an unknown point separated by a known distance from other known points.  Weights are placed on known points surrounding a point to be estimated based on the value of the variogram associated with the separation distance, and the estimate is then produced by finding a weighted average of the surrounding points’ values.  In other words, since we know how close a point to be estimated is to a known point, and we know from the model how similar it is to that known point, we know how much influence the known point’s value should have on that estimate’s value. 
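The weighting idea can be sketched as a small ordinary kriging example in Python with NumPy. This is a sketch under stated assumptions, not the implementation behind the maps. The exponential variogram model and its `sill` and `range_` values are hypothetical, and a real application would restrict the calculation to a search neighborhood around the point being estimated.

```python
import numpy as np

def variogram(h, sill=1.0, range_=10.0):
    """Hypothetical exponential variogram model with no nugget."""
    return sill * (1.0 - np.exp(-3.0 * np.asarray(h, dtype=float) / range_))

def ordinary_kriging(coords, values, target):
    """Return (estimate, weights) for one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Variogram values between every pair of known points.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0                   # extra row/column forces weights to sum to 1
    # Variogram values between each known point and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1))
    solution = np.linalg.solve(A, b)
    weights = solution[:n]          # last entry is the Lagrange multiplier
    return weights @ values, weights

# Two known points, target exactly halfway between them.
estimate, weights = ordinary_kriging([[0, 0], [2, 0]], [10.0, 20.0], [1, 0])
```

In this symmetric case both known points get the same weight, so the estimate is their simple average. Moving the target closer to one point would shift weight toward that point.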

 

Kriging and its variants (e.g., cokriging, kriging with local means, indicator kriging, and SGCS) have been used for years in the mining industry, but only recently in the forestry community. To produce the species importance maps on these pages, we implemented SGCS in the following manner:

 

Procedures

  1. We retrieved basal area data from our FIA plot database.
  2. We calculated importance for each species for each plot as the percentage of a given plot’s total basal area that is made up of a given species’ basal area (i.e., relative importance, or relative dominance).  We only used forested plots in the analysis, and we included saplings.
  3. Next, we performed some data preprocessing, and created the mathematical model describing the relationship between dissimilarity and separation distance (i.e., the variogram).
  4. The next step was to implement the simulation:  we created multiple (100) kriged maps using a combination of original data values and previously simulated data values.  By doing this, at each location on the map (each pixel of the final map image), we created a distribution made up of 100 estimates.
  5. We next analyzed this distribution of 100 estimates with the goal of finding one that best describes the FIA data’s county-level estimates.  For example, from this distribution of estimates, we might have chosen the mean to report as the final value.  If various assumptions of the technique are met, this mean is the best estimate given everything we know about the data (see the histograms below).
  6. Finally, we calculated what is called the interquartile range (IQR) of the distribution of estimates (see below). This statistic is the difference between the 75th and the 25th percentiles of a data distribution. For example, if the value below which 25% of the estimates fall (the 25th percentile) is 0.31, and the value below which 75% of the estimates fall (the 75th percentile) is 0.86, then the IQR is 0.86 - 0.31, or 0.55 (55%). Large IQRs indicate wide distributions (less confidence in the reported estimate) and small IQRs indicate narrow distributions (more confidence in the reported estimate).

[Figure: Interquartile range (IQR)]
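The importance calculation in step 2 of the procedure can be sketched in a few lines of Python. The record layout `(plot_id, species, basal_area)` and the function name are assumptions made for this example. The actual FIA database structure is not described on this page.

```python
from collections import defaultdict

def relative_importance(tree_records):
    """Percent of each plot's total basal area contributed by each species.

    tree_records: iterable of (plot_id, species, basal_area) tuples
    (a hypothetical layout, one record per tree or sapling).
    """
    by_plot = defaultdict(lambda: defaultdict(float))
    for plot_id, species, basal_area in tree_records:
        by_plot[plot_id][species] += basal_area
    importance = {}
    for plot_id, species_ba in by_plot.items():
        total = sum(species_ba.values())
        if total > 0:               # skip plots with no recorded basal area
            importance[plot_id] = {
                sp: 100.0 * ba / total for sp, ba in species_ba.items()
            }
    return importance

records = [
    (1, "red maple", 2.0),
    (1, "white oak", 6.0),
    (1, "red maple", 2.0),
]
result = relative_importance(records)
```

Here red maple accounts for 4 of the plot's 10 units of basal area, so its importance on plot 1 is 40%, and white oak's is 60%.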

 

One useful aspect of the simulation procedure is that the error estimate for a given point is not only based on the density of surrounding samples, but also on how similar those surrounding samples are.  The resulting “error maps”, which are associated with the estimates, are useful when evaluating the final map product.  Furthermore, we don’t necessarily need to choose the mean value for our final estimate; we can tailor it to the goals of our study by choosing the median, or any other percentile.
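The simulation and summary steps can be illustrated with a deliberately simplified one-dimensional sketch in Python with NumPy. It is not the procedure used for the maps. It assumes the data have already been transformed to standard normal scores, uses simple kriging with a hypothetical exponential covariance model, conditions on all available points rather than a search neighborhood, and omits the back-transformation to the original units.

```python
import numpy as np

def covariance(h, sill=1.0, range_=10.0):
    """Hypothetical exponential covariance model for normal-score data."""
    return sill * np.exp(-3.0 * np.abs(h) / range_)

def simulate_once(grid_x, data_x, data_z, rng):
    """One realization: visit grid nodes in random order, krige, then draw."""
    known_x, known_z = list(data_x), list(data_z)
    result = np.empty(len(grid_x))
    for idx in rng.permutation(len(grid_x)):
        x0 = grid_x[idx]
        xs, zs = np.array(known_x), np.array(known_z)
        C = covariance(xs[:, None] - xs[None, :])
        c0 = covariance(xs - x0)
        # Tiny diagonal term added for numerical stability.
        w = np.linalg.solve(C + 1e-10 * np.eye(len(xs)), c0)
        mean = w @ zs                                 # simple kriging, global mean 0
        var = max(covariance(0.0) - w @ c0, 0.0)      # simple kriging variance
        result[idx] = rng.normal(mean, np.sqrt(var))
        # The simulated value conditions all later nodes in this realization.
        known_x.append(x0)
        known_z.append(result[idx])
    return result

rng = np.random.default_rng(0)
grid_x = np.arange(20.0)                  # nodes to estimate
data_x = np.array([3.5, 9.5, 15.5])       # sample locations (made-up values)
data_z = np.array([-1.0, 0.5, 1.2])       # normal-score sample values (made up)

sims = np.array([simulate_once(grid_x, data_x, data_z, rng) for _ in range(100)])

# Summaries of the 100 estimates at each node.
mean_map = sims.mean(axis=0)
median_map = np.median(sims, axis=0)
q25, q75 = np.percentile(sims, [25, 75], axis=0)
iqr_map = q75 - q25                       # wider IQR = less certain estimate
```

Each row of `sims` plays the role of one simulated map, and each column holds the distribution of estimates at one location. Nodes far from any sample should tend to show a larger `iqr_map` value than nodes next to a sample.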

 

These maps are useful as graphical representations of the spatial variation in species importance, as inputs into geospatial models, or as tools that can help guide other studies.  We would like to emphasize that these are estimates, and that there are varying levels of uncertainty associated with the estimates.  The IQR maps displayed on the pages can give you an idea of which estimates are more uncertain than others.  We recommend that before using these maps for any purpose, you contact us for advice and for recommendations on appropriate uses of the information.  We can make no guarantees as to the appropriateness of the map for any purpose!

 

References

For more information on the Sequential Gaussian Conditional Simulation procedure, see our geostatistics workshop webpage, Ed Isaaks’s page, Deutsch and Journel (1998), Isaaks and Srivastava (1989), Goovaerts (1997), and the other references listed below.

 

Goovaerts, P. 1997. Geostatistics For Natural Resources Evaluation. Oxford University Press, New York. 483p.

 

Isaaks, EH and RM Srivastava. 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York. 561p.

 

Iverson LR, Prasad AM, Hale BJ, and EK Sutherland. 1999. An atlas of current and potential future distributions of common trees of the eastern United States. General Technical Report NE-265. Newtown Square, PA: USDA Forest Service, Northeastern Research Station. 41p.

 

King, SL. 2000. Sequential Gaussian simulation vs. simulated annealing for locating pockets of high-value commercial trees in Pennsylvania. Annals of Operations Research 95: 177-203.

 

Lister, A, R Riemann, and M Hoppus. 2000. Use of regression and geostatistical techniques to predict tree species distributions at regional scales. 4th International Conference on Integrating GIS and Environmental Modeling  (GIS/EM4): Problems, Prospects and Research Needs.  Banff, Alberta, Canada, September 2-8, 2000.

 

Lister, AJ, Riemann, R and M Hoppus.  2000. A nonparametric geostatistical approach for estimating species importance.  The 2nd Annual Forest Inventory and Analysis (FIA) Symposium.  Salt Lake City, Utah, October 17-18, 2000.

 

Moeur, M, and R Riemann Hershey. 1999. Preserving spatial and attribute correlation in the interpolation of forest inventory data. In: Lowell K, Jaton A, editors. Spatial accuracy assessment: Land information uncertainty in natural resources. Chelsea, MI: Ann Arbor Press. p 419-429.

 

Riemann Hershey R, Ramirez MA, and DA Drake. 1997. Using geostatistical techniques to map the distribution of tree species from ground inventory data. In: Gregoire, T. et al., editors. Modeling longitudinal and spatially correlated data: methods, applications, and future directions. Lecture notes in statistics 122. New York: Springer-Verlag. p 187-198.

 

Rossi, RE, Mulla, DJ, Journel, AG and EH Franz. 1992. Geostatistical tools for modeling and interpreting ecological spatial dependence. Ecological Monographs 62(2):277-314.

 

Zhu, Z. 1994. Forest Density Mapping in the Lower 48 States: A Regression Procedure. USDA Research Paper SO-280, Southern Forest Experiment Station, New Orleans, LA.