Following mosaicking, either the leaves-off version or leaves-on version is
selected to be the "base" for the land cover mapping process. The 4 TM
bands of the "base" mosaic are clustered to produce a single 100-class image
using an unsupervised clustering algorithm. Each of the spectrally distinct
clusters/classes is then assigned to one or more Anderson level 1 and 2 land
cover classes using National High Altitude Photography program (NHAP)
and National Aerial Photography program (NAPP) aerial photographs as a
reference. Almost invariably, individual spectral clusters/classes are
confused between two or more land cover classes.
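As an illustration of the clustering step, the sketch below groups pixel
vectors into spectral classes with plain k-means. The text above does not
name the clustering algorithm, so k-means is a stand-in here and not
necessarily the method used for NLCD; the function name kmeans and its
parameters are hypothetical. For the 4-band "base" mosaic, each pixel would
be a 4-value tuple and k would be 100.

```python
import random

def kmeans(pixels, k, iterations=10, seed=0):
    """Cluster pixel vectors into k spectral classes with plain k-means.

    pixels: list of equal-length tuples (one value per TM band).
    Returns (labels, centers); labels[i] is the cluster index of pixels[i].
    k-means is an assumed stand-in, not the documented NLCD algorithm.
    """
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)  # initial centers drawn from the data
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers
```

Each resulting cluster index would then be labeled by an analyst against the
aerial photographs; the code does not attempt that interpretation step.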
Separation of the confused spectral clusters/classes into the appropriate
NLCD classes is accomplished using ancillary data layers. Standard ancillary data
layers include: the "non-base" mosaic TM bands and 100-class cluster image;
derived TM normalized difference vegetation index (NDVI), various TM band ratios,
TM date bands; 3-arc second Digital Terrain Elevation Data (DTED) and
derived slope, aspect and shaded relief; population and housing density data;
USGS land use and land cover (LUDA); and National Wetlands Inventory
(NWI) data if available. Other ancillary data sources may include soils data,
unique state or regional land cover data sets, or data from other federal
programs such as the National Gap Analysis Program (GAP) of the USGS
Biological Resources Division (BRD). For a given confused spectral
cluster/class, digital values of the various ancillary data layers are compared
to determine: (1) which data layers are the most effective for splitting the
confused cluster/class into the appropriate NLCD classes, and (2) the
appropriate layer thresholds for making the split(s). Models are then
developed using one to several ancillary data layers to split the confused
cluster/class into those NLCD classes. For example, a population density
threshold is used to separate high-intensity residential areas from
commercial/industrial/transportation. Or a cluster/class might be confused
between row crop and grasslands. To split this particular cluster/class, a TM
NDVI threshold might be identified and used with an elevation threshold in a
class-splitting model to make the appropriate NLCD class assignments. A
purely spectral example is the use of the temporally opposite TM layers to
discriminate confused clusters/classes such as hay/pasture vs. row crops and
deciduous forests vs. evergreen forests, using simple thresholds that contrast
the seasonal differences in vegetation between leaves-on and leaves-off data.
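The row crop vs. grasslands example above can be sketched as a small
class-splitting rule. NDVI is computed in the usual way from the TM red
(band 3) and near-infrared (band 4) values. The threshold values, the
direction of each test, and the names ndvi and split_confused_cluster are
hypothetical and chosen only for illustration; the actual thresholds were
determined per cluster by comparing ancillary data values.

```python
def ndvi(red, nir):
    """NDVI from TM band 3 (red) and TM band 4 (near infrared)."""
    total = nir + red
    return (nir - red) / total if total else 0.0

# Hypothetical thresholds, for illustration only (not NLCD values).
NDVI_MIN_ROW_CROP = 0.45
ELEVATION_MAX_ROW_CROP_M = 1500.0

def split_confused_cluster(red, nir, elevation_m):
    """Assign one pixel of a confused row crop / grasslands cluster.

    Assumes, for the sake of the example, that row crops are greener
    (higher NDVI) and lie at lower elevations than grasslands.
    """
    is_green = ndvi(red, nir) >= NDVI_MIN_ROW_CROP
    is_low = elevation_m <= ELEVATION_MAX_ROW_CROP_M
    if is_green and is_low:
        return "row crops"
    return "grasslands"
```

A seasonal contrast rule would have the same shape, with the threshold
applied to a leaves-on vs. leaves-off difference instead of elevation.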
Not all cluster/class confusion can be successfully modeled out. Certain
classes such as urban/recreational grasses or quarries/strip mines/gravel pits
that are not spectrally unique require manual editing. These class features are
typically visually identified and then reclassified using on-screen digitizing
and re-coding. Other classes such as wetlands require the use of specific data
sets such as NWI to provide the most accurate classification. Areas lacking
NWI data are typically subset out and modeling is used to estimate wetlands
in these localized areas. The final NLCD product results from the
classification (interpretation and labeling) of the 100-class "base" cluster
mosaic using both automated and manual processes, incorporating both
spectral and conditional data layers. For a more detailed explanation please
see Vogelmann, Sohl, and Howard (1998) and Vogelmann, Sohl, Campbell, and
Shaw (1998).
Accuracy Assessment:
An accuracy assessment is done on all NLCD on a Federal Region basis
following a revision cycle that incorporates feedback from MRLC
Consortium partners and affiliated users. The accuracy assessments are
conducted by private sector vendors under contract to the USEPA. A
protocol has been established by the USGS and USEPA that incorporates a
two-stage, geographically stratified cluster sampling plan (Zhu et al., 1999)
utilizing National Aerial Photography Program (NAPP) photographs as the
sampling frame and the basic sampling unit. In this design a NAPP
photograph is defined as a 1st stage or primary sampling unit (PSU), and a
sampled pixel within each PSU is treated as a 2nd stage or secondary
sampling unit (SSU).
PSUs are selected from a sampling grid based on NAPP flight-lines and
photo centers. Each grid cell measures 15' X 15' (minutes of
latitude/longitude) and consists of 32 NAPP photographs. A geographically
stratified random sampling is performed with 1 NAPP photo being randomly
selected from each cell (geographic stratum). If a sampled photo falls
outside of the regional boundary, it is not used. Second stage sampling is
accomplished by selecting SSUs (pixels) within each PSU (NAPP photo) to
provide the actual locations for the reference land cover classification.
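The two-stage selection described above can be sketched as follows. This is
a simplified illustration, not the protocol of Zhu et al. (1999): the names
select_psus and select_ssus are hypothetical, an out-of-region photo is
assumed to be dropped without replacement, and the second stage is assumed
to be a simple random draw of pixels, which the text above does not specify.

```python
import random

def select_psus(cells, in_region, rng):
    """First stage: draw one photo at random from each grid cell (stratum).

    cells: dict mapping a cell id to the list of photo ids in that cell.
    in_region: function returning True if a photo lies inside the region.
    A drawn photo outside the region is dropped (assumed: not replaced).
    """
    psus = []
    for cell_id in sorted(cells):
        photo = rng.choice(cells[cell_id])
        if in_region(photo):
            psus.append(photo)
    return psus

def select_ssus(psu_pixels, n, rng):
    """Second stage: draw up to n pixels from each selected photo.

    psu_pixels: dict mapping a photo id to its list of pixel locations.
    Simple random sampling within the photo is an assumption.
    """
    return {
        psu: rng.sample(pixels, min(n, len(pixels)))
        for psu, pixels in psu_pixels.items()
    }
```

Passing a seeded random.Random instance as rng makes a given draw
repeatable.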
The SSUs are manually interpreted and misclassification errors are estimated
and described using a traditional error matrix as well as a number of other
important measures including the overall proportion of pixels correctly
classified, user's and producer's accuracy, and omission and commission
error probabilities.
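For readers unfamiliar with these measures, the sketch below derives them
from an error matrix of raw pixel counts. The function name
accuracy_measures and the row/column orientation are assumptions made for
the example, and the counts are unweighted; estimates under the two-stage
design described above would also need to account for the sampling weights,
which this sketch ignores.

```python
def accuracy_measures(matrix, classes):
    """Summarize a non-empty error matrix of pixel counts.

    matrix[i][j] counts sampled pixels mapped as classes[i] whose
    reference (photo-interpreted) label is classes[j].
    Returns (overall, per_class).
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    per_class = {}
    for i, name in enumerate(classes):
        mapped = sum(matrix[i])                          # row total
        reference = sum(matrix[r][i] for r in range(n))  # column total
        users = matrix[i][i] / mapped if mapped else None
        producers = matrix[i][i] / reference if reference else None
        per_class[name] = {
            "users": users,          # share of mapped pixels that are right
            "producers": producers,  # share of reference pixels found
            "commission": None if users is None else 1 - users,
            "omission": None if producers is None else 1 - producers,
        }
    return overall, per_class
```

Commission error is the complement of user's accuracy and omission error is
the complement of producer's accuracy, so each pair is reported together.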
Discussion:
While we believe that the approach taken has yielded a very good general
land cover classification product for a large region, it is important to
indicate to the user where there might be some potential problems. The
biggest concerns are listed below:
1) Some of the TM data sets are not temporally ideal. Leaves-off data sets
are heavily relied upon for discriminating between hay/pasture and row crop,
and also for discriminating between forest classes. The success of
discriminating between these classes using leaves-off data sets hinges on the
time of data acquisition. When hay/pasture areas are non-green, they are not
easily distinguishable from other agricultural areas using remotely sensed
data. However, there is a temporal window during which hay and pasture
areas green-up before most other vegetation (excluding evergreens, which
have different spectral properties); during this window these areas are easily
distinguishable from other crop areas. The discrimination between
hay/pasture and deciduous forest is likewise optimized by selecting data in a
temporal window where deciduous vegetation has yet to leaf out. It is
difficult to acquire a single date of imagery (leaves-on or leaves-off) that
adequately differentiates both deciduous forest from hay/pasture and
hay/pasture from row crop.
2) The data sets used cover a range of years (see data sources), and changes
that have taken place across the landscape over the time period may not have
been captured. While this is not viewed as a major problem for most classes,
it is possible that some land cover features change more rapidly than might be
expected (e.g. hay one year, row crop the next).
3) Wetlands classes are extremely difficult to extract from Landsat TM
spectral information alone. The use of ancillary information such as
National Wetlands Inventory (NWI) data is highly desirable. We relied on
GAP, LUDA, or proximity to streams and rivers as well as spectral data to
delineate wetlands in areas without NWI data.
4) Separation of natural grass and shrub is problematic. Areas observed on
the ground to be shrub or grass are not always distinguishable spectrally.
Likewise, there was often disagreement between LUDA and GAP on these
classes.
State-Specific Caveats and Concerns:
We believe that the approach taken has yielded a very good general
land cover classification product for a very large region. Each state readme
file contains a listing of specific concerns for each state.
Acknowledgments
This work was performed by the Raytheon STX Corporation under U.S.
Geological Survey Contract 1434-CR-97-CN-40274.
References
More detailed information on the methodologies and techniques employed in
this work can be found in the following:
Kelly, P.M., and White, J.M., 1993. Preprocessing remotely sensed data for
efficient analysis and classification, Applications of Artificial Intelligence
1993: Knowledge-Based Systems in Aerospace and Industry, Proceedings of
SPIE, pp. 24-30.
Cowardin, L.M., V. Carter, F.C. Golet, and E.T. LaRoe, 1979. Classification
of Wetlands and Deepwater Habitats of the United States, Fish and Wildlife
Service, U.S. Department of the Interior, Washington, D.C.
Vogelmann, J.E., Sohl, T., and Howard, S.M., 1998. "Regional
Characterization of Land Cover Using Multiple Sources of Data."
Photogrammetric Engineering & Remote Sensing, Vol. 64, No. 1, pp. 45-57.
Vogelmann, J.E., Sohl, T., Campbell, P.V., and Shaw, D.M., 1998. "Regional
Land Cover Characterization Using Landsat Thematic Mapper Data and
Ancillary Data Sources." Environmental Monitoring and Assessment, Vol.
51, pp. 415-428.
Zhu, Z., Yang, L., Stehman, S., and Czaplewski, R., 1999. "Designing an
Accuracy Assessment for USGS Regional Land Cover Mapping Program."
(In review) Photogrammetric Engineering & Remote Sensing.