Lesson 3: Working with Landsat Data

Lesson Description
The Landsat Program is the longest-running and most extensive collection of satellite imagery for Earth. These datasets are global in scale, continuously updated, and freely available online. This lesson will provide you with the information necessary to begin working with Landsat imagery in ArcGIS and describe the history and differences between Landsat missions.

Objectives: The student will:
1) Understand Landsat filename structures
2) Work with data from multiple Landsat sensors
3) Gap fill Landsat 7 imagery using raster math

Keywords: Landsat, Conditional Statement, Mekelle, Ethiopia

Resources Required: ArcMap

Data Used:
Landsat 2 Image (LM21810511979063XXX01): Landsat 2 image of area surrounding Mekelle, Ethiopia
Landsat 5 Image (LT51690512011177MLK01): Landsat 5 image of area surrounding Mekelle, Ethiopia
Landsat 7 Image (LE71690512016023NPA00): Landsat 7 image of area surrounding Mekelle, Ethiopia
Landsat 8 Image (LC81690512016063LGN00): Landsat 8 image of area surrounding Mekelle, Ethiopia
Background:
Landsat became the first civilian Earth observation satellite when NASA launched Landsat 1 in 1972. Since then, there have been 7 subsequent Landsat missions that provide continuous coverage of the Earth's land surface to facilitate continued research in agriculture, geology, forestry, planning, and numerous other areas. The 40+ year global archive of Landsat data is freely available to anyone and is provided by the United States Geological Survey (USGS) through a number of online data portals.

Before you work with these data, it is important to understand the differences between Landsat sensors and how these differences affect data availability and quality. There have been 7 successful missions in total (Landsat 6 failed to achieve orbit), and each differs in its sensor characteristics and in the date range it covers. When you begin to download data, one of the first things you should consider is the date of the image(s) needed for your project. Use Figure 1 as a reference when determining which sensor data are available for specific dates.

Figure 1. Timeline of Landsat Missions. Acquired from: http://landsat.usgs.gov/about_mission_history.php
As the Landsat Program evolved, so did the sensors onboard each satellite. Data quality has progressively improved with each new Landsat sensor, and the spectral range covered has expanded. As a result, not all data products you receive will contain the same information. It is important that you know which bands are available from the sensor you are working with and which bands correspond to specific spectral wavelengths (e.g., red, green, blue, NIR, etc.) for that sensor. Table 1 shows which bands are available on each sensor, the corresponding band number for each, and the pixel size of each band. Keep this information in mind, as these band differences will be demonstrated later in this lab.

Table 1. Summary of band numbers (e.g., B1, B2) and spatial resolution by Landsat satellites and sensors. The core group of bands used in most applications is shown in gray. Adapted from Young et al., In Review.

Band              LS 1-5 MSS   LS 4-5 TM   LS 7 ETM+   LS 8 OLI/TIRS   Pixel Size (m)
Coastal aerosol   -            -           -           B1              30
Blue              -            B1          B1          B2              30
Green             B1           B2          B2          B3              30*
Red               B2           B3          B3          B4              30*
NIR 1             B3           -           -           -               60
NIR               B4           B4          B4          B5              30*
SWIR 1            -            B5          B5          B6              30
SWIR 2            -            B7          B7          B7              30
Thermal           -            B6          B6          B10             30**
Thermal 2         -            -           -           B11             30**
Pan-Chromatic     -            -           B8          B8              15
Cirrus            -            -           -           B9              30

* MSS data are 60 m pixel size
** Collected at 100 m, then resampled to 30 m

Although we are working specifically with Landsat data in this lesson, some of the foundational steps you learn here will transfer to working with remotely sensed data from other platforms. The purpose of this lesson is not only to familiarize you with Landsat data, but also to increase your comfort and competency working with any remote sensing data in ArcGIS.
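When working with several sensors at once, it can help to keep Table 1 in a form that can be queried. The following is a minimal Python sketch of such a lookup, not part of the ArcMap workflow. The band numbers are transcribed from Table 1; the short keys (`nir1`, `swir1`, `pan`, and so on) and the function names are labels invented for this illustration.

```python
# Band numbers per sensor, transcribed from Table 1.
# The short keys ("nir1", "swir1", ...) are labels invented for this sketch.
BANDS = {
    "MSS": {"green": 1, "red": 2, "nir1": 3, "nir": 4},
    "TM": {"blue": 1, "green": 2, "red": 3, "nir": 4,
           "swir1": 5, "thermal": 6, "swir2": 7},
    "ETM+": {"blue": 1, "green": 2, "red": 3, "nir": 4,
             "swir1": 5, "thermal": 6, "swir2": 7, "pan": 8},
    "OLI/TIRS": {"coastal": 1, "blue": 2, "green": 3, "red": 4, "nir": 5,
                 "swir1": 6, "swir2": 7, "pan": 8, "cirrus": 9,
                 "thermal": 10, "thermal2": 11},
}


def band_number(sensor, name):
    """Return the band number for a spectral region, or None if the sensor lacks it."""
    return BANDS[sensor].get(name)


def natural_color_bands(sensor):
    """Return (red, green, blue) band numbers, or None if any of the three is missing."""
    rgb = tuple(band_number(sensor, name) for name in ("red", "green", "blue"))
    return None if None in rgb else rgb


for sensor in BANDS:
    print(sensor, "| red band:", band_number(sensor, "red"),
          "| natural color (R, G, B):", natural_color_bands(sensor))
```

The lookup makes the main pitfall explicit: the red band is B3 on TM and ETM+ but B4 on OLI, and MSS has no blue band, so a natural color composite cannot be built from MSS data.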
Lesson:

Step 1. Working with Landsat Imagery
In this section, we will explore the basics of working with Landsat data, including interpreting filenames and exploring data from separate Landsat sensors. Landsat satellites collect data continuously as they orbit the Earth. These data are segmented and delivered as individual scenes based on the Worldwide Reference System (WRS). After selecting and downloading a scene, you will likely receive the data as a .tar.gz file that needs to be unzipped. This will extract a .tar file that also needs to be unzipped to access the raw data. Extracting these files may require you to download and install a program such as 7-Zip: www.7-zip.org/. The data presented in this lesson have already been unzipped for you.

1.1 Copy the data folder into your local directory. In your data folder, examine the names of the folders and files contained within (using Windows Explorer, not ArcMap). While these names may look like random identifiers, each part of the filename carries a piece of information about the data. Figure 2 breaks down the separate components of Landsat filenames.

Figure 2. Landsat filename structure.

The following codes are used to represent the sensor (second character) in the file name:
C = OLI & TIRS
O = OLI only
T = TIRS only
E = ETM+
T = TM
M = MSS

This specific filename tells us the image was collected by the OLI & TIRS sensors onboard the Landsat 8 satellite for the area contained by path/row 169/51 on Julian Day 63 (March 3rd, 2016). The Ground Station Identifier (GSI) shows how and where the data were received from one of the ground stations. These data were captured by the USGS Landsat Ground Network (LGN) facilities either in Sioux Falls, South Dakota, USA, or Alice Springs, Australia, and we have the first archive version of the data. This particular file is band 1, stored in the GeoTIFF file format. Before beginning the next section, Answer Question 1 on the final page.
Julian Days (the continuous count of days since the beginning of the year) can be converted into month/day using the following link: http://landweb.nascom.nasa.gov/browse/calendar.html
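The filename breakdown in Step 1.1 and the Julian Day conversion can also be done in a few lines of code. The sketch below is illustrative only: it assumes the 21-character scene identifiers used in this lesson (for example LC81690512016063LGN00), and the function and dictionary names are hypothetical. Because the code T is shared by two sensors, the sketch assumes T means TIRS only on Landsat 8 and TM on earlier satellites.

```python
from datetime import date, timedelta

# Sensor codes from Step 1.1. "T" is handled separately because it is
# shared by "TIRS only" and "TM".
SENSOR_CODES = {"C": "OLI & TIRS", "O": "OLI only", "E": "ETM+", "M": "MSS"}


def parse_scene_id(scene_id):
    """Split a 21-character Landsat scene ID into its components."""
    satellite = int(scene_id[2])
    code = scene_id[1]
    if code == "T":
        # Assumption for this sketch: TIRS flies only on Landsat 8.
        sensor = "TIRS only" if satellite == 8 else "TM"
    else:
        sensor = SENSOR_CODES[code]
    year = int(scene_id[9:13])
    day_of_year = int(scene_id[13:16])
    return {
        "sensor": sensor,
        "satellite": satellite,
        "path": int(scene_id[3:6]),
        "row": int(scene_id[6:9]),
        # Day 1 is January 1, so add (day_of_year - 1) days to that date.
        "acquired": date(year, 1, 1) + timedelta(days=day_of_year - 1),
        "ground_station": scene_id[16:19],
        "archive_version": scene_id[19:21],
    }


info = parse_scene_id("LC81690512016063LGN00")
print(info["sensor"], info["path"], info["row"], info["acquired"].isoformat())
```

Using `date` arithmetic rather than a fixed table of month lengths means leap years such as 2016 are handled automatically.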
1.2 Examine the other files in the directory. There are data from Landsat 2, 5, 7, and 8. We will begin by exploring the data from Landsat 8.

1.3 Open the folder containing the Landsat 8 imagery (Landsat8_LC81690512016063LGN00). This folder contains several files: a raster file for each band (and the associated files for each raster) and a single text file containing the metadata (MTL). Open the metadata file, LC81690512016063LGN00_MTL.txt, using a program such as WordPad to maintain formatting (right-click the file > Open with > WordPad). This file contains the metadata for the imagery, with information such as the satellite name, sensor, processing level, projection, minimum and maximum values for each band, and so on. Explore this information for a while, then Answer Question 2.

1.4 Now we can open and visualize the imagery. Open ArcMap. In the Catalog window, connect to the folder storing your lesson data and navigate to the Landsat 8 folder (Landsat8_LC81690512016063LGN00). Again, notice that each individual band is stored as a separate raster file. Fortunately, we do not need to add each band individually; ArcMap can read the MTL file and composite (or stack) these images for us. Drag the MTL file (LC81690512016063LGN00_MTL.txt) from the Catalog window into the Map Viewer window. Your screen should look like Figure 3.

Figure 3. Landsat 8 image of area surrounding Mekelle, Ethiopia.
Explore the imagery: what specific features do you see? Can you find the city of Mekelle? Notice this imagery has a coarser resolution than the Sentinel-2 imagery from the previous lab (30 m vs. 10 m). The benefits of using the Landsat MTL file to import the data are twofold. First, you do not need to manually composite the individual bands as we did in the previous lesson. Second, the bands are labeled automatically in the Table of Contents (notice the bands were brought in automatically as natural color and each band is labeled as Red, Green, or Blue rather than with arbitrary band numbers).

1.5 Add band 9 from your working directory, LC81690512016063LGN00_B9.TIF, to the map viewer and explore these data. This is a new band that was not available on the previous Landsat sensors. Answer Question 3.

1.6 We can now compare these Landsat 8 data with data from some older sensors. Add the MTL file for the Landsat 5 imagery, LT51690512011177MLK01_MTL.txt. Note that the Landsat 5 imagery has a slightly smaller extent and also has fewer bands. Landsat 8 has higher spectral resolution than Landsat 5, meaning it records more bands across a broader portion of the electromagnetic spectrum (Table 1).

1.7 Landsat 8 also has a higher radiometric resolution, meaning it can detect finer changes in reflected or emitted energy. Right-click the Landsat 5 image and open the Properties; in the Source tab, you will see the Pixel Depth is 8-bit. Now open the Properties of the Landsat 8 imagery and notice the Pixel Depth is 16-bit. This means Landsat 5 data can store only 256 unique values (2^8, 0-255), while Landsat 8 data can store 65,536 unique values (2^16). This is a major difference, illustrated in Figure 4, which shows the fineness of these bit depths on a small portion of the greyscale gradient.

Figure 4. Gradient difference between 8-bit and 16-bit data (on a small portion of the greyscale gradient).

1.8 Now add the Landsat 2 data using the MTL file.
Again, notice the different extent; this satellite used a different Worldwide Reference System (WRS) than the other data in this lesson. Also notice that we are looking at a false color image, where red is set to NIR, green to red, and blue to green. This is because the MSS sensor has only 4 bands and does not have a blue band. Further, the radiometric resolution of MSS is only 6-bit (2^6, 64 unique values). Consider how these differences between sensors affect the data available from each one and how they can be a challenge when working with multiple sensors on a single project (e.g., time-series analysis). Answer Question 4 on the final page.
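The value counts quoted in Steps 1.7 and 1.8 all follow from the same rule: an n-bit pixel can hold 2^n distinct values, from 0 to 2^n - 1. A short Python check of that arithmetic, using the bit depths given in this lesson (the function name is hypothetical):

```python
def unique_values(bits):
    """Number of distinct values an unsigned pixel of the given bit depth can hold."""
    return 2 ** bits


# Bit depths as stated in this lesson for each dataset.
for label, bits in [("Landsat 2 MSS", 6), ("Landsat 5 TM", 8), ("Landsat 8 OLI/TIRS", 16)]:
    count = unique_values(bits)
    print(f"{label}: {bits}-bit, {count:,} values (0-{count - 1})")
```

Each additional bit doubles the number of values, so moving from 8-bit to 16-bit data multiplies the number of distinguishable levels by 256.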
Step 2. Conditional Statements and Gap Filling

2.1 You may now remove all the previous files to clear your Table of Contents. Add the final image, which is from the Landsat 7 satellite, LE71690512016023NPA00_MTL.txt. Pan around and zoom in to explore the imagery. The stripes you see are the result of a failure of the Scan Line Corrector (SLC) onboard the Landsat 7 satellite. The SLC failed in May 2003 and all imagery after that date is affected by this issue. The data in the center vertical portion of the imagery are unaffected, but approximately 22% of each image is not collected. While this renders a large portion of Landsat 7 data nearly useless, Landsat 5 fortunately remained functional until the launch of Landsat 8, so there was no temporal gap in imagery due to this SLC failure (Landsat 5 lasted for 29 years, greatly exceeding its 3-year design life).

2.2 We can still make use of these data by filling in the gaps. There are multiple methods to accomplish this task, but we will focus on a simple method available in ArcGIS that uses a separate, unaffected image.

Figure 5. Landsat 7 imagery showing the effect of the scan line corrector (SLC) failure. Imagery shows Mekelle, Ethiopia.

2.3 Zoom in on any striped area. Use the Identify tool to determine the pixel value of a few different pixels in the gap area.

2.4 You should find that all the pixels in these gaps have a value of 0. We can use this information to replace the missing data using a conditional statement. In this process, we will replace all pixels that have a value of 0 with actual spectral data from a separate image.

2.5 Add the file LE71690512003067SGS00_B3.TIF (located in the main Data folder); this is band 3 (red) of a Landsat 7 image collected in 2003, prior to the failure of the SLC. You will notice this image does not have the striping effect. Now, add the corresponding band 3 from the striped image, LE71690512016023NPA00_B3.TIF.
2.6 Find the Con (Spatial Analyst) tool using the Search window or using ArcToolbox and navigate to: Spatial Analyst Tools > Conditional > Con
2.7 In the Con tool window, select LE71690512016023NPA00_B3.TIF as the Input Conditional Raster; the input raster will always be the band with the missing data. The Expression is: VALUE = 0. Select the replacement data, LE71690512003067SGS00_B3.TIF, as the Input True Raster, and set the Input False Raster to LE71690512016023NPA00_B3.TIF (the same as the input raster).

2.8 Let's assess what the tool is doing here. By setting LE71690512016023NPA00_B3.TIF as the input raster, we are stating that each cell within this raster will be evaluated by the expression: does the cell have a value of 0? If this condition is true, the 0 value will be replaced by the cell value of LE71690512003067SGS00_B3.TIF (our true condition, which has no scan line errors). If the condition is false (the value does not equal 0), the value from LE71690512016023NPA00_B3.TIF (our false condition, the same as the input raster) will replace the value from the input raster, which leaves these locations unchanged because it is the same dataset. Here is a simplified explanation:

Condition: Does the cell value in LE71690512016023NPA00_B3.TIF = 0 (i.e., is this cell missing data)?
True: Replace the cell value with the value from LE71690512003067SGS00_B3.TIF
False: Replace the cell value with the value from LE71690512016023NPA00_B3.TIF (keep the data the same)

Name your Output Raster LE7_Gapfilled_B3.TIF (Figure 6).

Figure 6. Parameters to use in the Con tool.

2.9 Visually examine the output layer. You will likely still notice a minor striping effect because the data come from two separate sources at two separate dates (2016 and 2003). Ideally, when performing this step, you would first minimize the atmospheric differences between the two scenes by performing a relative correction or by using surface reflectance images, which will be covered in a later lesson.

2.10 Be very cautious using a gap-filled product such as this when performing scientific analysis.
Obviously, having a portion of the image from a date prior to May 2003 will have significant impacts on temporal analyses or analyses where the date of the imagery is critical (e.g., imagery captured during a flood, post-fire, etc.). Also note we have only corrected band 3 here, and this step would need to be repeated for each band to gap-fill the entire image. Answer Question 5.

NOTE: There are alternative options in the freely available QGIS software that use spline interpolation to fill in the gap areas (the r.fillnulls tool). This approach does not require a pre-2003 image; instead, it estimates the values in the gap areas from values near the missing data, though these estimates may be unreliable. ENVI software also has similar methods for gap filling.

Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by Colorado State University or any other collaborating individuals or agencies. This tutorial was created for educational purposes and the data presented in these lessons may be incomplete or inaccurate.
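The conditional replacement described in Step 2.8 can also be sketched outside ArcGIS to make the logic concrete. The Python example below is a minimal illustration, not the Con tool itself: it assumes both bands have already been read into equally sized, aligned grids (lists of rows), the function name is hypothetical, and the pixel values are invented for the example rather than taken from the lesson imagery.

```python
def con_gap_fill(striped, replacement, nodata=0):
    """Where a striped pixel equals nodata, take the replacement pixel;
    otherwise keep the striped pixel (the true/false branches in Step 2.8)."""
    if len(striped) != len(replacement) or any(
        len(a) != len(b) for a, b in zip(striped, replacement)
    ):
        raise ValueError("both grids must have the same shape")
    return [
        [fill if value == nodata else value for value, fill in zip(row_s, row_r)]
        for row_s, row_r in zip(striped, replacement)
    ]


# Invented 3 x 3 example: 0 marks the SLC gaps in the striped band.
striped_2016 = [[52, 0, 61],
                [0, 0, 58],
                [49, 55, 0]]
clean_2003 = [[50, 47, 60],
              [44, 46, 57],
              [48, 53, 59]]

filled = con_gap_fill(striped_2016, clean_2003)
for row in filled:
    print(row)
```

Note that the test `value == nodata` cannot tell a gap from a valid pixel that genuinely has a value of 0, and that every filled pixel carries the date of the replacement image, which is the source of the cautions in Step 2.10.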
Exercise Questions

1. Using Figure 2 and the sensor codes listed in Step 1.1, use the file/folder names to determine which sensor collected the data for each of the files.
Landsat 2:
Landsat 5:
Landsat 7:
Landsat 8:

2. What are the map projection, datum, and UTM zone of this file? What are the azimuth and elevation of the sun when this imagery was captured? (Hint: use Control+F to search for specific words in the text document.)
Map Projection, Datum, and UTM Zone:
Sun Azimuth and Elevation:

3. What type of information is stored in Landsat 8 band 9? What is the descriptive name of this band (use Table 1)?

4. Examine the Source tab in the Properties of the Landsat 2, 5, and 8 files. What is the uncompressed file size of each file? Why do you think these file sizes are so different?
Landsat 2:
Landsat 5:
Landsat 8:

5. What are some potential issues of using gap-filled imagery from Landsat 7 for scientific analysis? Use the imagery from this lesson as an example (2016 and 2003 data).