Homework Assignment (20 points): MORPHOMETRICS (Bivariate and Multivariate Analyses)

Fossils and Evolution
Due: Tuesday, Jan. 31
Spring 2012

Introduction

Morphometrics is the use of measurements to assess morphologic variation within a population or to distinguish among multiple populations. The simplest kind of morphometric application is univariate analysis, in which only one attribute is measured on specimens within one or more collections. Once the measurements are in hand, it is possible to plot a frequency distribution, calculate the mean, and calculate the standard deviation about the mean (Fig. 1). Univariate analysis can be very useful, but a major deficiency is its reliance on just a single variable.

Figure 1. Univariate analysis. Frequency distributions for a single variable (width) measured in two populations of brachiopods. In this example the populations are probably distinct because their means are quite different. A statistical test must be applied to determine whether the difference in means is significant.

Bivariate analysis is generally more useful than univariate analysis insofar as it simultaneously evaluates two attributes, one relative to the other. Bivariate analyses commonly are depicted by a scatter plot (X-Y plot) in which the two variables being examined are represented by the two axes of the plot (Fig. 2). A best-fit line can be selected to pass through an array of points. The slope of the best-fit line represents the average relationship between the two variables being analyzed. Bivariate analysis is very useful, but like univariate analysis, its value is limited because it does not consider the full range of attributes that can be measured in groups of specimens.

Figure 2. Bivariate analysis. Length vs. width scatter plot for two subspecies of brachiopods. These two subspecies probably are validly separated because the points for each seem to cluster together in swarms. Also, the best-fit lines for the two subspecies have slightly different slopes and vertical positions, suggesting different average relationships between length and width in each.

Multivariate analyses are the most useful in morphometric studies because they simultaneously evaluate relationships among multiple variables. In a way, a multivariate analysis can be thought of as many iterations of bivariate analyses in which every variable is evaluated relative to every other variable. Most multivariate analyses begin with a data matrix in which a number of measurements are recorded for many specimens. The following example depicts a data matrix in which three measurements were obtained from each of five specimens. These data could be plotted on a scatter diagram as five points in three-dimensional space, with each specimen represented by a point and the three axes corresponding to length, width, and height of specimens.

DATA MATRIX

              Length   Width   Height
Specimen 1    1        4       7
Specimen 2    1.5      4.5     8
Specimen 3    1        3.5     6.5
Specimen 4    1.25     3.5     7
Specimen 5    1.65     4.5     9

But how can we visualize a data matrix that contains ten measurements for each of 15 specimens? These data represent 15 points in 10-dimensional space, a valid mathematical concept but not one that is easy to grasp or illustrate! [Try to imagine a hyperspace defined by ten mutually perpendicular axes.] One goal of multivariate analysis is to reduce the effective number of variables (or dimensions) in the data while losing as little information as possible. For example, it might be possible to identify two or three synthetic axes in the ten-dimensional space that account for 90% or more of the total variation. An example of the output from multivariate analysis is given in Figure 3.
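One widely used way to find such synthetic axes is principal components analysis (PCA). The sketch below applies the idea to the small five-specimen data matrix above, using only the Python standard library. It is an illustration under stated assumptions, not the procedure PAST uses internally: the axes are taken from the unstandardized covariance matrix, only the first axis is extracted (by simple power iteration), and the function names are made up for this example.

```python
from math import sqrt

# Data matrix from the text: rows are specimens; columns are length, width, height.
data = [
    [1.0, 4.0, 7.0],
    [1.5, 4.5, 8.0],
    [1.0, 3.5, 6.5],
    [1.25, 3.5, 7.0],
    [1.65, 4.5, 9.0],
]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [1.0] * p
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient of the current unit vector.
        eigenvalue = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)
        )
    return eigenvalue, v


cov = covariance_matrix(data)
# Total variation = sum of the variances of the original variables.
total_variance = sum(cov[i][i] for i in range(len(cov)))
first_eigenvalue, first_axis = leading_eigenpair(cov)
fraction_on_axis_1 = first_eigenvalue / total_variance

print("direction of first synthetic axis:", [round(x, 3) for x in first_axis])
print("share of total variation on that axis:", round(fraction_on_axis_1, 3))
```

The printed share is the quantity quoted as a percentage for each component axis in a PCA summary: the variance along that axis divided by the summed variance of all original variables.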
In the example of Figure 3, a technique called Principal Components Analysis was used to reduce a five-dimensional data matrix to three principal components (axes) that capture nearly 100% of the original variation among 10 specimens. The first component axis by itself accounts for 94% of the total variation and the second component axis accounts for 5%. In this case, the plot of component 1 vs. component 2 contains 99% of the variance. This two-dimensional image is easy to interpret: specimens 8, 9 and 10 are very different from all others, whereas specimens 1-7 seem to form a natural morphologic cluster.

Figure 3. Multivariate analysis. Principal Components Analysis of five variables measured on each of 10 specimens. In this example, 99% of the total variation among individuals is contained in the first and second component axes. In other words, it was possible to mathematically identify two perpendicular axes in a five-dimensional space such that 99% of the variation among points can be portrayed on those two axes. The third component axis accounts for only 0.39% of total variation.

In this lab we will use both bivariate and multivariate analyses to make inferences about groups of fossil specimens. You will need to have some familiarity with PAST software and how to enter and manipulate data.

Assignment Part 1 (bivariate analysis)

1. Table 1 contains length and width measurements taken on brachiopod specimens from two collections, with 30 specimens per collection. Use bivariate analysis to determine whether the collections come from a single population, or from two distinct populations.

2. Launch the PAST software (PAST can be found within the Geology applications folder on computers in the student computer lab). In PAST, create and save two data matrices, one for each collection. Note that rows are already numbered. You will need to rename the first two columns "length" and "width". Do this by selecting a column and then choosing rename column under the Edit menu. At the end of this step you should have two small PAST files, one for collection A and one for collection B.

TABLE 1. Brachiopod measurements

          collection A       collection B
          length   width     length   width
 1          99      69         79      42
 2          87      56         81      40
 3         104      59         83      59
 4         101      65         74      56
 5          88      61         70      44
 6          91      66         89      41
 7          93      56         81      59
 8          85      68         82      53
 9          86      55         75      60
10          99      57         73      51
11         104      70         71      43
12          91      64         85      48
13          94      61         90      45
14          89      63         77      55
15         105      55         90      51
16         104      59         83      43
17          86      70         81      53
18          95      57         79      58
19          96      66         80      60
20         102      68         87      40
21          98      56         83      45
22          88      61         76      49
23         105      68         74      47
24         100      65         89      42
25          90      56         75      52
26          88      55         88      53
27          91      58         82      46
28          92      60         71      49
29          99      63         77      55
30         100      66         89      57

3. For collection A create a scatter plot of length vs. width. Do this by selecting the two columns and then choosing XYgraph under the Plot menu. If you have done everything correctly you should now see a scatter plot with 30 points representing the 30 specimens in your collection. Click on the points button under the plot style area. It may be necessary to adjust the x-start / x-end and y-start / y-end values in order to see all of the points on the graph. Print copies of both the data matrix and the scatter plot to turn in to the instructor. Repeat for collection B.

4. Now merge the two files so that you end up with a larger file containing all of your measurements. Do this by opening the file for collection A. Select the cell in the first column and 31st row, then choose insert file under the File menu. When prompted for a file name, give the file name and location for collection B. This should do the trick. If not, simply copy and paste the data from the collection B file into the collection A file. Save the new, larger file under a new file name.

5. Color the data. Do this by selecting the first 30 rows (collection A data) and then choosing row color/symbol under the Edit menu. This will bring up a box with several options: choose one (e.g., blue squares, red crosses, etc.). Now color the remaining rows with a different color.

6. Select both columns and then choose XYgraph under the Plot menu to again generate a scatter plot. If you've done everything correctly, the points from the two collections should be plotted together now, but represented on the plot by different colors and symbols. [Again, you may need to adjust the x-start / x-end and y-start / y-end values in order to see all of the points on the graph.] Print the combined data matrix and scatter plot to turn in to the instructor.

7. Examine the scatter plot and answer the following question. Do the two collections seem to represent individuals from two distinct populations? Explain your answer.

8. Although the plots are useful, we need to employ a statistical test in order to answer question #7. We will do this by comparing the lengths in collection A versus the lengths in collection B, and the widths in collection A versus the widths in collection B. Re-open the file for collection A. Select the width column and delete it by choosing remove under the Edit menu. Now select the top row of the second column and choose insert file under the File menu. When prompted for a file name, give the file name and location for collection B. Select the width column (collection B) and delete it. Now, rename the two remaining columns lengtha and lengthb. Select both columns and choose univariate under the Statistics menu. A pop-up box showing a variety of statistics will appear.

What are the mean values for lengtha and lengthb?

Now select both columns and choose F and T tests (two samples) under the Statistics menu. Examine the pop-up box.

What is the t-value for this comparison?
What is the associated p-value?

Repeat these steps so that you get univariate statistics and t- and p-values for the widths.

What are the mean values for widtha and widthb?
What is the t-value for this comparison?
What is the associated p-value?

A very small p-value (i.e., anything less than 0.05) indicates that the difference between the sample means is statistically significant. Now that you've seen the p-values for lengths and widths, do you still agree with your answer to question #7?

Assignment Part 2 (bivariate analysis)

Fusulinids are extinct protists that are very useful in geologic age dating. Fusulinids constructed their shells by adding successive chambers in a spiral growth pattern. The shells of many species are complex and relatively large, at least by protistan standards (Fig. 4). Despite more than 150 years of scientific study, the function of the fusulinid shell is still unclear; that is, it is not known whether the shell served for protection, for structural support, or for some other function.

Figure 4. Sketch of a fusulinid shell, showing spiral arrangement of chambers. Long axis is the length, short axis is the diameter.
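Returning briefly to step 8 of Part 1: the two-sample comparison can be cross-checked outside PAST. The sketch below uses the Table 1 measurements and computes the group means and a pooled-variance (Student's) t statistic with the Python standard library. It assumes equal variances in the two collections, and `pooled_t` is a helper name made up for this example. It does not compute the p-value, which should still be read from the PAST pop-up box.

```python
from math import sqrt
from statistics import fmean, variance

# Table 1 measurements, listed in specimen order 1-30.
length_a = [99, 87, 104, 101, 88, 91, 93, 85, 86, 99,
            104, 91, 94, 89, 105, 104, 86, 95, 96, 102,
            98, 88, 105, 100, 90, 88, 91, 92, 99, 100]
width_a = [69, 56, 59, 65, 61, 66, 56, 68, 55, 57,
           70, 64, 61, 63, 55, 59, 70, 57, 66, 68,
           56, 61, 68, 65, 56, 55, 58, 60, 63, 66]
length_b = [79, 81, 83, 74, 70, 89, 81, 82, 75, 73,
            71, 85, 90, 77, 90, 83, 81, 79, 80, 87,
            83, 76, 74, 89, 75, 88, 82, 71, 77, 89]
width_b = [42, 40, 59, 56, 44, 41, 59, 53, 60, 51,
           43, 48, 45, 55, 51, 43, 53, 58, 60, 40,
           45, 49, 47, 42, 52, 53, 46, 49, 55, 57]


def pooled_t(x, y):
    """Return (t, degrees of freedom) for Student's two-sample t test.

    Assumes the two groups share a common variance (pooled estimate).
    """
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (fmean(x) - fmean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df


for name, a, b in [("length", length_a, length_b), ("width", width_a, width_b)]:
    t, df = pooled_t(a, b)
    print(f"{name}: mean A = {fmean(a):.2f}, mean B = {fmean(b):.2f}, "
          f"t = {t:.2f}, df = {df}")
```

A large absolute t value relative to its degrees of freedom corresponds to a small p-value. If the numbers here disagree with PAST, check first for a data-entry error in one of the two places.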

Clues to the purpose of the shell might be revealed by an analysis of shell growth. In this exercise we will plot the half-length versus the radius on a volution-by-volution basis for four fusulinid specimens of the species Beedeina acme from southern Oklahoma (Fig. 6). This plot will reveal whether fusulinid growth was isometric or allometric.

9. Save the file fusulinid growth on your computer or on a flash drive. Go to http://faculty.cns.uni.edu/~groves/, then right-click on the filename and choose Save link target as.

10. Launch the PAST software and open the fusulinid growth file. Notice that there are two columns and 20 rows. The first column contains half-length measurements and the second column contains radius vector measurements. The first five rows (i.e., the red ones) contain data for the first through fifth volutions of the first specimen, the blue rows contain data for the second specimen, and so on.

11. Select both columns by clicking on the gray button in the upper-left corner of the matrix. Now choose XY Graph under the Plot menu. Print a copy of the graph to turn in to the instructor.

12. Examine the graph. The red points represent the growth curve for specimen one, the blue points represent the growth curve for specimen two, and so on. What type of growth characterizes the shells of Beedeina acme?

13. Given your answer to question 12, do you think fusulinid shells functioned for structural support? (Explain your answer.)

Assignment Part 3 (multivariate analysis)

Background. The Frensley Limestone is a rock formation of Pennsylvanian age in southern Oklahoma. It is roughly 100 feet thick. Fusulinid specimens were collected from the lower and upper parts of the Frensley Limestone. The older of the two collections is thought to be 308.9 million years old and the younger collection is thought to be 308.8 million years old: i.e., the two collections differ in age by about 100,000 years (Fig. 6). Each fossil specimen was prepared, photographed and measured. A total of 29 measurements was made on each shell (Fig. 5).

Figure 5. Idealized axial section of a fusulinid showing morphologic variates analyzed in this study: a, half length (second volution); b, radius vector (second volution); c, tunnel width (second volution); d, chomata height (second volution); e, proloculus diameter
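For the growth question in Part 2 (steps 12 and 13), one common numerical check is to fit a straight line to the logarithms of the two measurements. This assumes growth follows a power law, radius = b × (half-length)^k. A slope k close to 1 means the two dimensions keep a constant proportion (isometric growth), while a slope clearly different from 1 indicates allometric growth. The sketch below is a minimal illustration in standard-library Python. The function name is made up for this example, and the measurements must be typed in from the fusulinid growth file, which is not reproduced here.

```python
from math import log


def log_log_slope(half_lengths, radii):
    """Least-squares slope of log(radius) against log(half-length).

    All measurements must be positive. A slope near 1 suggests isometric
    growth; a slope well away from 1 suggests allometric growth.
    """
    xs = [log(v) for v in half_lengths]
    ys = [log(v) for v in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

Calling `log_log_slope` separately on the five volutions of each specimen gives one slope per growth curve, which can then be compared with the visual impression from the XY graph.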

Figure 6. Idealized stratigraphic section of Pennsylvanian rocks in southern Oklahoma showing fusulinid occurrences and their approximate geologic ages. Do fusulinids from the lower Frensley Limestone and from the upper Frensley Limestone belong to the same species?

It is of interest to know whether individuals from the lower and upper collections are representatives of a single species, or whether they are sufficiently different to be assigned to two species. In multivariate analysis we simultaneously examine all 29 measurements on each specimen to compare the lower and upper collections.

14. Save the files lower Frensley and upper Frensley on your computer or on a flash drive. Go to http://faculty.cns.uni.edu/~groves/, then right-click on the filename and choose Save link target as.

15. Launch the PAST software. Open the file lower Frensley and notice that there are 29 columns (morphologic variates) and 28 rows (specimens).

16. Color the rows by selecting them and then choosing row color/symbol under the Edit menu. This will bring up a box with several options: choose red.

17. Now click in the empty cell at the bottom of the first column. Add the upper Frensley file by selecting insert file under the File menu. Now your matrix should contain 29 columns and 48 rows. The last 20 rows (gray) contain measurements taken on specimens from the upper Frensley.

18. Select the entire merged dataset by clicking on the gray button in the upper-left corner of the matrix. Now choose MANOVA/CVA under the Multivar menu. [MANOVA is an acronym for Multivariate ANalysis Of Variance.] Print a copy of the MANOVA pop-up window to turn in to the instructor.

19. Click on the button CVA scatter plot in the MANOVA pop-up window. [CVA is an acronym for Canonical Variates Analysis.] This should produce an X-Y graph in which Axis 1 and Axis 2 correspond to Eigenvalue 1 and Eigenvalue 2, respectively, in the pop-up window. Print a copy of the CVA scatter plot to turn in to the instructor.

What percent of the variance in the original 29-dimensional dataset is contained in Eigenvalue 1 (i.e., Axis 1)?
What percent of the variance in the original 29-dimensional dataset is contained in Eigenvalue 2 (i.e., Axis 2)?

20. Examine the distribution of points in the CVA scatter plot. In your opinion is there sufficient separation between the red crosses and the black dots to indicate that specimens from the lower Frensley and upper Frensley represent two distinct species? Explain your answer.

21. Now turn again to the MANOVA pop-up window. Notice the values associated with p (same) under the Wilks' lambda and Pillai trace columns. If these values are less than 0.05, then there is a statistically significant difference between the two collections of fusulinids.

Wilks' lambda p (same) =
Pillai trace p (same) =

Does this result agree with your conclusion based on visual examination of the CVA scatter plot? (Explain.)
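To give a sense of what the Wilks' lambda entry summarizes, the sketch below computes it for two groups from its textbook definition: the determinant of the pooled within-group scatter matrix divided by the determinant of the total scatter matrix. Values near 1 mean the groups overlap almost completely, and values near 0 mean they are well separated. For exactly two groups, lambda converts to an F statistic with p and n - p - 1 degrees of freedom, where p is the number of variables and n is the total number of specimens. This is a standard-library illustration with made-up function names. It is not PAST's own code, and it leaves the p-value itself to PAST.

```python
def mean_vector(rows):
    """Column means of a list of specimens (each a list of measurements)."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter_matrix(rows, center):
    """Sums of squares and cross-products of `rows` about `center`."""
    p = len(center)
    return [
        [sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows)
         for j in range(p)]
        for i in range(p)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def wilks_lambda_two_groups(group_a, group_b):
    """Return (lambda, F, df1, df2) for two groups of specimens."""
    p = len(group_a[0])
    n = len(group_a) + len(group_b)
    within_a = scatter_matrix(group_a, mean_vector(group_a))
    within_b = scatter_matrix(group_b, mean_vector(group_b))
    within = [[within_a[i][j] + within_b[i][j] for j in range(p)]
              for i in range(p)]
    pooled = group_a + group_b
    total = scatter_matrix(pooled, mean_vector(pooled))
    lam = determinant(within) / determinant(total)
    df1, df2 = p, n - p - 1
    f_stat = (1 - lam) / lam * df2 / df1
    return lam, f_stat, df1, df2
```

With the merged Frensley matrix, `group_a` would hold the 28 lower Frensley rows and `group_b` the 20 upper Frensley rows, each row being a list of the 29 measurements, which gives 29 and 18 degrees of freedom.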