Welcome, friends!

Please go forth and scroll down to begin your journey though our group module assignment for AN 597. Our topic is “3D visualization and analysis”.
1. First, Julie will cover how to plot 3d surfaces, both non-interactively and interactively.
2. Then Maria will cover how to collect and plot 3d landmark data.
3. Finally, Zach will go over how to analyze landmark data.

A note before we start: Macs need XQuartz installed to be able to view images which use the rgl package. See the modules from Week 1 for instructions and click here to download.

Also, it’s important to run through this in R studio. Many of these functions require feedback in the console or in a popup window, so they will not show in the knitted html. Additionally, the size of the mesh files and other 3D data means that the graphical output of many of these functions cannot be displayed correctly in the html output.

Part I : Plotting in 3D

A brief introduction to non-interactive 3D plots

persp() function

In the base R graphics system, the function persp() draws perspective plots of a surface over the x–y plane. Set eval=TRUE in the chunk below to run demo(persp). The demo will give you an idea of what this function can do.

demo(persp)

A demo is an .R file that lives in demo/. Demos are like examples but tend to be longer. We can use the following line of code to check which packages have a native demo().

demo(package=.packages(all.available = TRUE))

{plot3D} package

The {plot3D} package builds on on persp() to provide functions for both 2D and 3D plotting. Load the {plot3D} package and set eval=TRUE in the chunk below to go through some pretty impressive plot examples:

install.packages("plot3D")
library("plot3D")
example(surf3D) #examples of 3D surfaces
example(persp3D) 

Let’s do an example on our own. Use surf3D() to create a cut-away view of a Torus.

# 3D Plot of Half of a Torus
par(mar = c(2, 2, 2, 2))
par(mfrow = c(1, 1))
R <- 3
r <- 2
x <- seq(0, 2*pi,length.out=50)
y <- seq(0, pi,length.out=50)
M <- mesh(x, y)
 
alpha <- M$x
beta <- M$y

surf3D(x = (R + r*cos(alpha)) * cos(beta),
       y = (R + r*cos(alpha)) * sin(beta),
       z = r * sin(alpha),
       colkey=FALSE,
       bty="b2",
       main="Half of a Torus")

Pretty cool, right? :D We can use these functions to create any 3D surface we can imagine.

{scatterplot3d} package

It turns out there are SO MANY ways to create a 3D scatterplot, and (to me at least) all of them seem really useful and elegant. As you’ll see, there are pros and cons to each method - which you should use depends on what function you’re plotting for. Here, I’ll first go over some non-interactive plotting functions, before getting into some interactive ones.

The {scatterplot3d} package is the “go-to” package for simple, non-interactive 3D scatter plots. Load this package and set eval=TRUE in the chunk below to go through some examples of spirals, surfaces and 3D scatterplots:

install.packages("scatterplot3d") # Install
library("scatterplot3d") # load
example(scatterplot3d) 

This link is really helpful if you want more information and detail about using {scatterplot3d} to create 3D plots. I won’t go over this in detail because in my opinion, there are much more useful ways to create 3D plots that we’ll go over in a bit more detail.

{lattice} package

The {lattice} package has its own distinctive look. Once you see one lattice plot it should be pretty easy to distinguish plots made with this package from base graphics plots. Load the package and set eval=TRUE in the chunk below to see a 3D graph of a volcano, and 3D surface and scatter plots of iris data.

library(lattice)
example(cloud)

Interactive 3D plotting using the {rgl} package

I’ve gone over {sctterplot3d} and {lattice} fairly quickly and with few examples. This is because I mainly want to focus on how to create interactive 3D visualizations. The first way to do this that I’ll show you is with {rgl}.

{rgl} is a 3D graphics package that produces real-time interactive 3D plots. It allows you to interactively rotate, zoom, and select regions of your graphic.

Note that an RGL plot can be manually rotated by holding down on the mouse or touchpad. It can be also zoomed using the scroll wheel on a mouse or pressing ctrl + using the touchpad on a PC or two fingers (up or down) on a mac. These functions may not work in html, but should if you run in R/Rstudio.

Install packages

Let’s first set eval=TRUE in the chunk below to install the packages that you’ll need for this section:

install.packages("rgl") # install
install.packages("car") 
library("rgl") #load
library("car")
## Warning: package 'car' was built under R version 3.4.3

Prepare the data

Let’s take a look at the iris dataset:

data(iris)
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Recall that the iris data set gives the measurements of the variables sepal length and width, petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, I. versicolor, and I. virginica.

We’ll use this dataset for the following examples. First we’ll assign measurements of the variables sepal length, petal length, and sepal width.

x <- sep.l <- iris$Sepal.Length
y <- pet.l <- iris$Petal.Length
z <- sep.w <- iris$Sepal.Width

Start and close the RGL device

To make a 3d plot with {rgl}, you should first start the RGL device in R. You can use the following function to manage the RGL device.

  • rgl.open(): Opens a new device
  • rgl.close(): Closes the current device
  • rgl.clear(): Clears the current device
  • rgl.cur(): Returns the active device ID
  • rgl.quit(): Shutdowns the RGL device system

You don’t necessarily need to use open a new RGL device for each plot.

Here we’ll create a scatter plot using {rgl}, in 2 different ways:

scatter3d()

First, the function scatter3d() uses the {rgl} package to draw a 3D scatter plot with various regression planes.

open3d()
## glX 
##   1
scatter3d(x = sep.l, y = pet.l, z = sep.w, 
          xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
          zlab = "Sepal Width (cm)")
rglwidget() #view the rgl device within your html doc

x, y, z are respectively the coordinates of points to be plotted. The arguments y and z can be optional depending on the structure of x.

Note that you can drag and drop to spin this plot around and view it from different angles.

We can change the colors of the point and remove the regression surface:

open3d()
## glX 
##   2
scatter3d(x = sep.l, y = pet.l, z = sep.w,
        point.col = "blue", surface=FALSE, 
        xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
        zlab = "Sepal Width (cm)")
rglwidget() #view the rgl device within your html doc

We can also use the format:

scatter3d(formula, data)

Where formula is a model formula of form y ~ x + z and data is the data frame within which to evaluate the formula. If you want to plot the points by groups, you can use y ~ x + z | g where g is a factor dividing the data into groups

This is an example where we plot the points by groups:

open3d()
## glX 
##   3
scatter3d(x = sep.l, y = pet.l, z = sep.w, groups = iris$Species, 
          xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
          zlab = "Sepal Width (cm)")
rglwidget() #view the rgl device within your html doc

We can remove the grids:

open3d()
## glX 
##   4
scatter3d(x = sep.l, y = pet.l, z = sep.w, groups = iris$Species,
          xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
          zlab = "Sepal Width (cm)",
          grid = FALSE)
rglwidget() #view the rgl device within your html doc

The display of the surface(s) can be changed using the argument fit. Possible values for fit are “linear”, “quadratic”, “smooth” and “additive”:

open3d()
## glX 
##   5
scatter3d(x = sep.l, y = pet.l, z = sep.w, groups = iris$Species,
          xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
          zlab = "Sepal Width (cm)", 
          grid = FALSE, fit = "smooth")
rglwidget() #view the rgl device within your html doc

We can remove surfaces as before (argument surface = FALSE), and add concentration ellipsoids.

open3d()
## glX 
##   6
scatter3d(x = sep.l, y = pet.l, z = sep.w, groups = iris$Species,
          xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)",
          zlab = "Sepal Width (cm)",
          surface=FALSE, ellipsoid = TRUE)
rglwidget() #view the rgl device within your html doc
#rgl.close()

Export images

The function rgl.snapshot() will save the screenshot as a png.

#rgl.snapshot(filename="plot.png")

The function rgl.postscript() will save the screenshot to a file in ps, eps, tex, pdf, svg, or pgf form.

For example,

#rgl.postscript("plot.pdf", fmt="pdf")

rgl.points()

The function rgl.points() is used to draw a basic 3D scatter plot:

open3d()
## glX 
##   7
rgl.bg(color = "white") # Setup the background color
rgl.points(x, y, z, color ="blue", size =5) # Scatter plot
rgl.spheres(x, y, z, r = 0.2, color = "grey") # change the shape of points to spheres with center (x, y, z) and radius r. 
rgl.bbox(color = "#333377") # Add bounding box decoration
rglwidget() #view the rgl device within your html doc
#rgl.close()

Some extra formatting tools for rgl.bbox are: 1. xlen, ylen, zlen: values specifying the number of tickmarks on x, y and z axes, respectively 2. marklen: value specifying the length of the tickmarks 3. : other {rgl} material properties (see ?rgl.material) including: 4. color: a vector of colors. The first color is used for the background color of the bounding box. The second color is used for the tick mark labels. 5. emission, specular, shininess: properties for lighting calculation 6. alpha: value specifying the color transparency. The value should be between 0.0 (fully transparent) and 1.0 (opaque)

Label points interactively with the mouse

The function identify3d(), within the {car} package allows you to label points interactively with the mouse.

open3d()
## glX 
##   9
rgl.bg(color = "white") # Setup the background color
rgl.points(x, y, z, color ="blue", size =5) # Scatter plot
#identify3d(x = sep.l, y = pet.l, z = sep.w, labels=row.names(iris), buttons=c("left", "right"))
rglwidget() #view the rgl device within your html doc

NOTE: you must uncomment the line of code above to have identify3d() work. The interactive portion does not function within an html doc, but it does if you run the script through R/Rstudio.

Using “buttons”, I’ve set it so that you use left click to pick points and right click to exit. You can set it differently if you wish. You can also choose different labels - currently, it is set to displace the row name of the point you pick.

We can add axis lines and labels

For the function rgl.lines(), the arguments x, y, and z are numeric vectors of length 2 (i.e, : x = c(x1,x2), y = c(y1, y2), z = c(z1, z2) ).

The values x1, y1 and y3 are the 3D coordinates of the line starting point. The values x2, y2 and y3 corresponds to the 3D coordinates of the line ending point.

open3d()
## glX 
##  10
# Make a scatter plot
rgl.spheres(x, y, z, r = 0.2, color = "yellow") 
# Add x, y, and z Axes
rgl.lines(c(min(x), max(x)), c(0, 0), c(0, 0), color = "black")
rgl.lines(c(0, 0), c(min(y),max(y)), c(0, 0), color = "red")
rgl.lines(c(0, 0), c(0, 0), c(min(z),max(z)), color = "green")
rglwidget() #view the rgl device within your html doc

As you can see, the axes are drawn but the problem is that they don’t cross at the point c(0, 0, 0).

There are two solutions to handle this situation:

  1. Scale the data to make things easy. Transform the x, y and z variables so that their min = 0 and their max = 1
  2. Use c(-max, +max) as the ranges of the axes

First, we can scale the data:

x1 <- (x - min(x))/(max(x) - min(x))
y1 <- (y - min(y))/(max(y) - min(y))
z1 <- (z - min(z))/(max(z) - min(z))
open3d()
## glX 
##  11
# Make a scatter plot
rgl.spheres(x1, y1, z1, r = 0.02, color = "yellow") 
# Add x, y, and z Axes
rgl.lines(c(0, 1), c(0, 0), c(0, 0), color = "black")
rgl.lines(c(0, 0), c(0,1), c(0, 0), color = "red")
rgl.lines(c(0, 0), c(0, 0), c(0,1), color = "green")
rglwidget() #view the rgl device within your html doc

OR, we can use c(-max,max)

This helper function will help us calculate the axis limits:

lim <- function(x){c(-max(abs(x)), max(abs(x))) * 1.1}
open3d()
## glX 
##  12
# Make a scatter plot
rgl.spheres(x, y, z, r = 0.2, color = "yellow") 
# Add x, y, and z Axes
rgl.lines(lim(x), c(0, 0), c(0, 0), color = "black")
rgl.lines(c(0, 0), lim(y), c(0, 0), color = "red")
rgl.lines(c(0, 0), c(0, 0), lim(z), color = "green")
rglwidget() #view the rgl device within your html doc

CHALLENGE 1

Create a custom function called “rgl_add_axes()” to add x, y, and z axes.

This function should take in x, y, and z; have axis.col be “grey” as default; include xlab, ylab, and zlab; have default option show.plane as “TRUE” (to add the axis planes); have show.bbox have default as “FALSE” (to add the bounding box decoration); and have bbox.col determine the bounding box colors (having the first color as the background color and the second color as the color of tick marks).

Hints: The function rgl.texts(x, y, z, text) is used to add texts to an RGL plot. Also, rgl.quads(x, y, z) is used to add planes. x, y and z are numeric vectors of length four specifying the coordinates of the four nodes of the quad.

Answer to this challenge can be found here.

Use {rgl} to create a surface animation of the globe

Here we’ll use the persp3d() function within {rgl} to create a surface animation of the globe.

First provide formulas for the latitutde and longitude.

lat <- matrix(seq(90, -90, len = 50)*pi/180, 50, 50, byrow = TRUE)
long <- matrix(seq(-180, 180, len = 50)*pi/180, 50, 50)

Then, define some useful variables, including x, y, and z.

r <- 6378.1 # radius of Earth in km
x <- r*cos(lat)*cos(long)
y <- r*cos(lat)*sin(long)
z <- r*sin(lat)

Open a window for our creation (optional) and use the function persp3d() to draw a globe.

open3d()
## glX 
##  13
persp3d(x, y, z, col = "white", 
       texture = system.file("textures/worldsmall.png", package = "rgl"), 
       specular = "black", axes = FALSE, box = FALSE, xlab = "", ylab = "", zlab = "",
       normal_x = x, normal_y = y, normal_z = z)
rglwidget() #view the rgl device within your html doc
if (!rgl.useNULL())
  play3d(spin3d(axis = c(0, 0, 1), rpm = 16), duration = 2.5)
rglwidget() #view the rgl device within your html doc

Animate our globe.

Note: This doesn’t animate in the html document, but it should run in the RGL device if you run it in R/Rstudio.

Interactive 3D plotting using the {plotly} package

{plotly} is an R package that helps you create interactive web-based graphs and 3D surfaces.

You can set eval=TRUE in the chunk below to install the package from CRAN.

install.packages("plotly")

Or you can install the latest from Github.

devtools::install_github("ropensci/plotly")

You must have R v.3.2.0 for these installs to work.

library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(rgl)

Generalized format for basic plotting in R

plot_ly( x , y ,type,mode,color,size )

Where:
  • size = values for same length as x, y and z that represents the size of datapoints or lines in plot.
  • x = values for x-axis.
  • y = values for y-axis.
  • type = to specify the plot that you want to create like “histogram”, “surface” , “box”, etc.
  • mode = format in which you want data to be represented in the plot. Possible values are “markers”, “lines, “points”.
  • color = values of same length as x, y and z that represents the color of datapoints or lines in plot.

This line of code will add the layout fields, like plot title axis title/ labels, axis title/ label fonts, etc.

layout(plot ,title , xaxis = list(title ,titlefont ), yaxis = list(title ,titlefont ))

Where:
  • plot = the plotly object to be displayed
  • title = string containing the title of the plot
  • xaxis : title = title/ label for x-axis
  • xaxis : titlefont = font for title/ label of x-axis
  • yaxis : title = title/ label for y-axis
  • yaxis : titlefont = font for title/ label of y-axis

Let’s run through an example with the iris dataset we’ve been using:

Basic visualizations

{plotly} graphs are interactive. Here are some basic functions that {plotly} enables you to do:
  • hovering your mouse over the plot to view associated attributes
  • selecting a particular region on the plot (click-and-drag on the chart) using your mouse to zoom
  • resetting the axis (double-click to autoscale)
  • rotating the 3D images
  • click on legend entries to toggle traces
  • shift-and-drag to pan

An interactive histogram

#attaching the variables
attach(iris)
open3d()
## glX 
##  14
#plotting a histogram with Sepal.Length variable and storing it in hist
hist<-plot_ly(x=Sepal.Length,type='histogram')

#defining labels and title using layout()
layout(hist,title = "Iris Dataset - Sepal.Length",
xaxis = list(title = "Sepal.Length"),
yaxis = list(title = "Count"))

An interactive box plot

#plotting a Boxplot with Sepal.Length variable and storing it in box_plot
box_plot<-plot_ly(y=Sepal.Length,type='box',color=Species)

#defining labels and title using layout()
layout(box_plot,title = "Iris Dataset - Sepal.Length Boxplot",
yaxis = list(title = "Sepal.Length"))

An interactive scatter plot

#plotting a Scatter Plot with Sepal.Length and Sepal.Width variables and storing it in scatter_plot1
scatter_plot1<-plot_ly(x=Sepal.Length,y=Sepal.Width,type='scatter',mode='markers')

#defining labels and titile using layout()
layout(scatter_plot1,title = "Iris Dataset - Sepal.Length vs Sepal.Width",
xaxis = list(title = "Sepal.Length"),
yaxis = list(title = "Sepal.Width"))

Let’s go a step further and add another dimension (Species) using color.

#plotting a Scatter Plot with Sepal.Length and Sepal.Width variables with color representing the Species and storing it in scatter_plot12
scatter_plot2<-plot_ly(x=Sepal.Length,y=Sepal.Width,type='scatter', mode='markers', color = Species)

#defining labels and titile using layout()
layout(scatter_plot2,title = "Iris Dataset - Sepal.Length vs Sepal.Width", 
       xaxis = list(title = "Sepal.Length"),
       yaxis = list(title = "Sepal.Width"))

Although data frames can be thought of as the central object in this package, {plotly} visualizations don’t actually require a data frame. This makes chart types that accept a z argument especially easy to use if you have a numeric matrix.

We can even add another dimension (Petal Length) to the plot by using the size of each data point in the plot.

#plotting a Scatter Plot with Sepal.Length and Sepal.Width variables with color represneting the Species and size representing the Petal.Length. Then, storing it in scatter_plot3
scatter_plot3<-plot_ly(x=Sepal.Length,y=Sepal.Width,type='scatter',mode='markers',color = Species,size=Petal.Length)

#defining labels and titles using layout()
layout(scatter_plot3,title = "Iris Dataset - Sepal.Length vs Sepal.Width",
       xaxis = list(title = "Sepal.Length"),
       yaxis = list(title = "Sepal.Width"))

An interactive 3D scatterplot

scatter_plot4 <- plot_ly(x=Sepal.Length,y=Sepal.Width,z=Petal.Length,type="scatter3d",mode='markers',size=Petal.Width,color=Species)

layout(scatter_plot4,title = "Iris Dataset in 3D")

An interactive time series plot

To plot a time series, let’s load in a different dataset on international airline passengers.

data(AirPassengers)
str(AirPassengers)
##  Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...

As you can see, this dataset is a time-series from 1949 to 1961.

time_series<-plot_ly(x=time(AirPassengers),y=AirPassengers,type="scatter",mode="lines")

# defining labels and title using layout()
layout(time_series,title = "AirPassengers Dataset - Time Series Plot",
xaxis = list(title = "Time"),
yaxis = list(title = "Passengers"))

An interactive heat map

To plot an interactive heat map, let’s load in a different (volcano) dataset.

#Loading the data
data(volcano)

#Checking dimensions
dim(volcano)
## [1] 87 61
p <- plot_ly(z = volcano, type = "surface")
p

Aside: As it turns out, there are more than 50 ways to plot this volcano set from R, using the package {plot3d}. Learn more at this link.

Using {plotly} with {ggplot2}

{ggplot2} is arguably one of the best data visualization libraries. {plotly} has a very useful {ggplot2} converter that quickly and easily turns {ggplot2} plots into interactive, web-based plots. Set eval=TRUE in the chunk below if you need to install these packages.

#load necessary libraries
install.packages('ggplot2')
install.packages('ggmap')
library('ggplot2')
library('ggmap')
## 
## Attaching package: 'ggmap'
## The following object is masked from 'package:plotly':
## 
##     wind
#List of Countries for ICC T20 WC 2017
ICC_WC_T20 <- c("Australia",
"India",
"South Africa",
"New Zealand",
"Sri Lanka",
"England",
"Bangladesh",
"Pakistan",
"West Indies",
"Ireland",
"Zimbabwe",
"Afghanistan")

#extract geo location of these countries
countries <- geocode(ICC_WC_T20)
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Australia&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=India&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=South%20Africa&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=New%20Zealand&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Sri%20Lanka&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=England&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Bangladesh&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Pakistan&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=West%20Indies&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Ireland&sensor=false
## .
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Zimbabwe&sensor=false
## .
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Afghanistan&sensor=false
#map longitude and latitude in separate variables
nation.x <- countries$lon
nation.y <- countries$lat

#using ggplot to plot the world map
mapWorld <- borders("world", colour="grey", fill="lightblue")

#add data points to the world map
q<-ggplot() + mapWorld + geom_point(aes(x=nation.x, y=nation.y) ,color="red", size=3)

#Using ggplotly() of ployly  to add interactivity to ggplot objects.
ggplotly(q) 
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`

Note: {plotly} is not limited to R. There are a lot of other languages/tools (including Python, MATLAB, Perl, Julia, Arduino) that support it as well.

Moreover, one of the main advantages of {plotly} that I didn’t go over today is that {plotly} plots can easily be shared (and edited by people with no technical background for creating interactive plots) online by uploading the data and using the {plotly} GUI.

If you’d like more resources, this link is an amazing resource.

Hopefully this first part of our module has given you a good grasp on how to create 3d visualizations in R!

Part II: Geometric morphometric analysis in R with 3D data

Geometric morphometrics is a technique used to quantify and analyze patterns in shapes across 2D and 3D specimens. For example, using morphometrics, you could quantify differences in the shape of a specific feature, say, the wing of an insect, within and across multiple species. In this tutorial, you will learn how to import and create landmark data in R for morphometric analysis using the package {geomorph}. The {geomorph} package allows us to make comparisons between specimens across the shapes of fixed points, curves and surfaces.

Install {geomorph}

When installing {geomorph}, making sure dependencies = TRUE so that it also installs packages on which {geomorph} depends to function.

install.packages("geomorph", dependencies = TRUE)

Packages used in this section

library(geomorph)
## Loading required package: rgl
## Loading required package: ape
library(Morpho)
library(abind)
library(curl)
library(knitr)
library(rgl)#Installed with geomorph

Importing 3D data

The {geomorph} package allows you to input 3D data in ASCII .ply format which is a common output from 3D scanners. Here you will load three 3D scans of archaeological arrow heads from an online repository of 3D models sketchfab

Note: this may take a couple of minutes

library(curl)
library(geomorph)
f1 <- curl("https://raw.githubusercontent.com/mazattack/Group_Module/master/Data/A1.ply")
A1<-read.ply(f1) #using Geomorph
A2<-read.ply(curl("https://raw.githubusercontent.com/mazattack/Group_Module/master/Data/A2.ply"))
A3<-read.ply(curl("https://raw.githubusercontent.com/mazattack/Group_Module/master/Data/A3.ply"))

We can inspect this data most simply using plot3d() in the {rgl} package. This package is loaded by default when you load the geomorph package.

plot3d(A1) #Run in R to view in 3D

To import 3D data in binary .ply format, you can use the {Morpho} package.

library(Morpho)
name<-file2mesh(curl("https://raw.githubusercontent.com/mazattack/Group_Module/master/Data/A2.ply"))#using Morpho

Selecting landmarks

For geomorphometric analysis, we must first select landmarks on each of the specimens which will serve as standardized points, across which measurements and comparisons can be made.

Important Requirements for Landmarks

Remember the following requirements for landmarks:
  1. Each image must have the same number of landmarks
  2. The landmarks on each image must be in the same order
  3. Landmarks are ordinarily placed on homologous points, points that can be replicated from object to object based on common morphology, common function, or common geometry
  4. You may have to flip some images so that are not reversed left to right (e.g., if most of your images show the right side, flip left side images so that they mimic right side)

Collecting Landmarks on 2D data or with other software.

Landmarks can be collected with other software, such as ImageJ with Point Picker installed. See these helpful resources: ImageJ – open-source software for image processing Point Picker plugin for ImageJ This user guide on digitizing in ImageJ is really helpful.

We can digitize 2D landmarks on .jpg files using the digitize2d() function in {geomorph}.

This user guide is really helpful for this. Also see 15.1 in this user guide.

The user provides a list of image names, the number of landmarks to be digitized, and the name of an output TPS file. An option is included to allow the user to digitize a scale on each image to convert the landmark coordinates from pixels into meaningful units. Landmarks to be digitized can include both fixed landmarks and semi-landmarks, the latter of which are to be designated as “sliders” for subsequent analysis (see the function define.sliders()).

Selecting fixed landmarks on 3D data

For 3D data, digit.fixed() from the geomorph package is used for selecting fixed landmarks that do not change from specimen to specimen, and digit.curves() and digitsurface() are tools for creating semi-landmarks along curves and surfaces which slide between the fixed landmarks. These sliding landmarks can also be defined from existing landmark data using define.sliders. In this tutorial, we will first define the fixed landmarks for three arrowheads, and create a template of surface semi-landmarks from A1 to calculate the surface landmarks of the other two arrowheads.

Use digit.fixed() to selected fixed landmarks on the first specimen, this will save a .nts landmark file to the directory. You will use the console to confirm each point using y/n.

On Windows machines, double right click to select the landmark.

On a Mac, you must go the Preferences>Input> of XQuartz and tick “Emulate three button mouse”. After that you can press the pad button to rotate the 3D mesh, press the pad button while holding down the COMMAND key to select a vertex to be used as a landmark, and press the pad button while holding down OPTION to adjust mesh perspective. Finally, you can use the mouse trackpad with two fingers to zoom in and out.

This can be kind of troublesome. The easiest way is to orient the arrowhead as shown below, and double right click along the edge where the landmarks are highlighted. You should click in the same order for each set of fixed landmarks so keep track of where you start.

Note: for this tutorial, start at the tip of the arrow and select in a clockwise order.

A1_fix<-digit.fixed(A1, 10)#mesh, number of fixed landmarks to create. If you need to measure the shape of a curve, add in landmarks along the curve to be converted to sliders later on. 
A1_fix #run in R to see output
Select landmarks for A1:start at the tip and work clockwise

Select landmarks for A1:start at the tip and work clockwise

Inspecting landmarks

We can inspect these landmarks using plotspec() to plot the landmarks you have just created on the 3D mesh.

plotspec(A1, A1_fix, centered=TRUE)#these may not overlap unless you specify centered = TRUE

You can now find the A1.nts file in your working directory, this file will be overwritten each time you run digit.fixed on the same specimen, so remember to make a copy if you want to test different landmark points.

Let’s load that landmark file and compare the first sheet of this 3D matrix to the A1_fix created above.

A1_nts<-readmulti.nts(c("A1.nts"))#file must be located in working directory
A1_nts[,,1]
##            [,1]       [,2]        [,3]
##  [1,]  5.392135  26.781212  -1.1900166
##  [2,]  1.153085   7.769062   9.5253604
##  [3,] -2.850464 -13.565064  11.5271464
##  [4,] -2.614961 -14.181503   8.5833474
##  [5,] -3.910227 -19.388313   5.3301014
##  [6,] -4.027974 -20.878967  -0.2479996
##  [7,] -2.261702 -19.859291  -5.7864906
##  [8,] -3.321465 -14.324989  -7.7265306
##  [9,] -1.555205 -10.674698 -11.9124436
## [10,]  1.388584   9.343040 -10.2372466

In order to use the landmark data directly in functions like plotspec(), we must reference the first sheet of data, even if only one specimen is listed in the data set.

plotspec(A1, A1_nts[,,1], centered=TRUE, ptsize=1)
#rglwidget() does not work here

Build template

Now lets build a template using these fixed points, to create semi-landmarks over the surface of the model. This template will be used to create surface sliders for the remaining specimens. “buildtemplate” will overwrite your .nts files with your new landmark data, as well as create a “surfslide.csv” file in the working directory, which codes which landmarks are the surface sliders, and a template.txt which provides the xyz coordinates for the sliders.

A1_temp<-buildtemplate(A1, A1_fix, 50, ptsize=1) #mesh, file of landmark data, number of surface sliders to create, point size
A1_temp
##          xpts          ypts        zpts
## 1   5.3921347  26.781211643  -1.1900166
## 2   1.1530847   7.769061643   9.5253604
## 3  -2.8504643 -13.565064357  11.5271464
## 4  -2.6149613 -14.181503357   8.5833474
## 5  -3.9102273 -19.388313357   5.3301014
## 6  -4.0279743 -20.878967357  -0.2479996
## 7  -2.2617023 -19.859291357  -5.7864906
## 8  -3.3214653 -14.324989357  -7.7265306
## 9  -1.5552053 -10.674698357 -11.9124436
## 10  1.3885837   9.343039643 -10.2372466
## 11  4.4537703   9.606553407  -3.5323943
## 12 -5.7040361 -13.280484691   3.0139545
## 13  4.9117920  24.192585541  -1.3111477
## 14 -1.0770582   9.354254036   1.9663433
## 15  2.1468437  12.637223463   5.3339321
## 16  0.1801809 -14.802753728   3.8311534
## 17 -2.2388204 -10.583474645  10.9776562
## 18 -3.3865358 -12.915156089  -7.5430314
## 19  2.1533774  -5.235518546   0.5011057
## 20  3.8320357  20.653482734   1.4543776
## 21  5.2171483  14.889430347  -0.4481780
## 22 -1.3778368  -5.526367984  10.6638930
## 23  4.1556575   9.119766226   1.8584059
## 24  3.2984165   4.171812919   3.9053466
## 25  2.8538193  14.778629180  -6.0781719
## 26 -2.3382207 -18.203131311  -4.8897784
## 27  4.0006202  19.647610838  -3.9223034
## 28 -0.4685724  11.004656218  -3.2185769
## 29  0.1938673   3.520530381   8.7693949
## 30  1.4469485  -7.471893454  -5.0309487
## 31 -0.4623886  -4.870566549  -9.7538491
## 32 -4.2672636  -6.825422131  -5.3414744
## 33 -3.7150773  -1.753741975   5.5388882
## 34 -0.8749306 -19.895291623   0.4636897
## 35 -2.4445590 -13.683080601   7.7204318
## 36  2.9827543  16.909800656   4.3444887
## 37  3.8261181   4.249547199  -3.3863265
## 38  3.0316839   0.224395689   0.5006475
## 39 -0.7820710  -0.846329422  10.2930243
## 40 -1.9987165   5.201005474  -3.5914819
## 41  1.7625068  -1.980814976   5.7757258
## 42  0.9152349  -8.375741948   6.1030879
## 43 -5.3712740 -18.911947822  -0.2043002
## 44  1.0529804   4.753642072  -8.8105696
## 45 -5.4460987 -12.654918341  -2.5646604
## 46 -2.5472954  -0.733400432  -5.6700502
## 47 -4.2821068  -8.131734302   6.2020940
## 48 -4.2696862  -0.831113650  -0.5408380
## 49  1.2854595 -10.291046292   0.3257109
## 50 -1.3241917  -9.568200892 -10.1774788
## 51  0.3309512 -15.258855608  -1.6138203
## 52  1.0915558   8.092562257   7.2612813
## 53  0.3520840 -12.770266288  -5.8755705
## 54  2.7760018  -2.022272309  -4.9170187
## 55 -3.5848083 -18.256893014   4.9426703
## 56 -2.5477780   3.600635031   2.9823336
## 57  0.6389734  15.878246496  -0.3616698
## 58  1.6090949   9.673399468  -7.9491016
## 59 -5.4001820  -7.018015960   0.6862323
## 60  0.6828123  -0.001177774  -9.4416204

Now let view our template. How well does it match our 3D specimen?

plotspec(A1, A1_temp, ptsize=5, centered=T)#run in R studio for output

If you need to edit the template you can use editTemplate() to remove n points from the template.

A1_temp_ed<-editTemplate(template=A1_temp, fixed=8, n=1)#template, # of fixed points, number of points to be removed

Create semi-landmarks

Finally, we can use that template to create a series of landmarks for the remainder of our sample. In digitsurface() we can either use existing landmark data for that specimen to align the semi-landmarks across the surface of the specimen, or create the landmarks directly within the function, the same way we did with digit.fixed(). Let’s try out both ways here. Remember, the fixed landmarks must be in the same location and order in each specimen. This function will also read in the template data in the working directory to assign the landmarks.

digitsurface(spec=A2, fixed=10, ptsize=1, centered=TRUE)

We can also run the digitsurface using existing landmark data. Let’s use an .nts file I created for A3. First read in the data. Since we are working with .nts files, we can use readmulti.nts() to load the data.
Download the landmark file A3.nts from https://github.com/mazattack/Group_Module/tree/master/Data and place it in your working directory. Reminder, to see your working directory, use getwd().

A3_nts<- readmulti.nts(c("A3.nts"))
digitsurface(A3, A3_nts[,,1], 1, TRUE)

Finally, because two of our initial fixed landmarks fall along a curve with no distinguishing features, we should define them as sliders too. We must select the landmark at the start of the curve, the slider and the landmark at the end of the curve. In the following function, select landmarks 3, 2, 1, and 1, 10, 9. NOTE: that usually, you would want to place many more landmarks along a curve, and they should be defined as overlapping sliding landmarks, i.e. 1, 2, 3 and 2, 3, 4, and 3, 4, 5. Select sliders: start, slider, finish, i.e. 3, 2, 1

sliders<-define.sliders(landmarks=A1_temp, nsliders=2, surfsliders = TRUE, write.file = TRUE)
sliders
##      before slide after
## [1,]      3     2     1
## [2,]      1    10     9

Now that we have imported all our 3D models, loaded and created the fixed and semi-landmarks, we need to transform them into a format for geomorphometric analysis. The easiest way to do this is load the raw landmark data into a 3D array.

Importing landmark data

Whether you used R or ImageJ to collect digitized landmark coordinates, you can use R to import those landmark data. Landmark data brought into R can be in a variety of formats. Our functions can deal with the most common: .tps files, .nts files, and Morphologicka files. All of these functions return a 3D array, which is the preferred data format for landmark data.
  • If you have a single .nts file with the landmarks of multiple specimens use readlands.nts() to import.
  • If you have multiple .nts files with the landmarks of a single specimen use readmulti.nts().
  • If you have a single .tps file, used readlands.tps().

First you must create a filelist containing an array of the file names you will import.

filelist <- list.files(pattern = ".nts")#This will search the working directory (default) for the files containing .nts
filelist

Here we will use readmulti.nts() to load the three .nts files we created containing the fixed and semi-fixed landmarks. The number of landmarks in each file must be identical.

arrow_nts<-readmulti.nts(filelist)
str(arrow_nts)
arrow_nts #note how the data is formatted
##  num [1:60, 1:3, 1:3] 5.39 1.15 -2.85 -2.61 -3.91 ...
##  - attr(*, "dimnames")=List of 3
##   ..$ : NULL
##   ..$ : NULL
##   ..$ : chr [1:3] "A1_surf" "A2_surf" "A3_surf"
## , , A1_surf
## 
##             [,1]        [,2]        [,3]
##  [1,]  5.3921347  26.7812116  -1.1900166
##  [2,]  1.1530847   7.7690616   9.5253604
##  [3,] -2.8504643 -13.5650644  11.5271464
##  [4,] -2.6149613 -14.1815034   8.5833474
##  [5,] -3.9102273 -19.3883134   5.3301014
##  [6,] -4.0279743 -20.8789674  -0.2479996
##  [7,] -2.2617023 -19.8592914  -5.7864906
##  [8,] -3.3214653 -14.3249894  -7.7265306
##  [9,] -1.5552053 -10.6746984 -11.9124436
## [10,]  1.3885837   9.3430396 -10.2372466
## [11,]  0.7118199 -13.2396022   1.6294202
## [12,] -1.0328911   9.5948088   1.6780208
## [13,] -5.1877265 -19.5985404   0.5350432
## [14,]  0.8291342  -9.0031815   5.9500120
## [15,] -1.9399928 -16.1946884  -5.7986734
## [16,] -4.8963164  -7.5420406   3.7620697
## [17,]  1.9366701  -6.3936997   0.5550643
## [18,] -3.4919882   1.8915362  -0.3846688
## [19,]  2.5150922  14.0044187   5.0793966
## [20,]  0.5504871   3.7593598   8.7322783
## [21,]  3.8170475   4.0732281  -3.3009828
## [22,]  3.3011380  19.2511788   2.5967110
## [23,]  0.3624545  14.7685718  -0.9295669
## [24,]  1.6174756  -6.4096829  -5.4025452
## [25,] -5.3771478 -11.1631597  -3.0231836
## [26,] -2.1170878   4.1193530   4.1697655
## [27,]  3.8909254   7.4495170   2.5583395
## [28,] -1.5356012  -5.1510396  10.4109568
## [29,] -4.9354894  -4.4238612  -1.0903666
## [30,]  1.8091652  11.7754130  -6.9340777
## [31,] -1.1409270  -9.6672995 -10.1136096
## [32,] -2.6219998  -0.4075720  -5.2936410
## [33,] -0.5932962  -5.4802950  -9.9747791
## [34,]  3.0478971  -1.5326611  -3.1909354
## [35,] -2.2312124  -9.2456254  10.8845701
## [36,]  2.9067544   1.5944725   2.7767004
## [37,] -3.9380343  -6.7594475  -6.1480453
## [38,] -5.9641783 -14.0445813   0.9818600
## [39,]  4.4639794   9.3913719  -3.2347430
## [40,]  1.7030942  -2.7344765   5.8654706
## [41,] -4.6352918 -11.8146639   6.1589170
## [42,]  0.8748450 -11.6851487  -3.7305609
## [43,]  3.2235279  15.9775483  -5.5111843
## [44,]  4.7994132  23.9178024  -0.8297178
## [45,] -3.7903236  -1.5745766   5.1272946
## [46,]  1.3981439   7.4592797  -8.5234218
## [47,]  5.0716760  14.0603582   0.1296163
## [48,] -3.9875676 -18.7835488  -3.5974459
## [49,] -4.4467593 -18.0265011   4.8017696
## [50,]  4.1421860  19.9761345  -3.1879060
## [51,] -2.6070201 -12.5263912  -8.0957327
## [52,] -0.7722718 -19.2063836   2.1620516
## [53,]  0.5348547  -1.3090519  -9.3399567
## [54,] -2.0177849 -12.6362456   9.8693319
## [55,] -0.4264653 -18.8973102  -2.4626720
## [56,]  1.2593890   8.6192913   7.0333507
## [57,]  0.8678125   3.0688689  -8.7673185
## [58,] -1.4604878   7.2536226  -3.7499967
## [59,] -1.3670775 -15.5502337   6.2059237
## [60,] -0.7768476  -0.5905056  10.1934558
## 
## , , A2_surf
## 
##             [,1]       [,2]        [,3]
##  [1,]  -1.725997  30.667931 -3.36861336
##  [2,]   9.632420   6.455460 -2.57827136
##  [3,]  11.865998 -18.440567  1.90037964
##  [4,]   9.075469 -18.504761  1.90037764
##  [5,]   7.889933 -20.565071  2.55900864
##  [6,]   0.381615 -21.022293  5.85212564
##  [7,]  -7.182828 -22.128883  4.53488364
##  [8,]  -9.232496 -19.362633  4.00797564
##  [9,] -12.628223 -19.099205  3.74453264
## [10,] -10.824524   8.036152  0.45140364
## [11,]   0.974368 -16.794021 -0.03081236
## [12,]   3.411294   9.089951  1.65983064
## [13,]   1.830592 -20.350586  5.80859964
## [14,]   5.057856 -13.435028 -1.16540036
## [15,]  -8.118024 -20.548172  4.00798264
## [16,]   5.387168 -10.800522  4.56284564
## [17,]  -0.540460 -10.668793 -0.68527736
## [18,]  -0.408733  -0.789421  4.04120164
## [19,]   5.518891  14.424816 -1.94662236
## [20,]   9.075465   1.449906 -2.20544736
## [21,]  -4.985583   1.186455 -1.91964636
## [22,]   3.236870  21.208656 -1.65619236
## [23,]   0.118167  15.544480  1.11101164
## [24,]  -6.994983 -11.854324  0.22430864
## [25,]  -3.438413 -15.147445  6.46213164
## [26,]   5.914067   2.240257  1.51787864
## [27,]   1.567141   5.665116 -3.91421636
## [28,]  10.656167  -8.870422  0.18795064
## [29,]  -0.408735  -8.029709  5.32523264
## [30,]  -8.180509  11.065834 -1.73974136
## [31,] -11.210180 -17.123314  2.67680064
## [32,]  -5.414281  -4.214264  3.55185364
## [33,] -11.144323 -12.117767  1.68291964
## [34,]  -4.755665  -5.729111 -1.07965436
## [35,]  11.446521 -13.566742  0.54381464
## [36,]   1.764731  -1.645637 -2.65684236
## [37,]  -6.731534 -12.512947  5.94223664
## [38,]   2.555079 -16.728157  5.80938064
## [39,]  -4.985662   8.167877 -2.70999836
## [40,]   4.991997  -6.453598 -2.49311236
## [41,]   7.626492 -15.937797  4.02377564
## [42,]  -5.546009 -16.859878  0.52546364
## [43,]  -6.665676  16.664150 -2.12045836
## [44,]  -1.989436  27.268009 -4.35311536
## [45,]   6.572680  -4.477730  2.20513564
## [46,]  -9.497763   5.467522 -0.63276236
## [47,]  -1.330816  14.358956 -4.17046436
## [48,]  -3.833593 -21.512405  5.45695564
## [49,]   7.186564 -19.626098  3.34935964
## [50,]  -4.230649  21.472115 -4.02724736
## [51,]  -9.497760 -18.572304  4.21737164
## [52,]   2.620928 -21.501098  2.95418764
## [53,] -10.815013  -6.519447  0.69329464
## [54,]  10.260989 -17.518493  0.89612864
## [55,]  -3.306684 -22.056267  3.87624764
## [56,]   7.626490   7.509251 -3.56848236
## [57,]  -9.892934  -0.525963  1.52550364
## [58,]  -3.965307   5.665116  2.85232064
## [59,]   6.967864 -19.049667  0.84657764
## [60,]   9.907103  -3.292190 -0.60239536
## 
## , , A3_surf
## 
##             [,1]        [,2]       [,3]
##  [1,] -0.9583439  15.7026567 -1.0388217
##  [2,]  7.0204201   4.8344237 -1.4431017
##  [3,]  8.7001631  -3.7411523 -2.9985197
##  [4,]  5.4806591  -6.5038323 -2.3232227
##  [5,]  7.3586771 -12.3829143 -2.7185627
##  [6,]  0.7913691 -12.5535073  1.0608503
##  [7,] -6.2775209 -12.0377073  5.1202263
##  [8,] -5.7176069  -5.8039423  2.8572013
##  [9,] -9.6811559  -2.3044843  3.4404783
## [10,] -6.9178849   6.3741887  1.2708253
## [11,] -3.1979969  -8.6734873  3.9805363
## [12,]  4.0808751   7.6339847 -1.8228887
## [13,]  3.2409921 -11.1231183 -1.8059177
## [14,]  0.4414321  -4.9640603  2.2825613
## [15,] -6.1375439  -8.3449813  4.7002913
## [16,]  6.8804431  -0.8481323 -3.4184607
## [17,] -4.5977809  -3.9842173  3.2946043
## [18,]  4.5105211   4.5544557 -2.7885507
## [19,]  4.2085151   8.4738517 -1.5987297
## [20,]  6.6704731   3.2946517 -0.6760707
## [21,] -7.1873809   1.4749427  2.9306173
## [22,]  2.5411211  11.6933487 -1.6301987
## [23,]  2.5411141  10.2935687 -2.1779077
## [24,] -8.5871629  -3.1443583  3.6914673
## [25,]  1.4912791  -3.0552623 -3.6984057
## [26,]  6.0405711   4.4144867 -2.2581937
## [27,] -2.0081829   3.2946447  2.2433613
## [28,]  7.8602981  -0.2048063 -2.9174917
## [29,]  4.3608301   1.8948797 -3.4540187
## [30,] -4.6677599   7.9139297  1.0803683
## [31,] -8.7971309  -2.4444683  3.5984503
## [32,] -1.3782729   3.2246597 -3.0837667
## [33,] -8.3540419  -0.4147743  2.9505643
## [34,] -7.6959419  -1.8845473  3.3005093
## [35,]  8.5601841  -2.0245233 -2.7605607
## [36,] -2.2181509   0.4250917  2.6848813
## [37,] -1.9381909   0.1451387 -2.6757997
## [38,]  5.5450251  -4.9640603 -2.9985057
## [39,] -6.4175069   4.2045107  2.5259403
## [40,]  0.5114291  -1.6045943  2.4636633
## [41,]  7.5103451  -3.5642883 -3.6428947
## [42,] -7.2573589  -8.1835813  4.7116093
## [43,] -4.2478229   9.5936707  0.6551483
## [44,] -0.5384259  14.2129637 -0.3594837
## [45,]  6.7787361   2.3148167 -2.2986347
## [46,] -6.2075389   5.6042987  2.0207203
## [47,] -3.2679869   6.8640877  1.6778663
## [48,] -1.3782809 -10.3532283  2.5591473
## [49,]  7.1604041 -10.4232283 -3.2466637
## [50,] -2.7780649  11.9733017  0.6771413
## [51,] -6.3475079  -4.5441233  2.9275613
## [52,] -1.6582449 -12.6208833  2.3206643
## [53,] -8.3072089   1.1249897  3.0097303
## [54,]  6.2505361  -4.8009523 -1.8786887
## [55,] -5.9975699 -12.1729613  5.0482033
## [56,]  5.6206391   5.7442827 -1.8978207
## [57,] -7.0474029   3.2946517  2.6492023
## [58,]  1.0713391   7.0740707 -2.9897337
## [59,]  3.0571711  -9.5133693  0.5709333
## [60,]  7.7203211   1.5449267 -2.0778877

Importing .nts/.tps in bulk

To import multiple .nts files with multiple specimens or multiple .tps files, you can use the function below. Simply replace the two .nts sections with .tps.

require(abind)
filelist <- list.files(pattern = ".nts") #this will list files in the wd, use "path=" to specify files in another location
mydata <- NULL
for(i in filelist){
  tmp <- readland.nts(i)
  mydata$coords <- abind(mydata$coords, tmp)
  mydata$names <- rbind(mydata$names,cbind(dimnames(tmp)[[3]], i))
}
colnames(mydata$names) <- c("file") #If the file specifies species ID, I think this should be c("specimen", "file")

Note: There are more variables to specify with .tps, including species ID names and curve data. See the help file for more info.

Viewing landmark data

We can view the landmark data all together using plotAllspecimens()

plotAllSpecimens(arrow_nts)
## rglwidget()

You’ll notice that the arrowhead landmarks are not aligned to one another. For this, we need to use a function that with align the models based on the landmarks we selected.

GPAGEN function

Generalized Procrustes Analysis (GPA) is a common statistical analysis for quantifying and comparing the shape of objects. The gpagen() function aligns and superimposes the landmark data and calculates the mean shape of those objects combined. The output of the gpagen() is the primary data set for subsequent analyses and statistical tests that describe the shape of the objects.

GPA<-gpagen(arrow_nts, curves=sliders, surfaces = 11:60)#11:60 are the numbers assigned to the landmarks we created with digit.surf, the first 10 are fixed, and we created 50 semi-landmarks 11:60
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=============                                                    |  20%
  |                                                                       
  |==========================                                       |  40%
  |                                                                       
  |=======================================                          |  60%
  |                                                                       
  |====================================================             |  80%
  |                                                                       
  |=================================================================| 100%

If we examine GPA output in the console, it will provide us info about what analyses was run, the dimensions of the model, and the mean coordinates of the landmarks across our specimens. Note summary(GPA) provides the same output. View(GPA) provides an overview of what is in our list.

summary(GPA)
## 
## Call:
## gpagen(A = arrow_nts, curves = sliders, surfaces = 11:60) 
## 
## 
## 
## Generalized Procrustes Analysis
## with Partial Procrustes Superimposition
## 
## 8 fixed landmarks
## 52 semilandmarks (sliders)
## 3-dimensional landmarks
## 6 GPA iterations to converge
## Minimized squared Procrustes Distance used
## 
## 
## Consensus (mean) Configuration
## 
##                  X             Y             Z
##  [1,]  0.261934372  0.0006707571 -0.0007951224
##  [2,]  0.087614675  0.0894877256 -0.0024921778
##  [3,] -0.088156364  0.1078394347  0.0016415368
##  [4,] -0.101260511  0.0766791462  0.0067023909
##  [5,] -0.149569914  0.0710337308  0.0044389376
##  [6,] -0.155613091 -0.0005919932 -0.0074425870
##  [7,] -0.149971742 -0.0746877342 -0.0029248945
##  [8,] -0.097345064 -0.0777588959 -0.0037064919
##  [9,] -0.067437322 -0.1166094934  0.0030098608
## [10,]  0.109023218 -0.0830343529 -0.0009494468
## [11,] -0.103898385 -0.0165534849  0.0081106398
## [12,]  0.116600474  0.0530001120 -0.0090581846
## [13,] -0.146435230  0.0203944627  0.0063002850
## [14,] -0.073726599  0.0377642909  0.0087751691
## [15,] -0.118331593 -0.0750397798 -0.0052549106
## [16,] -0.024907715  0.0964087615  0.0037375495
## [17,] -0.033285495 -0.0300550294  0.0022133770
## [18,]  0.065397710  0.0617579984 -0.0101361976
## [19,]  0.144917868  0.0514195520 -0.0037147041
## [20,]  0.052774485  0.0798708105  0.0096934193
## [21,]  0.054261446 -0.0629155462  0.0039480846
## [22,]  0.199555765  0.0381146290 -0.0058983376
## [23,]  0.164259447  0.0231051231 -0.0167054377
## [24,] -0.039693849 -0.0709535799  0.0062357414
## [25,] -0.079586243 -0.0049793219 -0.0299725196
## [26,]  0.063209397  0.0787728747 -0.0063718867
## [27,]  0.093172977  0.0129419275  0.0109605354
## [28,] -0.017360707  0.1117929552  0.0059779538
## [29,] -0.003359777  0.0572687424 -0.0004980589
## [30,]  0.129501404 -0.0528072554  0.0037386607
## [31,] -0.062962342 -0.1071494281  0.0029967236
## [32,]  0.017504801 -0.0392336150 -0.0003770869
## [33,] -0.023853329 -0.0979558683  0.0033999411
## [34,]  0.001580971 -0.0562152989  0.0081938805
## [35,] -0.054027673  0.1107769933  0.0109893169
## [36,]  0.039304566  0.0018285705  0.0010785961
## [37,] -0.026780523 -0.0394772183 -0.0122618769
## [38,] -0.101712408  0.0458801825 -0.0128702217
## [39,]  0.108298497 -0.0657175741  0.0043065322
## [40,] -0.007165797  0.0370406732 -0.0104661267
## [41,] -0.079379972  0.0785841291 -0.0092092024
## [42,] -0.089961710 -0.0665508296  0.0009881064
## [43,]  0.168536046 -0.0428981446  0.0028222192
## [44,]  0.241553483  0.0015406608 -0.0001321844
## [45,]  0.021637981  0.0889876455 -0.0102136291
## [46,]  0.091925925 -0.0738782265  0.0001169097
## [47,]  0.153721015 -0.0254324880  0.0109407299
## [48,] -0.142279904 -0.0357082978 -0.0061010534
## [49,] -0.134664552  0.0652769693  0.0038410596
## [50,]  0.207205811 -0.0250801026  0.0046679692
## [51,] -0.082462065 -0.0801971333 -0.0026387111
## [52,] -0.155559974  0.0012055356  0.0057301045
## [53,]  0.010965777 -0.0983228631  0.0021993008
## [54,] -0.084845258  0.0843772483  0.0081572925
## [55,] -0.141566710 -0.0491979359 -0.0031880055
## [56,]  0.098189503  0.0669299644  0.0021106736
## [57,]  0.051269998 -0.0832045432 -0.0049722034
## [58,]  0.093583715  0.0002784144  0.0084570572
## [59,] -0.131295314  0.0528241124  0.0046998746
## [60,]  0.016235328  0.0955922255  0.0032178918

You can also view in RStudio with:

View(GPA)
Inspect the different data sets attached to GPA and read their descriptions below.
  • coords
    A (p x k x n) array of aligned Procrustes coordinates, where p is the number of landmark points, k is the number of landmark dimensions (2 or 3), and n is the number of specimens. The third dimension of this array contains names for each specimen if specified in the original input array.
  • Csize A vector of centroid sizes for each specimen, containing the names for each specimen if specified in the original input array.
  • iter
    The number of GPA iterations until convergence was found (or GPA halted).
  • points.VCV
    Variance-covariance matrix among landmark coordinates.
  • points.var
    Variances of landmark points.
  • consnsus
    The consensus (mean) configuration.
  • p Number of landmarks.
  • k Number of landmark dimensions.
  • nsliders
    Number of semilandmarks along curves.
  • nsurf Number of semilandmarks as surface points.
  • data
    Data frame with an n x (pk) matrix of Procrustes residuals and centroid size.
  • Q Final convergence criterion value.
  • slide.method
    Method used to slide semilandmarks.
  • call
    The match call.

Plot aligned landmarks

For our purposes, the most important part of the GPA list is the $coords data, the coordinates of the aligned landmark data for each specimen.

Let’s plot GPA to see our aligned landmark data (in grey) and the mean shape of the arrowheads (in black).

plotAllSpecimens(GPA$coords)

To see the mean shape more clearly, we can define the links between the points using a matrix which defines the order to link the points together.

alinks <- matrix(c(1,rep(2:10, each=2),1), nrow=10, byrow=TRUE) #we have 10 landmarks. for an example with "n" landmarks, change both "10" to "n". 
alinks
##       [,1] [,2]
##  [1,]    1    2
##  [2,]    2    3
##  [3,]    3    4
##  [4,]    4    5
##  [5,]    5    6
##  [6,]    6    7
##  [7,]    7    8
##  [8,]    8    9
##  [9,]    9   10
## [10,]   10    1
plotAllSpecimens(GPA$coords, links=alinks)

To view the mean shape created by gpagen, we can use the function mshape():

meanarrow<-mshape(GPA$coords)
plot(meanarrow)

Visualizing differences in shape

To visualize the differences between the specimens and the mean, we can use the function plotRefToTarget(), with M1 as the reference specimen and M2 as the target specimen.

plotRefToTarget(M1=meanarrow,M2=GPA$coords[,,1],method="points") #M2 here is the first specimen A1

If we add our alinks for the outline, we can add an outline of the target (black) and reference (gray) shape to the plot.

plotRefToTarget(meanarrow,GPA$coords[,,1],method="points", links = alinks)

In this next example, the target points are colored red, and their links are blue. Target=A3, Ref=Mean shape:

plotRefToTarget(meanarrow,GPA$coords[,,3],gridPars=gridPar(tar.pt.bg = "red", tar.link.col="blue",
tar.link.lwd=2), method="point", links = alinks)

We can also compare across two specimens. Target=A1, Ref=A3:

plotRefToTarget(GPA$coords[,,3], method="point", GPA$coords[,,1],links = alinks)

Part III: Multidimensional Trait Analysis

Biologically meaningful phenotypes are often multi dimensional simply because real phenotypes are under multidimensional selection, resulting in modular subunits which interact as a complex, functional phenotype. (For example, if we are interested in the evolution of “Brain Size”, we need to take into account how selection operates on different regions of the brain). Therefore, it is necesary to have analyses that can quantify and statiscally contrast multi-dimensional traits.

The following are examples of such analyses using landmark data that quantify 11 dimensions of head shape for 40 Plethodon salamander species. This will include statisical programs such as PCA applied to high dimensional data sets as well as more novel analyses that allow analysis of phylogenetic information in high dimensional data. All of these analyses work with data that have been alligned via a GPA.

Basic Analyses

First, load and allign the Plethodon species data set that came with the geomorph package. These data consist of 11 head shape landmarks whose dimensions are alligned to a common x,y grid to produce 11 sets of coordinates that specify a single point in 11 dimensional trait space.

library(geomorph)
data(plethspecies)
Y.gpa<-gpagen(plethspecies$land)
Y.gpa
Call:
gpagen(A = plethspecies$land)

Generalized Procrustes Analysis
with Partial Procrustes Superimposition

11 fixed landmarks
0 semilandmarks (sliders)
2-dimensional landmarks
2 GPA iterations to converge

Consensus (mean) Configuration

               X           Y
 [1,]  0.21321198 -0.02105584
 [2,]  0.24769363 -0.08046387
 [3,] -0.02712554 -0.01550055
 [4,] -0.26228896 -0.09485409
 [5,] -0.29017433 -0.06350496
 [6,] -0.31712433 -0.03118322
 [7,] -0.31143955  0.04394055
 [8,] -0.17573253  0.10754464
 [9,]  0.05243877  0.11186878
 [10,]  0.24134839  0.07648746
 [11,]  0.62919248 -0.03327890

As you can see, the means of the 11, alligned 2D landmarks are arranged in a table. To view the common, unitless grid containing 11 alligned points, use the following function with the landmark coordinates extracted:

plotAllSpecimens(Y.gpa$coords)

You can obtain a summary of the PCA and plot of the first 2 PCs with the following functions. The summary shows the exact and cummulative proportion of total variance explained by most of the PCs.First, convert the data into a 2D data matrix organized by species and then do the PCA and then obtain a plot.

y<-two.d.array(Y.gpa$coords)
plotTangentSpace(Y.gpa$coords) 
PC Summary

Importance of first k=8 (out of 9) components:
                          PC1     PC2     PC3      PC4      PC5     PC6      PC7      PC8
Standard deviation     0.01649 0.01059 0.01042 0.007543 0.005143 0.00355 0.002441 0.001259
Proportion of Variance 0.45640 0.18799 0.18199 0.095450 0.044380 0.02115 0.010000 0.002660
Cumulative Proportion  0.45640 0.64439 0.82637 0.921820 0.966200 0.98734 0.997340 1.000000

Phylogenetic Comparative Biology

We can also use {geomorph} to analyze the phylogenetic information contained in multi-dimensional data. For instance, it is useful to study divergence of evolutionary rates between closely related species. An evolutionary rate (ER) can be defined as the net rate of phenotypic or genetic variance over evolutionary time. ERs can be analyzed at different levels of biological organization (from species to populations to within individual variation). From analyzing ERs, we can begin to determine if certain phenotypes or genotypes are more or less evolutionarily labile in order to explain phenotypic differences of interest. We can use the compare.evol.rates() and associated functions to do this.

Function Assumptions

  • Estimation of ERs is done using either “R” or “Q” approaches. “R” approaches use matrices of ER variances and convariances among different species within a group. “Q” approaches use distance matrices (here, this means pairwise contrasing distances of species from the ancenstral node on a phylogenetic tree). The Q approach is beneficial for analysing multi-dimensional traits because although the number of covariance vlaues is dependent on trait dimensionality, the number of species pairwise distances is not. This allows for hypothesis testing without an increase in the probability of Type I error. The use of the Q approach has two implications for the analysis. First, it necesaitates that the data already be organized into a phylogeny in order for the distance matrix to be constructed. Second, for the sake of simplicity, the model of evolution used here is that of Brownian Motion. For more extensive description of Evolutionary Rate Analysis, see Adams (2013).

  • Under Brownian Motion, particles move multiple times on different axes, with each subsequent displacement being random and independent with respect to the previous displacement and any other particle’s displacement. As a result, the mean displacement of a particle is 0 and the net variance in the displacement only varies with time (σ²t). For our purposes, we can think of the particle as the traits of a phenotype moving under Brownian motion around an ancestral node in a high-dimensional morphospace. Evolution under brownian motion can be sumulated many times to produce a distribution of probabilities for the end result of where a given phenoype will existrelative to the ancestral node. The resulting distribution is normal, with a mean of 0 (no net displacment from the ancestral node) and variance as σ²t. The compare.evol.rates() function is most useful if you already have a well established phylogeny with known branch lengths and species distributions. For a more extensive description of the use of brownian Motion in phylogenetic analysis and a detailed text on phylogenetic analysis, see Felsenstein (2004).

  • The analysis presented here makes use of landmark data already applied to a 3D object and contrasts among groups of speces. Again, multi-dimensional data such as 3D landmarks are well suited for this analysis, but any type of multi-dimensional data is applicable and this analysis is only limited to contrasting objects that have evolved from a LCA and have well established phylogenies. Species groups are one example used here, but the same could be done for any other evolving thing.

Overview of How the Function Works

For the example here, the ERs of 2 species groups will be contrasted. The function calculates the average distance of each species pheotypic value from the ancestral node, calculates the multivariate evolutionary rates for the phenotypic value of each species group, and determines a ratio of the higher rate divided by the lower rate, a value that serves as a test statistic to contrast with a distribution of ratios for the null hypothesis of no difference between group ERs (i.e. a statistically single ER for the whole phylogeny)

The null distribution is constructed via calculation of a single ER for the entire phylogeny. Using the phylogeny and the single ER, simulated phenotypic data sets are produced at the tips of the phylogeny. The number of these simulations is manually determined, but it should be sufficiently hight to produce a reliable null distribution. Using this artificial data, ERs are calculated for the two groups after each simulation and the frequencies of the resulting ERs are used to create a null distribution for the analysis.

Working Through The Function

  1. The Plethodon species data used here is the same as was used for earlier analyses. Call the Plethodon data and allign with a GPA if not done already. Again, this data consists of 11 head shape landmarks that will be alligned to produce 11 sets of coordinates (each coordinate will have a y and x component) that specify a single point in an 11 dimensional trait space.
library(geomorph)
data(plethspecies)
Y.gpa<-gpagen(plethspecies$land)

Example output for the landmarks of one species.

$coords
, , P_cinereus

          [,1]        [,2]
[1,]  0.21608543 -0.02080337
[2,]  0.25310261 -0.07708298
[3,] -0.01792156 -0.01448663
[4,] -0.26651651 -0.09752395
[5,] -0.28724162 -0.06453206
[6,] -0.31967119 -0.03239812
[7,] -0.31362497  0.04171542
[8,] -0.17675527  0.10909564
[9,]  0.05039109  0.11565174
[10,]  0.23977149  0.08099904
[11,]  0.62238049 -0.04063473
  1. Now, establish the two groups of data whose means you’d like to contrast. Here, we’ll differentiate between landmark data of endangered (designated level 1) and non-endangered species (designated level 2). These are organized as in the original data set provided in the package.You can also view what you have with the names() function. Note that it will be disorganized if you don’t extract the phylogenetic data with the \(phy\)tip suffix to plethspecies.
gp.end<-factor(c(0,0,1,0,0,1,1,0,0))
gp.end
names(gp.end)<-plethspecies$phy$tip
names(gp.end)
[1] 0 0 1 0 0 1 1 0 0
Levels: 0 1
[1] "P_serratus"       "P_cinereus"       "P_shenandoah"     "P_hoffmani"      
[5] "P_virginia"       "P_nettingi"       "P_hubrichti"      "P_electromorphus"
[9] "P_richmondi"     
  1. Now the data is ready for running the actual analysis. Use the following function and inputs: compare.evol.rates(A, phy, gp, iter = 999, method = c("simulation","permutation"), print.progress = TRUE)

A = the data set in a 2D matrix (Here, this is the landmark coordinate data set alligned using the Y.gpa function earlier (Y.gpa$coords))

phy = the phylogeny in which the data is organized (Here, this is the data set with the phylogeny extracted (plethspecies$phy))

gp = the factor array that is defined for group membership (Here, this is gp.end)

iter = the number of iterations for significance testing

method = method to assess significance (simulation or permutation) : a permutation test will reshuffle the data to create sampling distributions for the test statistic without resampling. : a simulation test will allow resampling

ER<-compare.evol.rates(A=Y.gpa$coords,phy=plethspecies$phy,method=c("simulation"),gp=gp.end,iter=999, print.progress = TRUE)
ER
Call:

Observed Rate Ratio: 1.8372

P-value: 0.318

Based on 1000 random permutations

The rate for group 0 is 1.79641754369177e-06  

The rate for group 1 is 3.30041235292046e-06 

The plot() function returns the null distribution showing the frequency of each rate ratio given 1000 iterations of brownian evolution at a fixed rate for the entire phylogeny. The arrow indicates where your test statistic falls on the distribution. Note that because this distribution is recreated for every analysis, although your observed rate ratio will always be the same for contrasting the two given groups, the associated ratio test statistic and p-value will vary as the null distribution changes.

plot(ER)

Based on the null distribution and summary data, the observed rate ratio will occur about 40/1000 (RR=1.8) times (p=0.338) if there is only one true rate for the phylogeny. If our alpha value is 0.05, our data do not support a hypothesis that the species have different evolutionary rates based on their status as endangered or not.

Calculating Phylogenetic Signal

We can not only calculate ER differences between defined groups of species, but estimate whether divergence of sister species is equivalent to divergence between non-sister species using analysis of phylogenetic signal. This is an estimation of the degree of covariation in phenotype due to shared ancestry. The mean head shape data points among the Plethodon species shouldn’t be considered independent data because they all have shared evolutionary history and some species groups are more closely related than others. Therefore, geomorph provides a simple function for estimating the strength of phylogenetic signal in high dimensional data.

The strength of phylogenetic signal (PS) is estimated with a K Statistic that is used to contrast an observed PS to the PS expected under Brownian Motion. The K statistic is a complex ratio. The numerator is observed phenotypic variation/phenotypic variation under phylogenetic covariation.The denomonator is the ratio of phenotypic variation under Brownian Motion/the number of species. As with the ER analysis, geomorph uses a distance-based analysis for estimating PS and the null distribution is constructed analogously: Given a phylogeny, brownian evolution is simulated a specified number of times and a frequency distribution of simulated K values are obtained to which the actual K value is contrasted. The funtions for the PS analysis and distribution are as follows, where A = the data set used and phy = the extracted phylogenetic information. For a more detailed review of PS analysis for multi-dimensional data, see Adams (2014).

PS<- physignal(A=Y.gpa$coords,phy=plethspecies$phy,iter=999)
summary(PS)
plot(PS)
Call:
physignal(A = Y.gpa$coords, phy = plethspecies$phy, iter = 999) 

Observed Phylogenetic Signal (K): 0.9573

P-value: 0.011

Based on 1000 random permutations

As you can see, the K value is in the upper tail of the distribution with a p value <0.5. Given an alpha of 0.05, we have support for rejecting the null hypothesis that evolution of these species occuring randomly. Those species who are sister species are more related than would be expected if brownain evolution had occured, the divergence between sister species is not equivalent to the divergence between non-sister species and therefore we should consider phylogeny when doing statisics on these data. Therefore, these phenotypic data support that the species are not all equally related.

For an overview of all of {geomorph} functions and datasets, see the Package Documentation by Adams et al (2017)

References Cited:

Adams, D.C.; Collyer,M; Kaliontzopoulou,A; Sherratt,E (2017) Geometric Morphometric Analyses of 2D/3D Landmark Data. URL: http://geomorphr.github.io/geomorph/

Adams, D.C. (2014) A Generalized K Statistic for Estimating Phylogenetic Signal from Shape and Other High-Dimensional Multivariate Data. Syst. Biol.63(5): 685-697

Adams, D.C. (2013) Quantifying and Comparing Phylogenetic Evolutionary Rates for Shape and Other High-Dimensional Phenotypic Data. Syst. Biol. 63(2):166–177

Adams, D.C. and Otárola-Castillo, E. 2013 Geomorph: An r package for the collection and analysis of geometric morphometric shape data. Methods in Ecology and Evolution 4(4).

Felsenstein J (2004) Inferring phylogenies. Sunderland (MA): SinauerAssociates 391-392