Introduction to Movement Modeling

Preliminaries

Install these packages in R: {ctmm}, {curl}, {rlist}, {dplyr}.

Objectives

The objective of this module is to discuss the function and application of animal movement models, specifically, Contiuous-Time Movement Models (ctmm). We will explore functions from {ctmm} to calculate and plot hypothetical movement models with initial estimates, fit such models to actual data, and assess fits before applying results to further statistical analysis (i.e. range and overlap estimation).

Background

A guy walks into a…. ctmm?

Random Walk Models

You are at the bar, and you just got you and your friends a third round of drinks. You turn around and the dance floor is packed like sardines, and your friends are at the very front next to the DJ booth. How do you get from where you are to where you’re going? Maybe you take a step to the left, two steps forward, another back and to the right. Each step you take is independent from the last, as the dance floor is moving, grooving and no one is staying still. This goes on (and on and on) until you have made your way back to your friends.

Some movement ecologists would consider this navigation an example of a random walk model. Random walk models assume that in a time period, an individual will take random and independent steps that are identically distributed in size and away from their previous position (Nau 2014). However, these steps do depend on the location of your previous step, with each step the starting point of your next.. that is - unless you’ve figured out teleportation.

Correlated Random Walk Models

Now, imagine it’s the morning after your night out and you decide to run off the yucky post-drinking feelings. After lacing up your tennis shoes, you hit the esplanade and begin running your usual route along the Charles. Your strides are relatively the same direction, uniformly distributed, and full of nauseating regret (Codling et al. 2018).

We can consider this directionally persistent example a correlated random walk model (CRW). This means that there is a directional bias to your travel. This is a great model to use for animal movement, as animals tend to move in one direction- forward (Codling et al. 2018).

CRWs were considered the standard for modeling movement data for a long time. However, such models cannot handle multi-scale autocorrelation and disproportionately represent sampling schedule rather than underlying meaningful movement processes (Calabrese et al. 216). These shortcomings lead to inconsistent and biased results. Continuous-time stochastic process models (CTSP models) were created as an alternative to CRW models and resolve these issues.

Continuous-Time Stochastic Process

Okay. Now, you’ve made it home from your run and you want to see the data on your fitness tracker app. But, OH NO! You must have accidentally turned it on last night when you were leaving the bar. It tracked your uber ride home, when you got out of bed to get a glass of water, when you laid on the bathroom floor for a while, and then the run. That is a lot of movement data!

This, like any GPS monitoring system, is an example of continuous-time stochastic process (CTSP) modeling. That is, this is a model that accounts for finely sampled and continuous modern data collection (Calabrese et al. 2016). The discrete-time correlated random walk, or CRW, had been the standard tool utilized for the statistical analysis of this data. Yet, this methodology is not precise and often confounds the movement process thus producing different results even when the exact same path is being sampled. This is because CRW reflects the sampling schedule more than it does the actual process of movement (Calabrese et al. 2016).

CTSP models, on the other hand, consider both sampling schedule and continuous autocorrelated movement. These models are able to accommodate a range of data characteristics including irregular sampling schedules and varying levels of autocorrelation. However, CTSP modeling can be statistically complex and movement ecologists historically lacked a comprehensive software to process such complex modeling.

Fortunately, the continuous-time movement modeling {ctmm} package for R was created (Fleming & Calabrese 2015). ctmm allows a broad array of movement related data analyses including diagnostic data visualization and model fitting. These parameters can be used to analyze various metrics related to animal movement. In this module, we will examine evidence of range residency, calculate home range size with confidence intervals, and visualize the overlap between the home ranges of two coyotes.

The {ctmm} package was designed to easily accommodate movement data from Movebank, a free online tool/repository for storing and downloading animal movement data. Movebank is an especially useful tool for cross species comparisons, as well as comparisons over time, and climate/ecological change (Kranstauber et al. 2011). You can explore the website and various publicly available datasets for yourself here!

Spatial Autocorrelation

‘Everything is related to everything else, but near things are more related than distant things’ - Waldo R. Tobler

As we have discussed in class previously, statistical analyses often rest on the assumption that data are independent and identically/normally distributed. However, we have also explored how such discrete independence is often not the case (like with our introduction to mixed effect modeling). Investigating autocorrelation, the similarity of near observations, can reveal patterns and relationships that can better inform the selection of appropriate statistical analyses.

Spatial data have a strong tendency to be dependent, meaning near values are more/less similar than expected compared to randomly associated pairs of observations. When dealing with spatial data, like the animal movement data we are exploring here, spatial autocorrelation is a critical concept that drives model fitting and analysis.

We will continue to explore this concept in greater detail as we dive deeper into the components of ctmm modeling which help ensure our spatial models are statistically meaningful.

Conservation Implications

Understanding animal movement is crucial when studying population ecology, disease spread, the flow of genetic material, and conservation among other things (Calabrese et al., 2016). In so far as conservation goes, this kind of data allows us to understand how animals are using their environments, and what can be considered their home ranges. With this information, we can designate certain areas as protected, create corridors and encourage reforestation of degraded areas.

Movement Data Analysis

Recently there has been a rapid development and advancement of the technology associated with movement ecology (Calabrese et al., 2016). As shown in the background literature, advancements in movement modelling have allowed for more accurate home range estimates and movement analyses that account for both the time dimension and autocorrelation in global position system (GPS) data. Here we explore the functions of the R package {ctmm} (Continuous-Time Movement Modeling) to model the movement of coyotes (Canis latrans) from publicly available data shared via Movebank (Mahoney & Young 2017). This data was collected via Lotek GPS collars (Model GPS3300S; Lotek, Newmarket, ON, Canada) on 8 coyotes in Idaho between 2004-2015. If you would like to also play around with this massive data set, take a look at the data repository here. With the coyote data, we will walk through each of the steps outlined below to create an appropriate and statistically significant movement model to explore measures of home range and range overlap.

Modeling Outline

Data configuration and visualization
Calculation and visualization of empirical variograms (or periodograms) to reveal patterns of movement and assess potential model types
Identify suitable model types
Fit selected models to the data via maximum likelihood and compare via AIC
Select AIC-best model for more detailed movement analyses (ie. home range estimations, time-series occurrence distributions/trajectories, encounters and overlap, etc.)

Data Configuration

We will begin by loading in a subset of the Mahoney & Young 2017 data. For the purpose of this module, we are only using data from ~ 1 month of the study (January 2005). The GPS collars used in the study recorded data every five minutes for over ten years and no one likes code that takes days to run :’)

Let’s first load our packages and then use {curl} to save the Movebank GPS data into the new object bigdata

library(ctmm)
library(curl)

## Using libcurl 7.64.1 with LibreSSL/2.8.3

f <- curl("https://raw.githubusercontent.com/fuzzyatelin/fuzzyatelin.github.io/master/bioanth-stats/module-F21-Group3/thisisit.csv")
bigdata <- read.csv(f, header = TRUE, sep = ",", stringsAsFactors = FALSE)

The {ctmm} package was designed explicitly to go hand in hand with Movebank. For any ctmm function to operate, data must be in Movebank format and coerced into a telemetry object.

Telemetry objects in {ctmm} contain information about the movement being analyzed including GPS location and sampling schedule. Objects from the {move} package in R can also be utilized in ctmm by using the as.telemetry function. Once in the proper format, movement data can be manipulated and visualized for data characterization and analysis.

coyotes <- as.telemetry(bigdata)

## Minimum sampling interval of 4 minutes in IdCoy_P1_1

## Minimum sampling interval of 4 minutes in IdCoy_P1_2

## Minimum sampling interval of 4 minutes in IdCoy_P2_1

## Minimum sampling interval of 4 minutes in IdCoy_P2_3

## Minimum sampling interval of 1 minutes in IdCoy_P2_6

## Minimum sampling interval of 4 minutes in IdCoy_P3_1

## Minimum sampling interval of 3 minutes in IdCoy_P4_2

## Minimum sampling interval of 4 minutes in IdCoy_P5_2

# Changing the names of the coyote IDs to make coding easier!
names(coyotes)<-c('coyA','coyB','coyC','coyD','coyE','coyF','coyG','coyH')

Data Visualization

As with any new data - it is important to explore!

Plotting and visualizing raw movement tracks is a great way to assess your data as a whole. In initial visualization you can detect obvious migratory behaviors, directional patterns, or perhaps even erroneous data points that require attention. This step can also aid in data selection for modeling.

Let’s visualize…

par(mfrow = c(1, 1))
plot(coyotes, col = rainbow(length(coyotes)))

## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.

title("All Coyotes")

From this plot, we can see pretty distinct ranges for the 8 individuals and there do not appear to be any unnatural patterns or random points that could skew future analysis.

Though you can use ctmm models to analyze cumulative data from an entire population, for the sake of time, we are going to build our model using the data for a single coyote, coyG.

Let’s begin by pulling out all of the movement data for this individual to create and then plot the object coy1

coy1 <- coyotes$coyG
plot(coy1)

## DOP values missing. Assuming DOP=1.

We can see from the fairly tight clustering of GPS points, that this coyote appears to be range resident with no obvious migrations. The next step of our workflow is to calculate and plot variograms from the GPS data for this individual.

Variograms

Variograms are plots of the semi-variance between positions with varying time-lags (ie. hours or days between observations), providing an unbiased way to visualize the autocorrelation structure of our data. The structure and ‘behavior’ of a variogram reveal critical information on the movement of an animal, as well as what method of modeling will fit the data best.

Arguably, the most crucial element of a variogram is if the semi-variance reaches an asymptote. This indicates that an animal is range-resident, which allows our range calculation to be meaningful.

The code below employs the function variogram() to calculate and then plot the empirical variogram of telemetry object, coy1.

vg.coy1<-variogram(coy1)

#plotting long/short lag variograms
par(mfrow = c(1, 2))
plot(vg.coy1)
title("Long Lag")
plot(vg.coy1,fraction=0.005)
title("Short Lag")

The asymptote of the variogram for coy1 at longer (day) lags supports our visual analysis that coy1 is range-resident, while the slight upward curve of the short (minutes) lags show evidence of directional persistence in the data.

Variograms are used to determine the initial parameter guesstimates which are then utilized in maximum likelihood estimation, which we will describe in more detail later on. Behavior of variograms near zero, whether linearity or upward curvature, can help inform us whether models with uncorrelated (BM, OU) or correlated velocities (IOU, OUF) will be most appropriate for the data at hand. Given the zoomed in figure on the right, ‘Short Lag’, do the velocities of coy1 appear to be uncorrelated or not?

If you’re unsure, fear not! - we will explain each of these typical movement model types in more detail later on.

CHALLENGE 1

Follow the steps outlined above to calculate and plot the variograms for a different individual, coyH.

What do the variograms for this coyote suggest?
How can we interpret the overall picture of the first variogram?
What does a more zoomed in variogram show us, anything about velocity?

coy2 <- coyotes$coyH
plot(coy2)

## DOP values missing. Assuming DOP=1.

vg.coy2<-variogram(coy2)
par(mfrow = c(1, 2))
plot(vg.coy2)
plot(vg.coy2,fraction=0.005)

Initial Parameter Estimates

Our next step in movement modeling involves visually fitting a movement model to our variogram in order to get initial parameter guesstimates. The functions variogram.fit() and ctmm.guess() integrate the theoretical analogue of the variogram, semi-variance function (SVF), in order to generate the starting parameters of our model. These estimated parameters will later be passed to ctmm.select or ctmm.fit to better assess the model.

If run in your console, variogram.fit() allows you to visually assess and adjust parameters via the manipulate gear in top left corner of the plot pane.

NOTE: Though you can save these parameters for later use in model fitting by selecting ‘save to GUESS’ from manipulate, the code below already includes the creation of object GUESS without needing to save the variogram.fit() parameters from the plot pane.

variogram.fit(vg.coy1)

## An object of class "ctmm"
## [[1]]
## [1] "stationary"
## 
## [[2]]
## An object of class "covm"
##          x        y
## x 344672.6      0.0
## y      0.0 344672.6
## Slot "par":
##    major    minor    angle 
## 344672.6 344672.6      0.0 
## 
## Slot "isotropic":
## [1] FALSE
## 
## 
## [[3]]
## gps NA 
##  FALSE 
## 
## [[4]]
## [1] FALSE
## 
## [[5]]
## [1] "x" "y"
## 
## [[6]]
## [1] FALSE
## 
## [[7]]
##   position   velocity 
## 12047.4182   440.8345 
## 
## [[8]]
## [1] 0
## 
## [[9]]
## [1] TRUE
## 
## Slot "info":
## $identity
## [1] "IdCoy_P4_2"
## 
## $timezone
## [1] "UTC"
## 
## $projection
## [1] "+proj=tpeqd +lat_1=43.7655737679396 +lon_1=-112.693432141095 +lat_2=43.7454794721036 +lon_2=-112.646927688557 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
## 
## $axes
## [1] "x" "y"

GUESS<-variogram.fit(vg.coy1)

We can also use ctmm.guess() and the variogram for coy1, vg.coy1, to estimate initial model parameters.

coy1_GUESS <- ctmm.guess(coy1,variogram = vg.coy1,interactive=FALSE)

These parameters, GUESS and coy1_GUESS, are critical in our ability to assess the most suitable model type for the data at hand. We will next explore how ctmm.select and ctmm.fit allow us to test and visualize potential model types.

Comparing Parameter Estimates

Let’s input the initial parameter estimates calculated from variogram.fit (GUESS) and ctmm.guess (coy1_GUESS) to compare the two methods.

#this takes ~10 mins to run

vg.fitted.mods <- ctmm.select(coy1,CTMM = GUESS,verbose = TRUE)  
guess.fitted.mods <- ctmm.select(coy1,CTMM = coy1_GUESS,verbose = TRUE)

#view the guesstimates

summary(vg.fitted.mods)

##                      ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic    0.00000   158.5202  77.25247
## OUF isotropic     13.87693   150.9085  78.74945
## OUf anisotropic 1410.24731     0.0000 559.37437
## OU anisotropic  2256.52201   180.5625  33.09277

summary(guess.fitted.mods)

##                      ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic    0.00000   158.5202  77.25247
## OUF isotropic     13.87693   150.9085  78.74946
## OUf anisotropic 1410.24731     0.0000 559.37438
## OU anisotropic  2256.52201   180.5625  33.09277

Both methods yield the same prototype model fittings. Thus - either are acceptable for determining initial parameter guesstimates.

Model Selection

In order to perform statistically meaningful spatial/movement analyses, we have to ensure that our data indicate range residency and is fit with an appropriate movement model. After visualizing the data and estimating initial model parameters we can now use these parameter estimates to create prototype models to identify suitable model types. There are many different possible models that we can assess and select for our data, the most common movement models found throughout the literature include: Brownian motion (BM) and multiple versions of Ornstein-Uhlenbeck (integrated OU, OU foraging, etc.). Independent identically distributed (IID) models are occasionally used and assume that observations are independent and random. However, animal movement is inherently correlated since one’s location in space is dependent on continuous movement (i.e. coyotes also can’t teleport). This means the IID model is not sufficient for our purposes despite it’s use in older kernel density estimation metrics.
Brownian motion - BM models assume random movement and diffusion across unlimited space. Such models are often only appropriate when data is not robust enough (spatially or temporally) to demonstrate range residency or velocity autocorrelation.
Ornstein-Uhlenbeck - OU models merge BM’s lack of autocorrelated velocities with an assumption of local fidelity - residency.
Integrated OU - IOU models differ from BM and OU in that they feature continuous velocities, but endless diffusion (the same lack of boundaries as seen in BM models) limits their ability to accurately characterize range residency.
Ornstein-Uhlenbeck-Foraging - OUF models are a hybrid of OU and IOU movement, featuring both continuous position and velocity as well as restricted space use. These models are ideal when data sampling is spatially detailed over a long enough period of time to reveal velocity autocorrelation and range residency.

These various model types are especially relevant when considering one’s sampling methodology. Many modern datasets, such as the coyote data used here, successfully combine long sampling periods with high sampling resolution to increase model accuracy and analytical application. Given the extensive data used here, we can expect that our best fit movement model might be the OUF model.

Prototype Model

We briefly looked at the prototype models fit above with our initial parameters to compare the estimations from ctmm.guess and variogram.fit. By fitting our initial parameter guesses with ctmm.select we can identify the most suitable model types for our data.

Now that we know more about the different types of movement models, let’s return to the results from ctmm.select and consider the summary values in greater depth.

#view the prototype model

summary(guess.fitted.mods)

##                      ΔAICc ΔRMSPE (m) DOF[area]
## OUF anisotropic    0.00000   158.5202  77.25247
## OUF isotropic     13.87693   150.9085  78.74946
## OUf anisotropic 1410.24731     0.0000 559.37438
## OU anisotropic  2256.52201   180.5625  33.09277

With our initial parameter estimates we can asses their fit to different model types with the function ctmm.select as shown above. The summary of these initial fits reveal the anisotropic OUF model has the lowest AIC values (which we will get more into later). Isotropic and anisotropic respectively refer to circular or elliptical covariance in the data. Essentially, if an animal’s spatial variance is a function of both distance and direction, the movement is considered anisotropic. If spatial variance is only a function of distance, the movement is isotropic. We can see in the initial models above that the anisotropic versions of each model type are favored over their isotropic counterparts, thus we must select a model that considers movement variance in both distance and direction.

Given the results of these prototype models, we can now visually examine the fits and decide between the anisotropic OU and OUF models.

NOTE: ctmm fits the assigned parameters to movement models within a nested hierarchy. Thus, only models that appear credible and applicable will be attempted (this can be adjusted with the addition of the argument ‘level =’). Read this detailed article by Fleming et al. 2017 that describes the filter responsible.

Assessing Model Types

Let’s first extract out the fitted anisotropic versions of the OU and OUF models.

NOTE: as shown above, vg.fitted.mods could also be used to pull out the model types

OU<-guess.fitted.mods[[4]]
OUF<-guess.fitted.mods[[1]]

We can now use the function ctmm.fit to fit the selected OU and OUF models to our data. This function generates values for compariing the fit of each model including: point estimates, confidence intervals, and AICc. Based on this output, we can select the most appropriate model with greater confidence.

M.OU <- ctmm.fit(coy1,OU) #these fits may take ~2 mins
M.OUF <- ctmm.fit(coy1,OUF)

FITS <- list(OU=M.OU,OUF=M.OUF)
summary(FITS)

##        ΔAICc ΔRMSPE (m) DOF[area]
## OUF    0.000    0.00000  77.25247
## OU  2256.522   22.04226  33.09277

To refresh from earlier in the semester, AICc is the (linearly) corrected Akaike Information Criteria. AIC balances likelihood against model complexity in a way that is critical if we want to make optimal predictions. Recall from Module 16 - Model Selection in General Linear Regression that AIC helps us to determine which model has the best fit. This metric is typically calculated via -2(log-likelihood) + 2K with the log-likelihood being a measure of a model’s fit and K reflecting the number of parameters considered within the model. In addition to assessing fit, AICc typically penalizes model complexity to encourage the removal of unnecessary predictor variables, thus improving model fit. While AIC does not tell us how good a model is on its own, it is useful in comparisons of different models like these to assess fit. Given this information, OUF would be the better model for our data.

Now, we will plot each of the methods over the original data variogram to see which model best fits our data. By doing this, we get a good idea of which model is operating efficiently over both long and short time lags.

par(mfrow = c(2, 2))
plot(vg.coy1, CTMM=OU, col.CTMM = 'red')
title("OU (long)")
plot(vg.coy1, CTMM=OU, col.CTMM = 'red', fraction = 0.005)
title("OU (short)")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77')
title("OUF (long)")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF (short)")

We can see from the figures above that OUF is a better fitted model (though not perfect!). The data we are using here are a subset of a much larger body of data. We reduced the number of observations so that the ctmm models would run in a reasonable amount of time, but it is likely that the OUF model would look even better fitted if more observation points were included. Additionally, even though the OUF model is not perfect in the shorter time lag plot, the overall trend over multiple days (longer time lag) is adequate and most important for assessing home range.

CHALLENGE 2

Follow the steps outlined above to calculate starting parameters and initial model selection for coy2 using ctmm.select

When considering your coy2 plot and fit variogram from earlier, how does the visualization of coy1 movement differ?
What do you think may contribute to differences in both parameter estimates?
Which model appears to fit the data for coy2 best?

variogram.fit(vg.coy2)

## An object of class "ctmm"
## [[1]]
## [1] "stationary"
## 
## [[2]]
## An object of class "covm"
##         x       y
## x 1907134       0
## y       0 1907134
## Slot "par":
##   major   minor   angle 
## 1907134 1907134       0 
## 
## Slot "isotropic":
## [1] FALSE
## 
## 
## [[3]]
## gps NA 
##  FALSE 
## 
## [[4]]
## [1] FALSE
## 
## [[5]]
## [1] "x" "y"
## 
## [[6]]
## [1] FALSE
## 
## [[7]]
##   position   velocity 
## 28270.8795   632.9355 
## 
## [[8]]
## [1] 0
## 
## [[9]]
## [1] TRUE
## 
## Slot "info":
## $identity
## [1] "IdCoy_P5_2"
## 
## $timezone
## [1] "UTC"
## 
## $projection
## [1] "+proj=tpeqd +lat_1=43.7655737679396 +lon_1=-112.693432141095 +lat_2=43.7454794721036 +lon_2=-112.646927688557 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
## 
## $axes
## [1] "x" "y"

coy2_GUESS <- ctmm.guess(coy2,variogram = vg.coy2,interactive=FALSE)
coy2_SELECT <- ctmm.select(coy2,CTMM = coy2_GUESS,verbose = TRUE)
summary(coy2_SELECT)

##                      ΔAICc ΔRMSPE (m)  DOF[area]
## OUF anisotropic    0.00000   528.9650  21.596096
## OUF isotropic     18.93578   515.4893  21.850218
## OU anisotropic  2089.00336   626.6339   9.849115
## OUf anisotropic 2254.68759     0.0000 413.798019

Model Fitting

Our previous steps revealed the OUF model is best suited for our data. We can now use ctmm.fit to fit our model via maximum likelihood. This function processes a telemetry object and model specification (i.e. OUF, OU, IID, etc.), returning the maximum likelihood parameter and interval estimates. The objects returned by ctmm.fit will allow us to compare the fitted model to the data.

this can take ~2 mins to run

coy1_FIT <- ctmm.fit(coy1, CTMM = OUF, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)

## Maximizing likelihood.

## Calculating Hessian.

## Calculating REML gradient.

## Calculating REML Hessian.

summary(coy1_FIT)

## $name
## [1] "OUF anisotropic"
## 
## $DOF
##       mean       area      speed 
##   41.39732   77.25247 4258.41374 
## 
## $CI
##                                low       est      high
## area (square kilometers)  5.226104  6.619442  8.174943
## τ[position] (hours)       3.384463  4.390026  5.694353
## τ[velocity] (minutes)     3.704709  3.913170  4.133361
## speed (kilometers/day)   37.089802 37.655334 38.220749

NOTE: CTMM = OUF based on our previous figures/calculations which show that is the best fit model compared to the OU model. Additionally, the argument trace = TRUE provides a progress report in the console window as ctmm.fit runs.

par(mfrow = c(2,2))
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","#1b1e77"),fraction=0.65,level=0.5)
title("OU v OUF (zoomed out)")
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","#1b1e77"),fraction=0.05,level=0.5)
title("OU v OUF (zoomed in)")
plot(vg.coy1,CTMM=coy1_FIT,col.CTMM=c("#1b1e77"),fraction=0.65,level=0.5)
title("Final Model (zoomed out)")
plot(vg.coy1,CTMM=coy1_FIT,col.CTMM=c("#1b1e77"),fraction=0.05,level=0.5)
title("Final Model (zoomed in)")

Finally - we plot the final fit of OUF to ensure the selected model explains the most significant features of the coyotes’s movement, as seen in the structure of the variogram we created earlier. In the figures above, we see the difference between the OU (red) and OUF (dark blue) models when plotted against the variogram. The zoomed in figure shows the dark blue line, the OUF model, more closely follows the curve of the variogram compared to the OU model. The bottom two figures show the final OUF movement model fit to our data. We can now use this movement model to statistically and graphically assess home range and other movement patterns of coy1.

CHALLENGE 3

Follow the steps outlined above to plot and fit the most appropriate model type for coy2.

NOTE: The process of individually pulling out and comparing the individual model types from our initial parameters is not always necessary if you can confidently estimate the best model for your data with initial visualization and variogram analysis. It is much faster to simply run ctmm.guess and plug those parameter estimates directly into ctmm.fit without running through the nested hierarchy of models from ctmm.select. Feel free to consider this while working through the data for coy2.

OU2<-coy2_SELECT[[4]]
OUF2<-coy2_SELECT[[1]]

M.OU2 <- ctmm.fit(coy2,OU2)
M.OUF2 <- ctmm.fit(coy2,OUF2)


FITS2 <- list(OU=M.OU2,OUF=M.OUF2)
summary(FITS2)

##        ΔAICc ΔRMSPE (m) DOF[area]
## OUF    0.000    528.965  21.59609
## OU  2254.688      0.000 413.79802

par(mfrow = c(2, 2))
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77')
title("OU")
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77', fraction = 0.005)
title("OU")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77')
title("OUF")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF")

coy2_FIT <- ctmm.fit(coy2, CTMM = M.OUF2, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)

## Maximizing likelihood.

## Calculating Hessian.

## Calculating REML gradient.

## Calculating REML Hessian.

summary(coy2_FIT)

## $name
## [1] "OUF anisotropic"
## 
## $DOF
##       mean       area      speed 
##   12.36825   21.59589 4280.15534 
## 
## $CI
##                                low      est      high
## area (square kilometers) 23.791750 38.14923 55.841680
## τ[position] (hours)       9.070864 15.76739 27.407614
## τ[velocity] (minutes)     3.471855  3.66519  3.869292
## speed (kilometers/day)   48.562337 49.30089 50.039283

par(mfrow = c(2,2))
plot(vg.coy2,CTMM=FITS2,col.CTMM=c("#1b9e77","#1b1e77"),fraction=0.65,level=0.5)
title("OU v OUF (zoomed out)")
plot(vg.coy2,CTMM=FITS2,col.CTMM=c("#1b9e77","#1b1e77"),fraction=0.05,level=0.5)
title("OU v OUF (zoomed in)")
plot(vg.coy2,CTMM=coy2_FIT,col.CTMM=c("#1b1e77"),fraction=0.65,level=0.5)
title("Final Model (zoomed out)")
plot(vg.coy2,CTMM=coy2_FIT,col.CTMM=c("#1b1e77"),fraction=0.05,level=0.5)
title("Final Model (zoomed in)")

Kernel Density Estimation

After confirming our fitted continuous-time movement model, we use this as the input in calculating Autocorrelated Kernel Density Estimates, AKDE. We use the AKDE estimator to produce ecologically relevant metrics, such as home range size, range overlap, and occurrence distributions. It is essential to select the best fitting movement model to produce accurate AKDE metric estimates.

Recall the methods we went over in Module 19 - Introduction to GIS and Spatial Analysis in R. Kernel density estimates (KDE) utilize the frequency of points to calculate the density of space usage areas utilized by an individual. However, kernel density estimation (KDE) utilizes a model that assumes independent and identically distributed data, or IID. As we have discussed, GPS data are inherently impacted by spatial autocorrelation, which results in the underestimation of home range area by KDE methods. Fleming et al. (2017) lay out a new methodology of kernel density estimation, AKDE, which removes previous biases and cooperates with less than ideal sample sizes. Before calculating the autocorrelated density estimates (AKDE) for the coyotes we are working with here, it is worthwhile to understand exactly what is happening “under the hood” statistically.

Conventional Kernel Density

More specifically, kernel density estimation works by using the movement data to create small kernels, which are a weighted function in non-parametric statistics that define the distribution of similar points around a given point x, with k(x,y) representing the similarity of point x given another point y. This bandwidth, or covariance, is denoted as σ_B. The average of these kernels are then used to estimate p̂ of the probability density function (PDF) p. KDE are best calculated with the optimal bandwidth, as this reduces the mean integrated square error (MISE) between p and p̂. Fleming et al. (2015) estimate MISE in order to achieve optimal bandwidth as follows:

Here, σ₀ represents the covariance of locations and σ_B, as discussed, represents the bandwidth.

Autocorrelated Kernel Density

For uncorrelated data, this method works best at estimating the MISE, but further changes must be made to better estimate error with autocorrelated data. To do this, Fleming et al.’s (2015) new estimator uses the same steps but relaxes the assumption of independently and identically distributed data (IID). Here, the minimum occurs at the optimal bandwidth. This new estimator for MISE is defined as follows:

Here γ(τ) is the semi-variance function (SVF), and n(τ) is the number of location pairs with time lag t between them. This function achieves optimal performance as autocorrelation increases, but operates efficiently for uncorrelated data as well by converging with the previously discussed function. Now we can see why we put so much work into analyzing autocorrelation via variograms! All of our previous model fitting is used to calculate the AKDE here, which incorporates autocorrelation to calculate a much more accurate kernel density estimation that reduces much of the problematic biases otherwise present.

Data Configuration

Before being able to run akde(), we need to follow the steps below to combine our model parameters and final fits for coy1 and coy2

both.coyotes <- coyotes[c("coyG","coyH")]

#variograms

vg.coyotes <- list(vg.coy1, vg.coy2)

coyotes_GUESS <- list(coy1_GUESS, coy2_GUESS)
coyotes_FITS <- list(coy1_FIT, coy2_FIT) 
names(coyotes_FITS) <- names(both.coyotes[1:2])

par(mfrow = c(2, 2))
plot(vg.coyotes[[1]],CTMM=coyotes_FITS[[1]],col.CTMM=c("purple"),fraction=0.65,level=0.5)
title('Final Model - coy1')
plot(vg.coyotes[[1]],CTMM=coyotes_FITS[[1]],col.CTMM=c("purple"),fraction=0.05,level=0.5)
title('Final Model - coy1')
plot(vg.coyotes[[2]],CTMM=coyotes_FITS[[2]],col.CTMM=c("blue"),fraction=0.65,level=0.5)
title('Final Model - coy2')
plot(vg.coyotes[[2]],CTMM=coyotes_FITS[[2]],col.CTMM=c("blue"),fraction=0.05,level=0.5)
title('Final Model - coy2')

Before we move on with our analyses: what observations can you make regarding the final movement models selected and plot above? What can we make of the differences between the fit of these models?

Coyote Home Ranges

All of your hard work selecting and fitting a movement model for coy2 can now be used in conjunction with our model for coy1 in order to explore relationships between their movement!

In order to draw any comparisons or conclusions about the movement of either coyote, we must first estimate the home range of each individual. As outlined in detail above, we can determine individual home ranges by using our movement models and calculating the autocorrelated kernel density estimates of coy1 and coy2. With the integrated akde() function from the {ctmm} package, we can input telemetry data and a corresponding ctmm model to estimate a home range, which is returned as a Utilization Distribution (UD) object that can be analyzed with both plot() and summary().
>NOTE: The code below uses the combined ctmm modeels/telemetry data for each individual for the sake of time and sanity. It is more common that model selection and fitting for range overlap/analysis of two or more individuals is executed simultaneously with a series of for loops.

UDS_coyotes <- akde(both.coyotes[1:2],coyotes_FITS)

UDS_coy1 <- summary(UDS_coyotes[[1]])
UDS_coy2 <- summary(UDS_coyotes[[2]])
UDS_summary <- list(coy1 = UDS_coy1, coy2 = UDS_coy2)
UDS_summary

## $coy1
## $coy1$DOF
##      area bandwidth 
##  77.25247 199.63497 
## 
## $coy1$CI
##                               low      est     high
## area (square kilometers) 4.823147 6.109051 7.544616
## 
## 
## $coy2
## $coy2$DOF
##      area bandwidth 
##  21.59589  36.11156 
## 
## $coy2$CI
##                               low      est    high
## area (square kilometers) 21.91208 35.13525 51.4299

par(mfrow = c(1, 2))
plot(UDS_coyotes[[1]])
title('coy1')
plot(UDS_coyotes[[2]])
title('coy2')

Plotting the results of akde() reveals 95% home range contour and the 95% confidence intervals around the contour of each individual. The results summary provide point estimates and corresponding 95% confidence intervals.

Coyote Overlap & CDE

There is more to understanding an animal’s movement and habitat use than just physical size or resource availability. Movement ecologists and animal behaviorists are also fascinated by the relationships between individuals, populations, and species. Such spatial relationships can be better understood by analyzing overlapping space us. Quantifying overlap can provide insight for more specific hypothesis testing related to inter/intraspecific competition, territoriality, and even mating/social systems.

The developers of the {ctmm} package continue to add on features/functions that increase the accessibility of complex animal movement analysis. This means that once you fit a ctmm movement model with your data and estimated home range (using the akde() function), assessing overlap and spatial distribution is extremely easy!

overlap(UDS_coyotes)

## , , low
## 
##            coyG       coyH
## coyG 1.00000000 0.06704101
## coyH 0.06704101 1.00000000
## 
## , , est
## 
##           coyG      coyH
## coyG 1.0000000 0.1564094
## coyH 0.1564094 1.0000000
## 
## , , high
## 
##           coyG      coyH
## coyG 1.0000000 0.3116926
## coyH 0.3116926 1.0000000

The function overlap() calculates a measure of similarity between two stationary distributions via the Bhattacharyya coefficient, which is essentially the ratio of the area where both distributions intersect to the average individual area. When input with UD objects (the objects created when estimating home range), the function returns the overlap of the autocorrelated kernel density estimates with 95% confidence intervals (listed in the table above as ‘, , low/high’)

The values of Bhattacharyya’s affinity (BA) range from zero to 1: with zero indicating no overlap and 1 indicating identical Utilization Distributions. The results from overlap() show there is overlap in the ranges of coy1 and coy2.

Let’s now visualize these results by plotting the calculated AKDEs.

plot(both.coyotes, UD = UDS_coyotes, col=rainbow(length(both.coyotes)))

## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.

This plot allows us to visualize the AKDE utilization distributions (est. home ranges) of both coy1 and coy2. The middle contour represents the maximum likelihood area where an individual spends 95% of its time. This figure shows that the individuals have dramatically different sized ranges and overlap very little for being pack animals. From consulting the study this data was retrieved from, we can confirm the slight overlap of coy1 (in blue) and coy2 (in red) is likely due to the fact that they are from different packs (Mahoney & Young 2017). We can see how these two packs separate themselves by the little overlap and clear separation of their space usage.

Another dimension of movement and interaction we can analyze is the Conditional Distribution of Encounters (CDE). The function encounter() allows us to determine the long-term encounter probabilities for movement within individual home ranges. This translates to calculating where coy1 and coy2 are most likely to interact.

CDE_coyotes <- encounter(UDS_coyotes)

plot(both.coyotes,
     col=c("#1b9e77","#1b1e77"),
     UD=CDE_coyotes,
     col.DF="#F2AD00",
     col.grid = NA)

## DOP values missing. Assuming DOP=1.
## DOP values missing. Assuming DOP=1.

Encounter estimations enable researchers to better understand individual interactions, as well as general population dynamics. CDE analysis is also extremely easy once you have properly fit a movement model and determined AKDE and overlap. For more detailed information about the components and application of CDE analysis, we recommend checking out this paper from Noonan et al. (2021).

References

Calabrese, J. M., Fleming, C. H., & Gurarie, E. (2016). ctmm: an r package for analyzing animal relocation data as a continuous-time stochastic process. Methods in Ecology and Evolution, 7(9), 1124–1132.

Codling, E. A., Plank, M. J., & Benhamou, S. (2008). Random walk models in biology. Journal of The Royal Society Interface, 5(25), 813–834. https://doi.org/10.1098/rsif.2008.0014

Fleming, C.H. & Calabrese, J.M. (2015). ctmm: Continuous-Time Movement Modeling.Rpackage version 0.3.0.

Fleming, C. H., Fagan, W. F., Mueller, T., Olson, K. A., Leimgruber, P., & Calabrese, J. M. (2015). Rigorous home range estimation with movement data: a new autocorrelated kernel density estimator. Ecology, 96(5), 1182-1188.

Fleming, C. H., & Calabrese, J. M. (2017). A new kernel density estimator for accurate home‐range and species‐range area estimation. Methods in Ecology and Evolution, 8(5), 571-579.

Fleming, C. H., Sheldon, D., Gurarie, E., Fagan, W. F., LaPoint, S., and Calabrese, J. M. (2017). Kálmán filters for continuous-time movement models. Ecological Informatics, 40: 8–21.

Kranstauber, B., Cameron, A., Weinzerl, R., Fountain, T., Tilak, S., Wikelski, M., & Kays, R. (2011). The Movebank data model for animal tracking. Environmental Modelling & Software, 26(6), 834-835.

Mrozewski, T. (2018). Movebank. Bulletin-Association of Canadian Map Libraries and Archives (ACMLA), (158), 24-27.

Nau, R. (2014, November 4). Notes on the random walk model - people.duke.edu. Notes on the random walk model. Retrieved 2021, from https://people.duke.edu/~rnau/Notes_on_the_random_walk_model--Robert_Nau.pdf

Noonan, M. J., Tucker, M. A., Fleming, C. H., …, and Calabrese, J. M. (2019). A comprehensive analysis of autocorrelation and bias in home range estimation. Ecological Monographs, 89(2), 2019, e01344.

Noonan, M. J., Martinez-Garcia, D., Davis, G. H., Crofoot, M. C., Kays, R., …, and Calabrese, J. M. (2021). Estimating encounter location distributions from animal tracking data. Methods in Ecology and Evolution, 12:1158–1173.

Winner, K., Noonan, M. J., Fleming, C. H., Olson, K. A., Mueller, T., Sheldon, D., and Calabrese, J. M. (2018). Statistical inference for home range overlap. Methods in Ecology and Evolution; 9:1679–1691.

Continuous-Time Movement Modeling

Zoe Albert, Natalia Kelley, Frank Short & Victoria Zdanowicz

Introduction to Movement Modeling

Preliminaries

Objectives

Background

A guy walks into a…. ctmm?

Random Walk Models

Correlated Random Walk Models

Continuous-Time Stochastic Process

Spatial Autocorrelation

Conservation Implications

Movement Data Analysis

Modeling Outline

Data Configuration

Data Visualization

Variograms

Initial Parameter Estimates

Comparing Parameter Estimates

Model Selection

Prototype Model

Assessing Model Types

Model Fitting

Kernel Density Estimation

Conventional Kernel Density

Autocorrelated Kernel Density

Data Configuration

Coyote Home Ranges

Coyote Overlap & CDE

References