Compare covariate models

Hypotheses with covariates

For simplicity, these exercises consider only a subset of possible models for our water deer dataset:

  • Distance to the coast affects density
  • Landcover influences detectability
  • Two detection functions: half-normal and hazard-rate

Notation for model names

We’ll use a specific notation for model names:

  • A full stop . for intercept-only models
  • An abbreviation such as ‘LC’ (landcover) for covariates
  • An underscore _ to separate detection and density covariates

Remember our two response variables

The first . or abbreviation in our model name is for detectability parameters, and the second is for density parameters - the same order as when you specify the model structure using tildes (~)

If you can’t remember, ask R to print out the model structure:

hn_NullT@call   # half-normal null model: density and detectability are constant across transects

distsamp(formula = ~1 ~ 1, data = TruncUMF, keyfun = "halfnorm", 
    output = "density", unitsOut = "ha")

Model: Null model

Now that we’ve added covariates, we can re-create the half-normal null model with our new UMF that includes the covariates:

hn._. <- distsamp(      # create half-normal null model with new naming convention
  ~1                    # detectability constant across transects
  ~1,                   # density constant across transects
  keyfun="halfnorm",    # use a half-normal detection function
  TruncUMF, output="density", unitsOut="ha")   # specify other analysis arguments
rm(hn_NullT)            # delete original null model, now obsolete

Model: Density varies with distance

Now create a second model which allows density to vary with distance from the coast:

hn._DTC <- distsamp(    # model name reflects hypothesis
  ~1                    # detectability constant across transects
  ~DistToCoast,         # density varies with distance from the coast
  keyfun="halfnorm",    # use a half-normal detection function
  TruncUMF, output="density", unitsOut="ha")   # specify other analysis arguments

Model: Landcover affects detection

Create a further model representing the hypothesis that density is constant over the study area, but detectability (the shape of the detection function) varies with landcover class:

hnLC_. <- distsamp(     # model name reflects hypothesis
  ~Landcover            # detectability varies between tall reedbeds and grassland
  ~1,                   # density constant across transects
  keyfun="halfnorm",    # use a half-normal detection function
  TruncUMF, output="density", unitsOut="ha")   # specify other analysis arguments

Model: Global model

Now create a global model, which combines all covariates included in the other models:

hnLC_DTC <- distsamp(   # model name reflects hypothesis
  ~Landcover            # detectability varies between tall reedbeds and grassland
  ~DistToCoast,         # density varies with distance from the coast
  keyfun="halfnorm",    # use a half-normal detection function
  TruncUMF, output="density", unitsOut="ha")   # specify other analysis arguments

Hypothesis in words

What is the hypothesis represented by this model?

Hazard-rate models

Can you create an identical set of models using the hazard-rate detection function?

Get help in R

How can you find out what the hazard-rate function is called in distsamp()?
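If you get stuck, the help page for distsamp() describes the available key functions. One possible set of hazard-rate models (a sketch, reusing the object and covariate names from above) looks like this:

?distsamp                                        # the keyfun argument lists the detection functions available, including "hazard"
haz._.    <- distsamp(~1 ~1, TruncUMF, keyfun="hazard", output="density", unitsOut="ha")
haz._DTC  <- distsamp(~1 ~DistToCoast, TruncUMF, keyfun="hazard", output="density", unitsOut="ha")
hazLC_.   <- distsamp(~Landcover ~1, TruncUMF, keyfun="hazard", output="density", unitsOut="ha")
hazLC_DTC <- distsamp(~Landcover ~DistToCoast, TruncUMF, keyfun="hazard", output="density", unitsOut="ha")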

This gives you eight models to compare, representing all combinations of our hypotheses:

  1. The shape of the detection function is best modelled by a half-normal curve versus a hazard-rate curve
  2. Detectability is influenced by landcover class
  3. Density of water deer depends on distance from the coast

Before we continue in R, let’s learn about model likelihoods and weights

Model likelihoods

The likelihood, or relative “odds”, of a model expresses how likely the model is in comparison with the best model in the set

The likelihood of a model, given the data, \( L(\hat{\theta}_i \mid data) \), can be calculated as:

\[ L(\hat{\theta}_i \mid data) \propto \exp\left(-\tfrac{1}{2} \Delta_i\right) \]

where \(\Delta_i\) is the model’s dAIC, and the \(-\frac{1}{2}\) cancels the \(-2\) introduced when defining AIC

Likelihood as a lottery

Imagine a lottery 🎱 :

  1. Each ticket has the same chance of winning the prize
  2. Better models (models that are closer to reality) can afford to buy more tickets
  3. The likelihood of a model measures the chance it has to “win” the lottery (be the best explanation of the patterns in your data)

If the best model can buy 100 tickets, a model with dAIC = 2 (likelihood = 0.368) can only buy 37 tickets, and so has about a third of the best model’s chance of being the best approximation to the real world

Models with higher dAIC have even fewer tickets, and less chance of winning the prize!

Models with dAIC >10-12 are uninformative - the likelihood declines below 1%

dAIC   Likelihood   Tickets
   0        1.000       100
   1        0.607        61
   2        0.368        37
   4        0.135        14
   6        0.050         5
   8        0.018         2
  12        0.002         0
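As a quick check, you can reproduce the relative likelihoods and “tickets” in the table above directly from the formula:

dAIC <- c(0, 1, 2, 4, 6, 8, 12)
rel_lik <- exp(-0.5 * dAIC)   # relative likelihood of each model
round(rel_lik, 3)             # 1.000 0.607 0.368 0.135 0.050 0.018 0.002
round(rel_lik * 100)          # "tickets": 100 61 37 14 5 2 0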

Model weights

A model’s AIC weight tells us the strength of evidence for that model

AIC weight = the probability that a model is the best model in our set

\[ w_i = \mathrm{Prob}\{ \hat{\theta}_i \mid data \} = \frac{\exp(- \frac{1}{2} \Delta_i)}{\sum_{r=1}^R \exp(- \frac{1}{2} \Delta_r)} \]

Model weights are useful because these simple numbers allow us to:

  1. Evaluate the strength of evidence for each model
  2. Determine the strength of evidence for each parameter
  3. Calculate model-averaged estimates of parameter values

Model weights illustrate our uncertainty in choosing between hypotheses. They suggest how much confidence you can place in a model, considering the other models you are testing
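As a minimal sketch of how the weights are calculated by hand (the AIC values here are made up purely for illustration):

AIC_values <- c(100.0, 101.5, 106.0)               # hypothetical AICs for three models
delta <- AIC_values - min(AIC_values)              # Delta AIC: 0.0 1.5 6.0
w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))    # AIC weights sum to 1 across the set
round(w, 3)                                        # approx. 0.657 0.310 0.033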

Compare models using Information Theory

Now we can gather together the evidence from Information Theory for all the models, and compute model selection statistics such as Delta AIC

As we did for our comparison of half-normal and uniform detection functions, let’s create a list object that contains all the models we want to compare:

models <- list(hn._. = hn._., hn._DTC = hn._DTC, 
    hnLC_. = hnLC_., hnLC_DTC = hnLC_DTC,
    haz._. = haz._., haz._DTC = haz._DTC,
    hazLC_. = hazLC_., hazLC_DTC = hazLC_DTC)
fmList <- fitList(fits = models)   # convert this standard R list into an unmarkedFitList

View the evidence

Use modSel() to conveniently summarise this larger set of models:

ModSelect <- modSel(fmList)   # compute Delta AIC values and other Information Theory statistics
ModSelect                     # view the AIC comparison table
          nPars    AIC delta   AICwt cumltvWt
hnLC_.        3 271.39  0.00 6.4e-01     0.64
hnLC_DTC      4 272.69  1.29 3.4e-01     0.98
hazLC_.       4 279.16  7.77 1.3e-02     0.99
hazLC_DTC     5 281.05  9.66 5.1e-03     1.00
haz._DTC      4 330.78 59.38 8.2e-14     1.00
hn._DTC       3 340.06 68.67 7.9e-16     1.00
haz._.        3 352.00 80.61 2.0e-18     1.00
hn._.         2 361.13 89.74 2.1e-20     1.00

Which models have support?

Given what you know about Delta AIC, which models have support from our field data?

modSel() output: nPars

In nPars, R reminds us of the number of parameters used to build each model

For example, the half-normal null model hn._. has two parameters:

  1. The standard deviation of the half-normal curve
  2. An intercept for density

Our global (most complex) model, hazLC_DTC, contains:

  1. The scale of the hazard-rate curve
  2. An intercept plus a coefficient for the effect of landcover (tall reedbeds versus grassland) on the shape of the hazard-rate detection function, and
  3. An intercept plus a coefficient for the effect of ‘Distance to coast’ on density
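If you want to see these parameters for yourself, coef() and summary() work on fitted unmarked models (a sketch; the exact coefficient names will depend on your data):

coef(hazLC_DTC)      # named vector of all estimated parameters (density, detection and hazard scale)
summary(hazLC_DTC)   # the same estimates with standard errors, grouped by submodel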

Interpret modSel() output

After nPars, R displays:

  • AIC values, sorting the table from lowest to highest AIC
  • \(\Delta AIC\) - zero indicates the best model; higher values indicate less support
  • AICwt - models with higher weights are better supported by field data; low weights indicate lack of evidence
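If you want to use these statistics outside R (for example to report them), the model selection object can be coerced to a data frame, as illustrated in unmarked’s documentation; the file name below is just an example:

SelectionTable <- as(ModSelect, "data.frame")   # model selection statistics as a data frame
write.csv(SelectionTable, "model_selection.csv", row.names = FALSE)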

What affects detectability and density?

What can we conclude about the factors that affect detectability and density in this water deer survey?

What can we conclude?

Our best-supported model is hnLC_., based on the lowest AIC (so its Delta AIC is zero) and a large AIC weight

However, the second model hnLC_DTC is also a strong competitor, with some evidence in its favour: a low Delta AIC (1.29), and adding it increases the cumulative AIC weight to 98%

Conclusion from these models

There is strong evidence for Landcover class affecting detectability

Distance to coast may influence water deer density, but the evidence is equivocal and needs further investigation