For simplicity, these exercises consider only a subset of possible models for our water deer dataset:
We’ll use a specific notation for model names:
- a dot (.) for intercept-only models
- an underscore (_) to separate detection and density covariates (remember our two response variables)

The first dot or abbreviation in a model name refers to the detectability parameters, and the second to the density parameters - the same order as when you specify the model structure using tildes (~). For example, hnLC_DTC denotes a half-normal model in which detectability varies with Landcover and density varies with DistToCoast.
If you can’t remember, ask R to print out the model structure:
hn_NullT@call
distsamp(formula = ~1 ~ 1, data = TruncUMF, keyfun = "halfnorm",
    output = "density", unitsOut = "ha")
Now that we’ve added covariates, we can re-create the half-normal null model (density and detectability constant across transects) with our new UMF that includes the covariates:
hn._. <- distsamp(
  ~1
  ~1,
  keyfun = "halfnorm",
  TruncUMF, output = "density", unitsOut = "ha")

rm(hn_NullT)
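If you want to confirm that the new object has the same model structure, you can print its call just as we did for hn_NullT (a quick optional check):

hn._.@call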
Now create a second model which allows density to vary with distance from the coast:
hn._DTC <- distsamp(
  ~1
  ~DistToCoast,
  keyfun = "halfnorm",
  TruncUMF, output = "density", unitsOut = "ha")
Create a further model representing the hypothesis that density is constant over the study area, but detectability (the shape of the detection function) varies with landcover class:
hnLC_. <- distsamp(
  ~Landcover
  ~1,
  keyfun = "halfnorm",
  TruncUMF, output = "density", unitsOut = "ha")
Now create a global model, which combines all the covariates included in the other models:
hnLC_DTC <- distsamp(
  ~Landcover
  ~DistToCoast,
  keyfun = "halfnorm",
  TruncUMF, output = "density", unitsOut = "ha")
❓ Hypothesis in words
What is the hypothesis represented by this model?
Can you create an identical set of models using the hazard-rate detection function?
❓ Get help in R
How can you find out what the hazard-rate function is called in distsamp()?
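If you want to check your answers against a worked version, here is one possible set of calls; it assumes the hazard-rate key function is the one named "hazard" in distsamp() (the help page lists the available key functions):

?distsamp   # check the keyfun argument for the available key functions

haz._. <- distsamp(~1 ~1, keyfun = "hazard",
                   TruncUMF, output = "density", unitsOut = "ha")
haz._DTC <- distsamp(~1 ~DistToCoast, keyfun = "hazard",
                     TruncUMF, output = "density", unitsOut = "ha")
hazLC_. <- distsamp(~Landcover ~1, keyfun = "hazard",
                    TruncUMF, output = "density", unitsOut = "ha")
hazLC_DTC <- distsamp(~Landcover ~DistToCoast, keyfun = "hazard",
                      TruncUMF, output = "density", unitsOut = "ha")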
This gives you eight models to compare, representing all combinations of our hypotheses: hn._., hn._DTC, hnLC_., hnLC_DTC, haz._., haz._DTC, hazLC_. and hazLC_DTC
Before we continue in R, let’s learn about model likelihoods and weights
The likelihood, or “odds”, of a model expresses how likely the model is in comparison with the best model
The likelihood of a model given the data, \(L(\hat{\theta} \mid data)\), can be calculated as:
\[ L(\hat{\theta} \mid data) \propto \exp\left(-\tfrac{1}{2}\, dAIC\right) \]
where the \(-\frac{1}{2}\) cancels the \(-2\) introduced in the definition of AIC
Imagine a lottery 🎱 :
If the best model can buy 100 tickets, a model with dAIC = 2 (likelihood = 0.368) can only buy 37 tickets, so it is roughly a third as likely as the best model to be the best approximation to the real world
Models with higher dAIC have even fewer tickets, and less chance of winning the prize!
Models with dAIC >10-12 are uninformative - the likelihood declines below 1%
| dAIC | Likelihood | Tickets |
|------|------------|---------|
| 0    | 1          | 100     |
| 1    | 0.607      | 61      |
| 2    | 0.368      | 37      |
| 4    | 0.135      | 14      |
| 6    | 0.050      | 5       |
| 8    | 0.018      | 2       |
| 12   | 0.002      | 0       |
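The likelihood column is simply exp(-dAIC/2), so you can reproduce the table in R:

dAIC <- c(0, 1, 2, 4, 6, 8, 12)
round(exp(-0.5 * dAIC), 3)   # 1.000 0.607 0.368 0.135 0.050 0.018 0.002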
A model’s AIC weight tells us the strength of evidence for that model
AIC weight = the probability that a model is the best model in our set
\[ w_i = Prob\{\hat{\theta}_i \mid data\} = \frac{\exp(-\frac{1}{2} \Delta_i)}{\sum_{r=1}^{R} \exp(-\frac{1}{2} \Delta_r)} \]
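A quick sketch of the calculation in R, using an arbitrary set of Delta AIC values rather than our water deer models:

delta <- c(0, 1.3, 7.8, 9.7)                      # Delta AIC for four hypothetical models
w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))   # AIC weights sum to 1
round(w, 3)                                       # 0.645 0.337 0.013 0.005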
Model weights are useful because these simple numbers illustrate our uncertainty in choosing between hypotheses: they suggest how much confidence you can place in a model, considering the other models you are testing
Now we can gather together the evidence from Information Theory for all the models, and compute model selection statistics such as Delta AIC
As we did for our comparison of half-normal and uniform detection functions, let’s create a list object that contains all the models we want to compare:
models <- list(hn._. = hn._., hn._DTC = hn._DTC,
               hnLC_. = hnLC_., hnLC_DTC = hnLC_DTC,
               haz._. = haz._., haz._DTC = haz._DTC,
               hazLC_. = hazLC_., hazLC_DTC = hazLC_DTC)

fmList <- fitList(fits = models)
Use modSel() to conveniently summarise this larger set of models:
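For example, a minimal call (assuming the fmList object created above):

modSel(fmList)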
nPars AIC delta AICwt cumltvWt
hnLC_. 3 271.39 0.00 6.4e-01 0.64
hnLC_DTC 4 272.69 1.29 3.4e-01 0.98
hazLC_. 4 279.16 7.77 1.3e-02 0.99
hazLC_DTC 5 281.05 9.66 5.1e-03 1.00
haz._DTC 4 330.78 59.38 8.2e-14 1.00
hn._DTC 3 340.06 68.67 7.9e-16 1.00
haz._. 3 352.00 80.61 2.0e-18 1.00
hn._. 2 361.13 89.74 2.1e-20 1.00
❓ Which models have support?
Given what you know about Delta AIC, which models have support from our field data?
In the nPars column, R reminds us of the number of parameters used to build each model. For example, the half-normal null model hn._. has two parameters: an intercept for the detection function (the half-normal scale) and an intercept for density. Our global (most complex) model, hazLC_DTC, contains five: the hazard-rate scale and shape, a Landcover effect on detectability, and an intercept and a DistToCoast slope for density.
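If you want to see those parameters by name, you can extract them from the fitted objects (a quick optional check; hazLC_DTC assumes you fitted the hazard-rate models in the earlier exercise):

coef(hn._.)       # two estimates: a density intercept and a detection (sigma) intercept
coef(hazLC_DTC)   # five estimates, including the extra hazard-rate shape parameter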
After nPars, R displays the AIC for each model, delta (the Delta AIC relative to the best model), AICwt (the AIC weight) and cumltvWt (the cumulative AIC weight down the ranked table)
❓ What affects detectability and density?
What can we conclude about the factors that affect detectability and density in this water deer survey?
Our best-supported model is hnLC_., based on the lowest AIC (Delta AIC of zero) and a large AIC weight
However, the second model hnLC_DTC is also a strong competitor, with some evidence in its favour: a low Delta AIC (1.29), and adding it raises the cumulative AIC weight to 98%
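One way to express this head-to-head comparison is the evidence ratio, the ratio of the two models' AIC weights:

0.64 / 0.34   # ~1.9: the best model has roughly twice the support of the runner-up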
Conclusion from these models
There is strong evidence for Landcover class affecting detectability
Distance to coast may influence water deer density, but the evidence is equivocal and needs further investigation