Re-assess poor models

If our global model is poor

So what can we do if our global model is a poor fit? 😓

The ideal solution is to:

  1. 🤔 Reconsider what influences density and detection
  2. 🧭 Collect more covariate data and/or observations
  3. 💻 Run an improved model set

This is rarely possible or practical!

When we are restricted to our existing data & models, we can:

  • Degrade the precision of our parameter estimates
  • Adjust AIC values to reflect our lower confidence

Over-dispersion

Models may be poor because of over-dispersion

Over-dispersion describes field data which are more heterogeneous than expected from our global model

Over-dispersion might arise from:

  • Failure to include influential covariates
  • Lack of independence in animal sightings or density, causing spatial or temporal clusters in sightings

Variance inflation factor, \(\hat{c}\)

A solution to over-dispersion is to estimate a variance inflation factor \(\hat{c}\) (c-hat)

c-hat is the chi-squared Goodness of Fit statistic divided by its degrees of freedom

\[ \hat{c} = \frac{\chi^2}{df} \]
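With illustrative numbers: a \(\chi^2\) statistic of 49.6 on 40 degrees of freedom would give \(\hat{c} = 49.6 / 40 = 1.24\)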

Interpret \(\hat{c}\)

We only need to act when \(\hat{c}\) is greater than 1

| \(\hat{c}\) value | Meaning |
|---|---|
| < 1 | Under-dispersion: data are more uniformly distributed than expected |
| 1 | Field data are randomly distributed with reference to a particular model |
| > 1 | Over-dispersion: data are clustered |
| 2–4 | Probable lack of independence of observations |
| > 4 | Severe lack of fit: our models are inadequate to represent our data |

For values modestly above 1, we can use \(\hat{c}\) to:

  1. Correct our AIC values for unmodelled sampling variance, generating quasi-AIC (QAIC) values which take our additional \(\hat{c}\) parameter into account
  2. Inflate our Standard Errors so parameter estimates are less precise (see the definitions below)
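Concretely, the two adjustments use the standard quasi-likelihood definitions:

\[ \text{QAIC} = \frac{-2 \ln \mathcal{L}}{\hat{c}} + 2K \qquad \text{and} \qquad \text{SE}_{\text{adj}} = \sqrt{\hat{c}} \times \text{SE} \]

where \(K\) counts \(\hat{c}\) as an extra estimated parameter, and QAICc adds the usual small-sample correction \(\frac{2K(K+1)}{n - K - 1}\)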

The closer \(\hat{c}\) is to 1, the closer QAICc results will be to AICc results

Calculate \(\hat{c}\)

The simplest way to calculate \(\hat{c}\) is from the simulated data generated by our Goodness of Fit test

We approximate \(\hat{c}\) by dividing the test statistic from our field data (t0) by the mean of the vector of simulated test statistics (t.star)

```r
# Observed test statistic divided by the mean of the simulated statistics
cHat <- GOF@t0 / mean(GOF@t.star)
cHat
```

```
     SSE 
1.240156 
```

\(\hat{c}\) for the deer case-study should be close to one (here, 1.24), indicating that our data are only slightly over-dispersed
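As a reminder, GOF is the parametric-bootstrap object from our earlier goodness-of-fit testing. For completeness, here is a sketch of how such an object can be produced with unmarked's parboot(); the model name globalModel and the nsim value are placeholders:

```r
library(unmarked)

# Parametric bootstrap of the default SSE fit statistic; 'globalModel'
# stands in for whichever fitted model we are assessing. The resulting
# object's t0 and t.star slots hold the observed and simulated statistics
# used above. Larger nsim values make repeated c-hat estimates more stable.
GOF <- parboot(globalModel, statistic = SSE, nsim = 1000)
```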

Your c-hat will vary

The exact value of \(\hat{c}\) depends on the simulations your computer generated, so it will not match ours exactly

The more simulations we run, the more closely our \(\hat{c}\) values will converge on each other

Calculate QAICc

We incorporate \(\hat{c}\) into our model comparison by passing it as an argument to aictab() (from the AICcmodavg package):

```r
# Specify the variance inflation factor, c-hat, so R automatically calculates QAICc
QAICctable <- aictab(models, c.hat = cHat)
QAICctable
```

```
Model selection based on QAICc:
(c-hat estimate = 1.240156)

          K  QAICc Delta_QAICc QAICcWt Cum.Wt Quasi.LL
hnLC_.    4 227.71        0.00    0.94   0.94  -107.00
hnLC_DTC  5 233.43        5.72    0.05   1.00  -106.71
hazLC_.   5 238.65       10.94    0.00   1.00  -109.33
hazLC_DTC 6 247.36       19.65    0.00   1.00  -109.28
haz._DTC  5 280.27       52.56    0.00   1.00  -130.14
hn._DTC   4 283.08       55.37    0.00   1.00  -134.68
haz._.    4 292.71       65.00    0.00   1.00  -139.50
hn._.     3 296.97       69.26    0.00   1.00  -143.99
```
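As a sanity check, the top row can be reproduced by hand from the QAICc formula given earlier; the sample size n = 12 used here is inferred from the table's small-sample correction and is purely illustrative:

```r
# Reproduce the QAICc of the top model, hnLC_.
quasiLL <- -107.00   # Quasi.LL = log-likelihood / c-hat, as reported by aictab()
K       <- 4         # parameter count, including c-hat
n       <- 12        # number of sampling units (inferred, illustrative)

-2 * quasiLL + 2 * K + (2 * K * (K + 1)) / (n - K - 1)
#> [1] 227.7143
```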

Parameters increase by 1

Note how the number of parameters (K) has increased by one for every model, because \(\hat{c}\) is counted as an additional estimated parameter

The comparison statistics have changed slightly, but they again show that our top model is the one best supported by our field data
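The other use of \(\hat{c}\), degrading the precision of our parameter estimates, can be applied by hand: each standard error is multiplied by \(\sqrt{\hat{c}}\). A minimal sketch, assuming a fitted model object topModel for which vcov() is defined:

```r
# Naive standard errors from the variance-covariance matrix of the fit
seRaw <- sqrt(diag(vcov(topModel)))

# Inflate by sqrt(c-hat) so the estimates are reported as less precise
seAdj <- sqrt(cHat) * seRaw
```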