Confidence intervals

Confidence intervals for coefficients can help us evaluate the importance of covariates

If the confidence interval spans zero, this suggests we cannot be certain that the covariate has an effect on density or detectability

There are two unmarked functions we can use to calculate coefficient confidence intervals:

  • confint() - we used this in the Analysis module
  • predict() - similar to backTransform(), but more flexible and informative

Calculate CIs with confint()

Confidence intervals for which parameter?

Remember that confint() can calculate confidence intervals for:

  • Density
  • Detectability
  • Covariate coefficients

Density confidence intervals from our best model, hnLC_. (reported on the log scale):

CIDens <- confint(hnLC_., type = "state")
CIDens
             0.025     0.975
lam(Int) -2.849277 -2.407755

Detectability confidence intervals:

CIDetect <- confint(hnLC_., type = "det")
CIDetect
                        0.025      0.975
p(Int)               4.691186  5.3749777
p(LandcoverWetland) -1.631540 -0.9402575
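
To apply the "spans zero" check described earlier, one rough sketch is to rebuild the detection intervals as a matrix and test each row. The values are copied from the confint() output above; the object name ci_det and the helper spans_zero() are hypothetical and not part of unmarked.

```r
# Detection intervals copied from the confint() output above (link scale)
ci_det <- matrix(
    c(4.691186, 5.3749777,
      -1.631540, -0.9402575),
    ncol = 2, byrow = TRUE,
    dimnames = list(c("p(Int)", "p(LandcoverWetland)"),
                    c("0.025", "0.975")))

# Hypothetical helper: TRUE for each row whose interval includes zero
spans_zero <- function(ci) ci[, 1] <= 0 & ci[, 2] >= 0

spans_zero(ci_det)
```

The row of interest is p(LandcoverWetland), the covariate coefficient. Assuming confint() returns a two-column matrix as printed above, the same helper should also work directly on CIDetect.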

Calculate CIs with predict()

Back-transforming coefficients

Remember that coefficient values are on the log scale, so they need to be back-transformed to the original measurement scale before their biological meaning is clear

Previously we’ve used backTransform() for this, but backTransform() doesn’t work for models with detectability covariates

The predict() function provides meaningful back-transformed estimates from a model, with standard errors and confidence intervals, for any covariate values we choose

We need to give predict() the conditions for which we’d like to predict density or detectability

We’ll run some examples so you can see how to construct code for categorical and continuous covariates, and for density and detectability

Create a covariate data.frame

First we need to create a new data.frame with the ecological conditions we’re interested in. We’ll start with our two landcover types:

# Create a new data.frame with both landcover types
landcovs <- data.frame(Landcover =
    levels(TruncUMF@siteCovs$Landcover))
landcovs
  Landcover
1 Grassland
2   Wetland

Predict detectability

Use the new data to predict detectability in both landcover classes:

# Create a new data.frame by using cbind() to glue together
# our landcover levels and the output of predict()
predictionsLC <- cbind(
    landcovs,
    predict(
        hnLC_.,               # specify the model
        newdata = landcovs,   # specify the covariate conditions
        type = "det"))        # estimate detectability
predictionsLC
  Landcover Predicted        SE     lower     upper
1 Grassland 153.40505 26.759956 108.98235 215.93505
2   Wetland  42.40146  3.956076  35.31534  50.90944

Note how the predicted detectability is much higher in grassland, and the two confidence intervals do not overlap
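
The lower and upper columns can be cross-checked by hand. As a sketch, assuming the detection parameter is modelled on a log scale (which the numbers below are consistent with) and that Grassland is the reference level, exponentiating the p(Int) bounds from confint() recovers the Grassland interval. All values are copied from the outputs in this section; the object names are hypothetical.

```r
# p(Int) interval on the link scale, copied from the confint() output
ci_link <- c(lower = 4.691186, upper = 5.3749777)

# Back-transform to the natural scale, assuming a log link
ci_natural <- exp(ci_link)
round(ci_natural, 2)

# Grassland interval reported by predict(), for comparison
ci_predict <- c(lower = 108.98235, upper = 215.93505)
all.equal(ci_natural, ci_predict, tolerance = 1e-5)
```

This is only a check on the intercept. For other covariate values, predict() does the equivalent work for us.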

Predict density

Run the process again, this time to predict the effect of distance to the coast on density

We’ll specify three distances: on the coastline, 500 m and 1 km:

# Create a data.frame from a list of our chosen distances, in km
coast_dists <- data.frame(
    DistToCoast = c(0, 0.5, 1))
predictionsDTC <- cbind(coast_dists,
    predict(hn._DTC,
        newdata = coast_dists,
        type = "state"))
predictionsDTC
  DistToCoast  Predicted          SE      lower      upper
1         0.0 0.02311217 0.005504594 0.01449143 0.03686127
2         0.5 0.02314824 0.005506373 0.01452241 0.03689750
3         1.0 0.02318435 0.005508147 0.01455346 0.03693378

Note how the predicted density (Predicted column) is very similar at all distances from the coast, and the confidence intervals overlap almost completely
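
One way to make "overlap" concrete is to compare interval bounds directly. The sketch below uses values copied from the two prediction tables in this section; intervals_overlap() is a hypothetical helper, not an unmarked function.

```r
# Hypothetical helper: TRUE when [lower1, upper1] and [lower2, upper2]
# share at least one value
intervals_overlap <- function(lower1, upper1, lower2, upper2) {
    lower1 <= upper2 & lower2 <= upper1
}

# Detectability: Grassland vs Wetland (values from predictionsLC)
lc_overlap <- intervals_overlap(108.98235, 215.93505, 35.31534, 50.90944)
lc_overlap

# Density: 0 km vs 1 km from the coast (values from predictionsDTC)
dtc_overlap <- intervals_overlap(0.01449143, 0.03686127, 0.01455346, 0.03693378)
dtc_overlap
```

Comparing intervals like this is a rough guide rather than a formal test, but it matches the visual comparison made in the notes above.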

Predictions for multiple covariates

For more complicated sets of covariates, we can use the expand.grid() function to create an object containing every scenario, that is, all combinations of our chosen covariate values

# Use expand.grid() to create a data.frame containing all combinations of
# distance to coast and landcover
cov_combinations <- expand.grid(
    DistToCoast = seq(0, 1, 0.25),
    Landcover = levels(TruncUMF@siteCovs$Landcover))
cov_combinations
   DistToCoast Landcover
1         0.00 Grassland
2         0.25 Grassland
3         0.50 Grassland
4         0.75 Grassland
5         1.00 Grassland
6         0.00   Wetland
7         0.25   Wetland
8         0.50   Wetland
9         0.75   Wetland
10        1.00   Wetland

Round predicted values

Now that we have a data.frame containing all scenarios, let’s generate density predictions for each of them from our global model with a hazard-rate detection function

To make it easier to review our results, we’ll round our estimates to three decimal places

# Round the predictions to three decimal places
PredLC_DTC <- cbind(cov_combinations,
    round(
        predict(hazLC_DTC,
            newdata = cov_combinations,
            type = "state"),
        3))
PredLC_DTC
   DistToCoast Landcover Predicted    SE lower upper
1         0.00 Grassland      0.07 0.019 0.041 0.119
2         0.25 Grassland      0.07 0.019 0.041 0.119
3         0.50 Grassland      0.07 0.019 0.041 0.119
4         0.75 Grassland      0.07 0.019 0.041 0.119
5         1.00 Grassland      0.07 0.019 0.041 0.119
6         0.00   Wetland      0.07 0.019 0.041 0.119
7         0.25   Wetland      0.07 0.019 0.041 0.119
8         0.50   Wetland      0.07 0.019 0.041 0.119
9         0.75   Wetland      0.07 0.019 0.041 0.119
10        1.00   Wetland      0.07 0.019 0.041 0.119

Review results

Examine the predicted values for density under these different scenarios. Are they what you would expect to see? 🤔

  1. Think about the model you used to generate these predictions:
    1. What hypothesis did that model test?
    2. Which covariates does it allow to affect density?
  2. Consider the results from our model comparison
    1. How well-supported was this model?
    2. Would you expect it to reveal different densities for the different scenarios?
    3. Or would you expect overlapping density confidence intervals under these scenarios?

Repeat with a different model

Try running this final step again with a different model

Examine the results, and compare them to your earlier results

What do you conclude?

What does this tell you about the importance of each covariate, and the validity of the different models?