Covariates and coefficients

Conservation hypotheses

It’s rare that we’re only interested in estimating the population size of our species of interest, or its detectability

We usually also want to understand the biological processes that result in a particular pattern of density or detection

We want to answer questions like:

Why does population size vary across space or time?
Are our conservation interventions affecting density?

To answer these questions, we need to include covariates in our models, so that we can test our hypotheses about what influences endangered populations

Covariates

In statistical terms, explanatory variables help explain the pattern in a response variable

For example, prey density (explanatory variable) may enable you to predict or explain patterns in predator density (response variable)

By including explanatory variables, or covariates, in a statistical model, we can:

Gain a better understanding of the ecological system
Calculate more precise estimates of parameters such as density (provided your covariates are well chosen)

Informative covariates reduce the noise (unexplained variation) that contributes to the uncertainty in your parameter estimates

Covariates are so called because they co-vary with density and/or detectability

Coefficients

Our modelling aim is to determine the relationship between the covariates and the parameter of interest (density or detectability)

The strength and direction of this relationship are indicated by the coefficient of each covariate

The coefficient is the number by which you multiply a covariate in order to predict density (or detectability)

As an over-simplified example, your measure of water availability at a site (covariate) might be multiplied by a factor of 2 (coefficient) to derive an estimate of density (parameter) in that site

In the familiar mathematical equation for a straight line \(y = a + bx\), \(y\) is the parameter, \(x\) is the covariate, \(b\) is the coefficient (the slope of the line), and \(a\) is the intercept (the value of \(y\) when \(x\) is zero)
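
The over-simplified water availability example can be written out as a short sketch. The coefficient of 2 comes from the example above; the intercept of 0, the site names and the water availability values are hypothetical and chosen only for illustration.

```python
def predict_density(water_availability, intercept=0.0, coefficient=2.0):
    """Straight-line prediction y = a + b * x.

    intercept (a) and the default values here are illustrative assumptions;
    in a real analysis both a and b are estimated from the data.
    """
    return intercept + coefficient * water_availability


# Hypothetical water availability measured at three sites
sites = {"site_A": 0.5, "site_B": 1.0, "site_C": 3.0}

predictions = {site: predict_density(x) for site, x in sites.items()}

for site, density in predictions.items():
    print(f"{site}: water = {sites[site]}, predicted density = {density}")
```

A positive coefficient means predicted density rises with the covariate, a negative coefficient means it falls, and a coefficient near zero means the covariate tells us little.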

Coefficients in distance sampling

In distance sampling it’s slightly more complicated because we’re not dealing with a linear relationship between covariate and parameter

This is because detectability is a probability and so must lie between zero and one, whereas a straight line can predict values outside that range

We’ll return to this in the page on interpreting coefficients
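
To see why a straight line is not enough, the sketch below compares a linear prediction with one non-linear alternative. It assumes a half-normal detection function whose scale is an exponential function of the covariate, a form widely used in distance sampling software, but whether it matches the models on these pages is an assumption. All coefficient values, the distance and the covariate values are hypothetical.

```python
import math


def linear_prediction(covariate, a=0.5, b=0.3):
    """Straight line a + b * x: nothing stops it leaving the 0 to 1 range."""
    return a + b * covariate


def half_normal_detection(distance, covariate, b0=3.0, b1=-0.5):
    """Detection probability at a given distance (assumed half-normal form).

    The scale sigma = exp(b0 + b1 * x) is always positive, so the result
    always lies between zero and one, whatever the covariate value.
    """
    sigma = math.exp(b0 + b1 * covariate)
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


covariate_values = [-2.0, 0.0, 2.0, 10.0]  # hypothetical covariate values
distance = 20.0                            # hypothetical distance from the line

linear = [linear_prediction(x) for x in covariate_values]
detection = [half_normal_detection(distance, x) for x in covariate_values]

for x, lin, det in zip(covariate_values, linear, detection):
    print(f"x = {x:5.1f}  linear = {lin:6.2f}  half-normal = {det:.3f}")
```

In this form the coefficient no longer multiplies the covariate to give detectability directly, which is why interpreting it takes more care.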

Missing data

Take care if you have missing data!

If values are missing only for certain variables, models that include those variables will be fitted to fewer records than models that don’t, and AIC comparisons are only valid between models fitted to exactly the same data

If you have missing environmental covariate values for some records, it’s best either to omit those records from every model or to drop that covariate from your analysis

This emphasises the importance of careful and complete data collection!
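
The missing data advice above can be sketched as a simple filtering step: keep only the records that are complete for every covariate used in any candidate model, so that all models are fitted to the same dataset. The records, covariate names and candidate models below are hypothetical.

```python
# Hypothetical observation records; None marks a missing covariate value
records = [
    {"id": 1, "distance": 12.0, "canopy": 0.4, "water": 1.2},
    {"id": 2, "distance": 30.5, "canopy": None, "water": 0.8},
    {"id": 3, "distance": 5.2, "canopy": 0.7, "water": None},
    {"id": 4, "distance": 18.9, "canopy": 0.1, "water": 2.0},
]

# Hypothetical candidate models, each listed by the covariates it uses
candidate_models = {
    "null": [],
    "canopy": ["canopy"],
    "canopy_and_water": ["canopy", "water"],
}


def complete_cases(rows, covariates):
    """Return the rows with no missing value in any of the given covariates."""
    return [row for row in rows if all(row[c] is not None for c in covariates)]


# Every covariate that appears in at least one candidate model
all_covariates = sorted({c for covs in candidate_models.values() for c in covs})

# One shared dataset for all models, so AIC values are comparable
shared_data = complete_cases(records, all_covariates)

# Sample size each model would get if records were dropped model by model
per_model_n = {
    name: len(complete_cases(records, covs))
    for name, covs in candidate_models.items()
}

for name, n in per_model_n.items():
    print(f"{name}: {n} records if filtered per model, "
          f"{len(shared_data)} with the shared dataset")
```

Filtering per model would give each model a different sample size, which is exactly the situation that makes AIC comparisons invalid.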