hist(DeerObs$Distance_m,
     breaks = seq(0, 350, 10))  # finer breaks show more detail (a similar plot to hist(distUMF))
The most informative part of the detection function is the part closest to the y axis, because it determines the estimated value of the detection probability density function on the transect line
This is why we prefer detection function shapes such as half-normal, uniform and hazard-rate: they have a good ‘shoulder’, where detection probability is high on or close to the transect
In contrast, your field data may contain a few observations very far from the transect. These distant animals:
When you fit a detection function to your whole dataset, R may struggle to select curve parameters that fit both the bulk of the data and the outliers
R will attempt to do this by adding more parameters to the detection function model, creating a detection function with a more complex shape
When you remove outliers, it’s easier for R to fit a simple detection function, so:
Right truncation is the step of removing distant observations
Truncation decreases the number of sightings you use, but this loss of (possibly misleading) information is outweighed by the reduced bias and increased precision of your detection function
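There are three ways to decide which outliers to remove: look for gaps in the histogram of sightings far from the transect, remove the most distant 5 to 10% of sightings, or truncate where detection probability falls to about 15% (\(g(x) \thickapprox 0.15\))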
We recommend the last approach of truncating at \(g(x) \thickapprox 0.15\) as the most effective
In practice you might use a combination of approaches to give you confidence in your choice of truncation distance
Plot a histogram of your sightings and examine it to identify gaps far from the transect:
hist(DeerObs$Distance_m,
     breaks = seq(0, 350, 10))
What’s the first distance at which no water deer were sighted? Is there an obvious gap where it seems sensible to discard observations beyond that distance?
Another helpful way is to sum the distance bins from distUMF to see how sightings decline with distance from the transect
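One way to do this is sketched here with a small hypothetical count matrix standing in for the real data: `colSums()` adds up the sightings in each distance bin across transects. With the unmarked package loaded, `getY(distUMF)` should return the matrix of counts for the real UMF, so `colSums(getY(distUMF))` would be the equivalent call. The matrix values and bin labels below are invented for illustration.

```r
# Hypothetical counts: 3 transects (rows) by 4 distance bins (columns)
y <- matrix(c(5, 3, 1, 0,
              4, 2, 2, 1,
              6, 1, 0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("T01", "T02", "T03"),
                            c("0-25", "25-50", "50-75", "75-100")))

# Total sightings in each distance bin, summed over transects
BinTotals <- colSums(y)
BinTotals

# Cumulative proportion of sightings with increasing distance
cumsum(BinTotals) / sum(BinTotals)
```

A long tail of bins holding only one or two sightings suggests a candidate truncation distance.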
How many observations would we discard if we chose 250m, before the furthest two clusters of water deer sightings?
[1] 5
We can take a look at the actual field data using part of the above line of code
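A sketch of both steps, using only the eight most distant sightings listed later in this section as a stand-in for the full `DeerObs` data frame (every other sighting is closer than 250m, so the count is unaffected). The original code is not shown here, so the exact calls are an assumption.

```r
# Stand-in for DeerObs: the eight most distant sightings only
DeerObs <- data.frame(
  TransectID = c("T01", "T04", "T02", "T02", "T04", "T04", "T05", "T05"),
  Distance_m = c(223.78, 227.48, 238.38, 274.31, 290.26, 294.07, 332.36, 340.07)
)

# Count the sightings beyond a candidate truncation distance of 250m
NumBeyond250 <- sum(DeerObs$Distance_m > 250)
NumBeyond250

# Reuse the same logical test to view those rows of the field data
DeerObs[DeerObs$Distance_m > 250, ]
```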
The second method is to remove the most distant 5 to 10% of sightings
How many observations would you discard if you removed 5%?
DiscardNum5 <- ceiling(nrow(DeerObs) * 0.05)  # ceiling() rounds up to a whole number of sightings
DiscardNum5
[1] 8
Examine the 8 observations that would be removed
DeerObs <- DeerObs[order(DeerObs$Distance_m), ]  # order() sorts the rows by distance, nearest first
tail(DeerObs, n = DiscardNum5)
   TransectID Distance_m
3         T01     223.78
59        T04     227.48
11        T02     238.38
7         T02     274.31
61        T04     290.26
60        T04     294.07
92        T05     332.36
95        T05     340.07
How many observations would you discard if you removed 10%?
What is the furthest sighting after you’ve truncated 10% (15 sightings)?
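The same steps can be repeated for 10%. This sketch uses simulated distances because the full dataset is not reproduced here; `SimObs`, its 150 rows and the simulated values are all hypothetical, so the printed distance will not match the deer data.

```r
# Hypothetical data: 150 simulated sighting distances (not the deer data)
set.seed(1)
SimObs <- data.frame(Distance_m = round(abs(rnorm(150, mean = 0, sd = 110)), 2))

# Number of sightings to discard when removing the most distant 10%
DiscardNum10 <- ceiling(nrow(SimObs) * 0.10)
DiscardNum10

# Sort by distance, drop the most distant sightings, and find the furthest one left
SimObs <- SimObs[order(SimObs$Distance_m), , drop = FALSE]
Kept <- head(SimObs, n = nrow(SimObs) - DiscardNum10)
max(Kept$Distance_m)
```

The furthest remaining sighting is the truncation distance implied by this method.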
Run your null model first!
To truncate at \(g(x) < 0.15\), you must have already fitted a null model. We’ll use our half-normal model results from earlier
We want to estimate the distance at which detection probability has declined to 15% (\(g(x) = 0.15\))
Let’s redraw our histogram, adding a horizontal dotted line to mark where detectability (y axis) falls to 15%
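h <- hist(hn_Null)                  # draw the histogram and fitted curve once, keeping the returned values
lines(lty = 3,                      # lty = 3 gives a dotted line
      x = c(0, max(DistanceBins)),  # span the full range of distances
      y = c(0.15, 0.15) * h$y[1])   # 15% of the first returned y value (the curve height on the transect line)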
Your truncation distance is the point on the x axis (distance from transect) where the half-normal model curve and the dotted line \(g(x) = 0.15\) cross
Remember that each bar is 25m wide, which helps you judge the x value where the lines cross
See the next slide for our plot
In this model, detectability is 15% around 220m from the transect line
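The crossing point can also be calculated rather than read off the plot. For a half-normal detection function, \(g(x) = \exp(-x^2 / 2\sigma^2)\), so setting \(g(x) = 0.15\) and solving gives \(x = \sigma\sqrt{-2\ln 0.15}\). The sketch below assumes a null model with no detection covariates; `TruncAtG` is a hypothetical helper and the value of `sigma` is illustrative only. For the fitted model, \(\sigma\) would be the back-transformed detection estimate, e.g. `exp(coef(hn_Null, type = "det"))`.

```r
# Distance at which a half-normal detection function falls to probability g
# (hypothetical helper, not part of unmarked)
TruncAtG <- function(sigma, g = 0.15) {
  sigma * sqrt(-2 * log(g))
}

# Illustrative sigma in metres (an assumed value, not taken from the fitted model)
sigma <- 113
TruncDist <- TruncAtG(sigma)
TruncDist

# Check: detection probability at that distance is 0.15
exp(-TruncDist^2 / (2 * sigma^2))
```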
For comparison with the truncation methods above, let’s find out how many of our sightings lie beyond 220m and would be discarded based on their detection probability being less than 15%
Using \(g(x) = 0.15\) is the most robust method for choosing a truncation distance
Let’s truncate at 220m, as this lies in the middle of the distances suggested by visual discontinuities (250m) and removing 5% (222m) or 10% (190m) of observations
We need to:
Create a new, truncated dataset:
Create a new set of distance intervals, with a maximum of 220m:
Format the subset of deer observations for conversion into a UMF:
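The first three steps might look like the sketch below. It uses a few hypothetical rows in place of the real `DeerObs`, and base R's `cut()` and `table()` to build the transect-by-bin count matrix so that it runs without extra packages; `TruncDist`, `TruncDeerObs` and the example rows are assumptions. With unmarked loaded, `formatDistData()` (with its `distCol`, `transectNameCol` and `dist.breaks` arguments) is the usual tool for the formatting step.

```r
# Hypothetical stand-in for the field data (not the real sightings)
DeerObs <- data.frame(
  TransectID = factor(c("T01", "T01", "T02", "T02", "T03")),
  Distance_m = c(12.5, 223.78, 35.0, 150.2, 274.31)
)

# 1. Truncated dataset: keep sightings within 220m of the transect
TruncDist <- 220
TruncDeerObs <- DeerObs[DeerObs$Distance_m <= TruncDist, ]

# 2. New distance intervals: 20m bins up to the truncation distance
TruncDistBins <- seq(0, TruncDist, by = 20)

# 3. Count sightings per transect (rows) and distance bin (columns).
#    TransectID is a factor, so a transect with no remaining sightings keeps a row of zeros.
Bin <- cut(TruncDeerObs$Distance_m, breaks = TruncDistBins, include.lowest = TRUE)
TruncyDat <- unclass(table(TruncDeerObs$TransectID, Bin))
TruncyDat
```

Keeping every transect as a row matters because the transect lengths supplied to the UMF must line up with the rows of the count matrix.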
Convert into a UMF and examine to check all looks good:
TruncUMF <- unmarkedFrameDS(y = as.matrix(TruncyDat),
                            dist.breaks = TruncDistBins,
                            tlength = TransectLengths$Length,
                            survey = "line",
                            unitsIn = "m")
summary(TruncUMF)
unmarkedFrameDS Object
line-transect survey design
Distance class cutpoints (m): 0 20 40 60 80 100 120 140 160 180 200 220
12 sites
Maximum number of distance classes per site: 11
Mean number of distance classes per site: 11
Sites with at least one detection: 12
Tabulation of y observations:
 0  1  2  3  4  5  6  7
68 29 17  7  5  3  1  2
Fit the half-normal model to the truncated dataset and view the output:
hn_NullT <- distsamp(~1 ~1, TruncUMF,
                     keyfun = "halfnorm",
                     output = "density",
                     unitsOut = "ha")
summary(hn_NullT)
Call:
distsamp(formula = ~1 ~ 1, data = TruncUMF, keyfun = "halfnorm",
output = "density", unitsOut = "ha")
Density (log-scale):
Estimate SE z P(>|z|)
-2.89 0.111 -26.1 1.8e-150
Detection (log-scale):
Estimate SE z P(>|z|)
4.63 0.0862 53.7 0
AIC: 361.1297
Number of sites: 12
optim convergence code: 0
optim iterations: 62
Bootstrap iterations: 0
Survey design: line-transect
Detection function: halfnorm
UnitsIn: m
UnitsOut: ha
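Both estimates are reported on the log scale, so exponentiating them gives values in the output units. A minimal sketch using the rounded estimates printed above; with the fitted object available, unmarked's `backTransform(hn_NullT, type = "state")` and `backTransform(hn_NullT, type = "det")` should give the same quantities along with standard errors.

```r
# Estimates copied from the summary output (log scale)
LogDensity <- -2.89
LogSigma <- 4.63

# Back-transform: density per hectare and half-normal sigma in metres
Density_ha <- exp(LogDensity)
Sigma_m <- exp(LogSigma)
Density_ha
Sigma_m
```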