Section 2 Processing Egyptian Fruit Bat Tracks

We show the pre-processing pipeline at work on the tracks of three Egyptian fruit bats (Rousettus aegyptiacus), and construct residence patches.

2.3 Exploratory Data Analysis Panels: Main Text Figure 1

Here, we make some basic figures for exploratory data analysis shown in Figure 1 of the main text.

Plot the bat data as a sanity check, and inspect it visually for errors. The plot code is hidden in the rendered copy (PDF) of this supplementary material, but is available in the Rmarkdown file “supplement/06_bat_data.Rmd”.

2.3.2 Sampling Intervals

Here, we create the histogram of sampling intervals shown in Figure 1 of the main text. The plotting code is hidden in the PDF version, but available in the source code.
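The quantity behind this histogram can be sketched in base R as below. This is a minimal illustration on made-up data, not the hidden plotting code; the column names TAG and TIME are assumptions, and TIME is taken to be in seconds and sorted within each individual.

```r
# Toy data for two hypothetical tags; TIME is in seconds (values are made up)
data <- data.frame(
  TAG = rep(c("bat_1", "bat_2"), each = 4),
  TIME = c(0, 8, 16, 32, 0, 8, 24, 32)
)

# Time differences between consecutive positions, computed per individual
intervals <- tapply(data$TIME, data$TAG, diff)

# Pool the intervals across individuals and count each distinct value
all_intervals <- unlist(intervals, use.names = FALSE)
interval_counts <- table(all_intervals)
```

Passing all_intervals to hist() would then draw the histogram itself.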

2.3.3 Localisation Error Measured by Systems

Here, we create the histogram of location error (variance in X) (Weiser et al. 2016) shown in Figure 1 of the main text. The plotting code is hidden in the PDF version, but available in the source code.

2.3.4 Plot paths from raw tracking data

Here, we plot the paths of individual bats from the raw tracking data to visually inspect them for errors.

Movement data from three Egyptian fruit bats (Rousettus aegyptiacus) tracked using the ATLAS system (Toledo et al. 2020; Shohami and Nathan 2020). The bats were tracked in the Hula Valley, Israel (33.1°N, 35.6°E), and we use three nights of tracking (5, 6, and 7 May 2018) for our demonstration, with an average of 13,370 positions (SD = 2,173; range = 11,195 – 15,542; interval = 8 seconds) per individual. Plotting the individual tracks reveals severe distortions, making pre-processing necessary.

2.4 Prepare data for filtering

Here we apply a series of simple filters. It is always safer to deal with one individual at a time, so we split the data.table into a list of data.tables to avoid mixups among individuals.

This is a very rudimentary demonstration of the principle behind batch processing — splitting data into smaller, independent subsets, and applying the same steps to each subset.
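The split-and-apply pattern described here can be sketched in base R as follows. The data are made up and the column name TAG is an assumption; the supplement itself works with data.table objects, for which split() behaves in the same way.

```r
# Toy tracking data for three hypothetical tags (values are made up)
data <- data.frame(
  TAG = rep(c("bat_1", "bat_2", "bat_3"), times = c(4, 3, 5)),
  X = seq_len(12),
  Y = seq_len(12) * 2
)

# Split into a list holding one data frame per individual
data_split <- split(data, data$TAG)

# Apply the same step to each subset independently;
# counting rows stands in for any processing function
n_positions <- lapply(data_split, nrow)
```

Because each list element holds a single individual, any later step applied with lapply cannot mix positions between individuals.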

2.5 Filter by system-generated error attributes

No natural bounds suggest themselves, so instead we proceed to filter by system-generated attributes of error, since point outliers are obviously visible.

We filter out positions with SD > 20 and positions calculated using only 3 base stations, using the function atl_filter_covariates.

First we calculate the variable SD, which for ATLAS systems is calculated as:

\[SD = \sqrt{{VARX} + {VARY} + 2 \times {COVXY}}\]

Then we pass the filters to atl_filter_covariates. We apply the filter to each individual’s data using an lapply – this separates the data from each individual into a separate data frame, lessening the chances of inter-individual mix-ups.
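As a rough base R equivalent of this step for a single individual, the sketch below computes SD from the formula above and applies both conditions by logical subsetting. It does not call atl_filter_covariates; the column names VARX, VARY, COVXY and NBS (number of base stations) are assumptions, and the values are made up.

```r
# Toy data for one individual (values are made up; column names are assumed)
data <- data.frame(
  VARX = c(4, 100, 9, 400),
  VARY = c(4, 100, 16, 400),
  COVXY = c(0.5, 50, 0, 100),
  NBS = c(5, 4, 3, 6)
)

# SD as defined in the equation above
data$SD <- sqrt(data$VARX + data$VARY + 2 * data$COVXY)

# Keep positions with SD of at most 20 that used more than 3 base stations
data_filtered <- data[data$SD <= 20 & data$NBS > 3, ]
```

In this toy example the third row is dropped for using only 3 base stations and the fourth for an SD above 20.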

This is another basic example of the principles behind batch-processing, and could be parallelised using the R package furrr (see

2.5.1 Sanity check: Plot filtered data

We plot the data to check whether the filtering has improved the data (Fig. 2.2). The plot code is once again hidden in this rendering, but is available in the source code file.

Bat data filtered for large location errors, removing observations with standard deviation > 20. Grey crosses show data that were removed. Since the number of base stations used in the location process is a good indicator of error (Weiser et al. 2016), we also removed observations calculated using fewer than four base stations. Both steps used the function atl_filter_covariates. This filtering reduced the data to an average of 10,447 positions per individual (78% of the raw data on average). However, some point outliers remain.

2.6 Filter by speed

Some point outliers remain, and could be removed using a speed filter.

First we calculate speeds using atl_get_speed. We must assign the speed output to a new column in the data.table; this uses a special assignment syntax that modifies the data.table in place, and is shown below. This syntax is a feature of the data.table package (Dowle and Srinivasan 2020), not strictly of atlastools.

Now we remove positions with speeds > 20 m/s (around 70 km/h) by passing a predicate (a statement that returns TRUE or FALSE) to atl_filter_covariates. Before applying this filter, we remove positions which have NA for their speed_in (the first position) or their speed_out (the last position).
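The logic of the speed filter can be sketched in base R as below, using ordinary column assignment rather than data.table's in-place syntax and without calling atl_get_speed. The data are made up, and the rule used here (keep a position only if both its incoming and outgoing speeds are at most 20 m/s) is an assumption about how the predicate is applied.

```r
# Toy track: x and y in metres, time in seconds (values are made up)
track <- data.frame(
  x = c(0, 40, 80, 120, 1000, 200, 240),
  y = 0,
  time = seq(0, by = 8, length.out = 7)
)

# Speed of each step between consecutive positions, in m/s
step_dist <- sqrt(diff(track$x)^2 + diff(track$y)^2)
step_speed <- step_dist / diff(track$time)

# Incoming speed is undefined for the first position, outgoing for the last
track$speed_in <- c(NA, step_speed)
track$speed_out <- c(step_speed, NA)

# Drop positions with undefined speeds, then apply the 20 m/s threshold
track <- track[!is.na(track$speed_in) & !is.na(track$speed_out), ]
track_filtered <- track[track$speed_in <= 20 & track$speed_out <= 20, ]
```

Under this rule the positions immediately before and after a spike are removed along with the spike itself, since each has one speed above the threshold.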

2.6.1 Sanity check: Plot speed filtered data

The speed filtered data is now inspected for errors (Fig. 2.3). The plot code is once again hidden.

Bat data with unrealistic speeds removed. Notice, compared with the previous figure, that spikes of unrealistic movement in all three tracks have been removed. Grey crosses show data that were removed. We calculated the incoming and outgoing speed of each position using atl_get_speed, and filtered out positions with speeds > 20 m/s using atl_filter_covariates, leaving 10,337 positions per individual on average (98% from the previous step).

2.7 Median smoothing

The quality of the data is relatively high, and a median smooth is not strictly necessary. We demonstrate the application of a 5 point median smooth to the data nonetheless (Fig. 2.4).

Since the median smoothing function atl_median_smooth modifies in place, we first make a copy of the data, using data.table’s copy function. No reassignment is required, in this case. The lapply function allows arguments to atl_median_smooth to be passed within lapply itself.

In this case, the same moving window \(K\) is applied to all individuals, but modifying this code to use the multivariate version Map allows different \(K\) to be used for different individuals. This is a programming matter, and is not covered here further.
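A minimal sketch of the Map approach is given below. It uses stats::runmed from base R as a stand-in for atl_median_smooth, so end-point handling may differ from atlastools; the helper smooth_track, the tag names and the window sizes are hypothetical.

```r
# Toy coordinates for two hypothetical individuals (values are made up)
tracks <- list(
  bat_1 = data.frame(x = c(1, 2, 50, 4, 5, 6, 7), y = 1),
  bat_2 = data.frame(x = c(10, 11, 12, 90, 14, 15, 16), y = 2)
)

# One moving-window size per individual; runmed() requires odd values
K <- list(bat_1 = 5, bat_2 = 3)

# Smooth the x and y coordinates of one track with a running median of width k
smooth_track <- function(track, k) {
  track$x <- stats::runmed(track$x, k = k)
  track$y <- stats::runmed(track$y, k = k)
  track
}

# Map() steps through the tracks and the window sizes together
tracks_smooth <- Map(smooth_track, tracks, K)
```

The spikes at x = 50 and x = 90 are replaced by the median of their neighbourhoods, and no rows are removed.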

2.7.1 Sanity check: Plot smoothed data

Bat data after applying a median smooth with a moving window K = 5. Grey circles show data prior to smoothing. The smoothing step did not discard any data.

2.8 Making residence patches

2.8.1 Calculating residence time

First, the data is put through the recurse package to get residence time (Bracis, Bildstein, and Mueller 2018).

We calculated residence time, but since bats may revisit the same features, we want to prevent confusion between frequent revisits and prolonged residence.

For this, we stop summing residence times within \(Z\) metres of a location if the animal exited the area for one hour or more. We chose a value of 50 m for \(Z\) (radius, in recurse parameter terms).

This step is relatively complicated and is only required for individuals which frequently return to the same location, or pass over the same areas repeatedly, and for which revisits (cumulative time spent) may be confused for residence time in a single visit.

While a simpler implementation using total residence time divided by the number of revisits is also possible, this does assume that each revisit had the same residence time.
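The difference between the two approaches can be illustrated on the visits to a single location. The sketch below is an assumption-laden simplification and does not call recurse: the column names time_inside and time_since_last_visit are hypothetical, times are in hours, and the values are made up.

```r
# Toy visits to the area around one position, in hours (values are made up)
visits <- data.frame(
  time_inside = c(0.2, 0.1, 0.5, 0.3),
  time_since_last_visit = c(NA, 0.25, 2.0, 0.1)
)

# A visit starts a new bout if the animal was away for one hour or more
gap_limit <- 1
new_bout <- !is.na(visits$time_since_last_visit) &
  visits$time_since_last_visit >= gap_limit
visits$bout <- cumsum(new_bout) + 1

# Residence time summed within each bout, not across all visits
res_time_by_bout <- tapply(visits$time_inside, visits$bout, sum)

# Simpler alternative: total residence time divided by the number of visits
res_time_mean <- sum(visits$time_inside) / nrow(visits)
```

Here the two bouts have residence times of 0.3 and 0.8 hours, whereas the simpler alternative assigns 0.275 hours to every visit.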

We bind the data together and assign a human readable timestamp column.

2.8.2 Movements away from the roost

To focus on night-time bat foraging around fruit trees, we shall filter data both on the timestamps, to select night-time positions, and on the locations, to select positions > 1 km away from the roost-cave at Har Gershom (see main text Fig. 8).

Combining these two filters allows us to exclude bat positions at the roost-cave that may be due to individual-differences in bats’ departure or return times to and from their foraging areas.

Users should plot the data to examine the effect of applying filters — this code is shown, but the figure is hidden for brevity.

We now filter the data to exclude both day-time data and data that are < 1 km from the roost.
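A combined time and distance filter of this kind might look as follows in base R. The roost coordinates, the definition of night-time as 20:00 to 05:00, and the use of UTC are all assumptions made for the example; the data are made up.

```r
# Hypothetical roost coordinates, in metres
roost <- c(x = 0, y = 0)

# Toy positions: coordinates in metres, time as POSIXct (values are made up)
data <- data.frame(
  x = c(100, 1500, 2000, 3000),
  y = c(0, 0, 500, 0),
  time = as.POSIXct(
    c("2018-05-05 21:00:00", "2018-05-05 23:00:00",
      "2018-05-06 12:00:00", "2018-05-06 03:00:00"),
    tz = "UTC"
  )
)

# Straight-line distance of each position from the roost
data$dist_roost <- sqrt(
  (data$x - roost[["x"]])^2 + (data$y - roost[["y"]])^2
)

# Night-time is assumed here to run from 20:00 to 05:00
hour <- as.integer(format(data$time, "%H"))
is_night <- hour >= 20 | hour < 5

# Keep night-time positions more than 1 km from the roost
data_foraging <- data[is_night & data$dist_roost > 1000, ]
```

The first position is dropped for being near the roost and the third for being recorded at midday.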

2.8.3 Split data by night-id

We assign a night-id to each position, i.e., the night-time spanning two calendar days. We then filter for data with a residence time > 5 minutes, as we expect that a bat stopped at a location for more than 5 minutes is likely to be foraging.
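One way to assign a night-id that spans two calendar days is to shift timestamps back by 12 hours before taking the date, so that the hours after midnight fall on the same date as the preceding evening. The sketch below assumes residence time is stored in minutes in a column named res_time; the column names, the time zone and the values are made up.

```r
# Toy positions across two nights; res_time is in minutes (values are made up)
data <- data.frame(
  time = as.POSIXct(
    c("2018-05-05 22:00:00", "2018-05-06 02:00:00",
      "2018-05-06 21:30:00", "2018-05-07 01:00:00"),
    tz = "UTC"
  ),
  res_time = c(2, 12, 7, 4)
)

# Shifting by 12 hours puts the early-morning hours on the previous date
night_date <- as.Date(data$time - 12 * 3600, tz = "UTC")

# Number the nights consecutively
data$night <- as.integer(factor(night_date))

# Keep positions with a residence time of more than 5 minutes
data_resident <- data[data$res_time > 5, ]
```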

2.8.4 Constructing residence patches

Some preparation is required. First, the function requires columns x, y, time, and id, which we assign using the data.table syntax. The time column is already present, but the other columns need to be renamed to lower case.

We apply the residence patch method using the default argument values lim_spat_indep = 100 (metres) and lim_time_indep = 30 (minutes). We change buffer_radius to 25 metres (twice the buffer radius is used, so points must be separated by 50 m to be independent bouts), and use min_fixes = 3.
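To make the role of buffer_radius and min_fixes concrete, the sketch below groups consecutive positions into candidate patches in base R. It is a deliberately simplified illustration and not the atlastools implementation: it only splits the track where consecutive positions are more than twice the buffer radius apart, and it omits the later merging of patches based on lim_spat_indep and lim_time_indep. The data are made up.

```r
# Toy track with two clusters and one isolated position (values are made up)
track <- data.frame(
  x = c(0, 10, 20, 500, 510, 520, 530, 2000),
  y = 0,
  time = seq(0, by = 8, length.out = 8)
)

buffer_radius <- 25
min_fixes <- 3

# Distance from the previous position; the first position has none
step_dist <- c(0, sqrt(diff(track$x)^2 + diff(track$y)^2))

# Start a new candidate patch when the step exceeds twice the buffer radius
track$patch <- cumsum(step_dist > 2 * buffer_radius) + 1

# Count positions per candidate patch and drop patches with too few fixes
n_in_patch <- ave(rep(1, nrow(track)), track$patch, FUN = sum)
patches <- track[n_in_patch >= min_fixes, ]
```

The isolated position at x = 2000 forms a patch of one fix and is discarded, leaving two candidate patches.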

2.8.5 Getting residence patch data

We extract the residence patch data as spatial sf-MULTIPOLYGON objects. These are returned as a list and must be converted into a single sf object. These objects and the raw movement data are shown in Fig. 2.5.

2.9 Main text Figure 8

See Fig. 8 in the main text, made with QGIS.

A visual examination of plots of the bats’ residence patches and linear approximations of paths between them showed that though all three bats roosted at the same site, they used distinct areas of the study site over the three nights (a). Bats tended to be resident near fruit trees, which are their main food source, travelling repeatedly between previously visited areas (b, c). However, bats also appeared to spend some time at locations where no fruit trees were recorded, prompting questions about their use of other food sources (b, c). When bats did occur close together, their residence patches barely overlapped, and their paths to and from the broad area of co-occurrence were not similar (c). Constructing residence patches for multiple individuals over multiple activity periods suggests interesting dynamics of within- and between-individual overlap (b, c).

