Install these packages in R: {bioacoustics}, {warbleR}, {tools}, {viridis}
The objective of this module is to introduce you to analysis of biacoustic data so that you can read, display, and analyze audio recordings of vocalizations in animals.
What is bioacoustics, who uses it, and how is it implemented?
Broadly, bioacoustics is the the noninvasive crossdisciplinary science of biology and acoustics that studies the production, transmission, and reception of animal sounds. Given the wide range of information that bioacoustics provides, it’s used by researchers, physicians, and government agencies. Studying sounds and it’s affects on living organisms, can provide powerful insight into natural history, structure and dynamics of ecosystems, species biodiversity, habitat and species health, information climate change, and much more.
Different Sounds will happen at different frequencies. This is important to know because you settings may have to change in order to display and analyze your bioacoustic data correctly. Here are a general list of frequency ranges that you might encounter:
First, let’s load in the packages that we need:
library(warbleR)
library(bioacoustics)
library(tools)
library(randomForest)
# Unfortunately, ohun is currently under policy violation, so for the time being we are using an archived version. We expect this library to be unarchived after investigation.
url <- "https://cran.r-project.org/src/contrib/Archive/ohun/ohun_1.0.0.tar.gz"
install.packages(url, type="source", repos=NULL)
library(ohun)
library(viridis)
The warbleR package comes with the function query_xc which allows you to seamlessly import audio data from [xeno-canto] (https://xeno-canto.org/484847).
# pulling all Strepera versicolor calls without downloading locally
df1 = query_xc(qword ='Strepera versicolor type:call cnt:"Australia"', download = FALSE)
# Filter by type
df1 = df1[df1$Vocalization_type=="call",] #filtering by only calls
df1 = df1[df1$Quality=="A",] #a type quality
df1 = df1[1:9,]#takes the first 9 mp3 files
# Download data to working directory
query_xc(X = df1, download = TRUE, path = getwd())
Great! Now that we know how to load in data from an cloud-based data set, lets try something that will probably be more useful in your own future research endeavor. Say you are a ecologist who has just come back from a field expedition where you set up passive bio-acoustic monitors. All the recorded audio files (usually in .mp3 or .wav formats) are saved in a harddrive. How would you import your data then?
You can import you files straight from your Harddrive, Desktop or anywhere else on your personal computer. No need for cloud storage! The following code reads in a file named mgr.wav which, you can download from: https://github.com/rhottensomers/AN588_R_Vignette/blob/main/mgr.wav and save in your working directory.
# getwd() gets our current working directory and file.path() combines getwd() and "mgr.wav" into a file path format.
wav <- read_audio(file.path(getwd(), "mgr.wav"))
# Say you have a file saved on your Desktop, your code might look a little like this:
# wav <- read_audio(file.path("~/Desktop", "mgr.wav"))
Now that we know how to read files, lets load in a file named “Strepera-versicolor-571957.mp3”. Here is the mp3 file:
Note, when you are sampling audio data, you need to make sure that your sampling rate is greater than twice the highest frequency you will observe.
For Birds (our data here), the highest frequency we would expect is 8000 Hz. Notice here that the sampling rate here is more than double that frequency.
# Read just one file!
STREPV <- read_audio(file.path(getwd(), "Strepera-versicolor-571957.mp3"))
STREPV
##
## Wave Object
## Number of Samples: 2938752
## Duration (seconds): 61.22
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
However, while reading in one file at a time is helpful, an ecologist returning from a month long expedition will surely have more than one file to read. We have written a function here that will read all the mp3 files in your working directory. You could expand on this function to read files in different directories or even create a data frame!
# Read all the files in your current directory!
read_all_mp3 <- function(){
# List all mp3 files in current directory
mp3_files = list.files(path=".", pattern=".mp3", all.files=TRUE, full.names=TRUE)
audio_list <- list()
# Append all the files a paths to a single list named audio_list
for(files in mp3_files) {
curr_file_path <- read_audio(file.path(files))
audio_list <- append(audio_list, curr_file_path)
}
return(audio_list)
}
read_all_mp3()
## [[1]]
##
## Wave Object
## Number of Samples: 2938752
## Duration (seconds): 61.22
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[2]]
##
## Wave Object
## Number of Samples: 2644992
## Duration (seconds): 55.1
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[3]]
##
## Wave Object
## Number of Samples: 572544
## Duration (seconds): 12.98
## Samplingrate (Hertz): 44100
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[4]]
##
## Wave Object
## Number of Samples: 14673024
## Duration (seconds): 305.69
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[5]]
##
## Wave Object
## Number of Samples: 1147392
## Duration (seconds): 23.9
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[6]]
##
## Wave Object
## Number of Samples: 2648448
## Duration (seconds): 60.06
## Samplingrate (Hertz): 44100
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[7]]
##
## Wave Object
## Number of Samples: 1325952
## Duration (seconds): 30.07
## Samplingrate (Hertz): 44100
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[8]]
##
## Wave Object
## Number of Samples: 5960448
## Duration (seconds): 124.18
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[9]]
##
## Wave Object
## Number of Samples: 2025216
## Duration (seconds): 45.92
## Samplingrate (Hertz): 44100
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
##
##
## [[10]]
##
## Wave Object
## Number of Samples: 1534464
## Duration (seconds): 31.97
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
Now that we’ve imported, converted, and read our audio files it’s time to plot them. This can be done many different ways depending on what information you need out of your bioacoustic information. These primarily graph simply display the wave files but later on we will give you a way to graph analysis onto them.
The simplest of the graphs is plot() which plots wave objects. It displays a black and white graph of audio spikes. This is a waveform, x axis is the time of the file and the yaxis is a amplitude, or the displacement of a particle
# First you call your wave audio file that we converted previously:
plot(STREPV)
# Now let's zoom in on the first audio spike:
plot(STREPV, xlim = c(3, 5)) # xlim is indicating the time limits to narrow the field of analysis
Plot can also be used with stereo audio files where there is audio from the right and left channel. To see if your data is stereo vs mono you can look at the list from either read_audio or read_all_mp3(). Under the audio file it will state Channels(mono/stereo) and if it is stereo you can plot it this way:
# Reads stereo audio and converts it into a wave object.
STREPV2 <- read_audio(file.path(getwd(), "Strepera-versicolor-743906.mp3"))
# Displays wave object information. You can see clearly here that STREPV2 is a stereo wave object
STREPV2
##
## Wave Object
## Number of Samples: 5960448
## Duration (seconds): 124.18
## Samplingrate (Hertz): 48000
## Channels (Mono/Stereo): Stereo
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
# Now let's plot it!
plot(STREPV2)
This plot graph is similar to the one above but it has one graph for the right channel and one for the left. Stereo and mono wave object do plot differently with plot() so always make sure to confirm what kind of channel your object is.
While plot() is wonderful spectro() and fspec() are both plotting devices that come with the bioacoustics r package. They are able to generate spectrograms. Spectrograms are a visual representation of the strength or loudness of the audio signal/frequency throughout the wave object. Spectro, true to its name, plots a spectrogram of your data.
# First let's look at the parameters within spectrogram:
?spectro()
## Help on topic 'spectro' was found in the following packages:
##
## Package Library
## seewave /Library/Frameworks/R.framework/Versions/4.2/Resources/library
## bioacoustics /Library/Frameworks/R.framework/Versions/4.2/Resources/library
##
##
## Using the first match ...
# Tick indicates the frequency tick marks. We are displaying from 1 to 20 khz by 1khz steps. Different animals may need different frequencies so be sure to check that you are display the correct range.
ticks <- c(from = 1000, to = 20000, by = 1000)
# Specify the time limits on the X-axis in seconds (s)
temp_slice <- c(from = 1, to = 10)
spectro(STREPV, tlim = temp_slice, FFT_size = 512, ticks_y = ticks) # sets the y axis labeling, in Hz, so its 1 kHz to 100 kHz, in 1 kHz interval
NOTE: Y axis is frequency in hertz and X axis is time. However
spectro() does not have the ability to add x axis label so it should
only be used for visualization. Spectro() is wonderful for an elementary
spectrogram but there is a better way to visualize the data with
fspec(). It is a matrix of amplitude or decibel (dB) values in the time
/ frequency domain. This function on its own generates only the matris
of a spectrogram so in order to visual the data we use image() to
display the graph.
# First let's look at the parameters:
?fspec
# Now graph!
image(fspec(STREPV, tlim = temp_slice, rotate = TRUE))
NOTE: If you have older zero-crossing bat vocalization patterns, then the functions write_zc() and plot_zc() will serve as replacements for fspec() and spectro(). (Add in paragraph after running through spectro and fspec() with an asterisk)
Now that we have the prepared audiofile in its manipulatative form, we can filter the desired sounds(the bird calls) from background noise(other bird noises, self noise, etc).
Blob_detection is one function that detects and filters audio and displays them in batches. A “blob” is simply an area that is distinguishable from the background. It is a common type of detection/filtration algorithm used not only in audio analysis but also in image processing. Blob detection can be used to filter and identify colors, intensity/brightness, etc.
# blob_detection
?blob_detection
BD <- blob_detection(
STREPV, #Waveform of the audio file made
time_exp = 1, #Time expansion factor, 1 represents real-time
min_dur = 20, #Min duration of the threshold, in ms
max_dur = 2000, #Max duration of the threshold, in ms, based on the plot, 1500 ms is the average length of the bird's call, we'll set the max length to be 2 seconds to be safe
min_area = 85, #Minimum number of pixels in the area of the audio event. This can be adjusted to filter more sounds out.
min_TBE = 1, #Min window between 2 audio events
max_TBE = 25000, #Max window between audio events, we don't want to exclude any noises so we can set it to a high measurement
LPF = 6000, #Low-Pass Filter. Frequencies above the cutoff are not included. Based on average bird call frequency.
HPF = 1000, #High-Pass Filter. Frequencies below the cutoff are not included. Based on average bird call frequency.
FFT_size = 512, #Size of the Fast Fourier Transform (FFT) window, essentially the "smoothness" of frequency changes
FFT_overlap = 0.875, #Percentage of overlap between two FFT windows
blur = 3, #Gaussian smoothing function for blurring the spectrogram of the audio event to reduce image noise
spectro_dir = file.path(getwd(), "Spectros"), #Path to the filtered spectrogram png files used in the HTML
time_scale = 1, #Time resolution of the spectrogram in milliseconds per pixel.
ticks = c(1000, 20000, 1000), #Sets y-axis min, max, and interval width
acoustic_feat = FALSE
)
BD
## $data
## filename starting_time duration
## 1 Strepera-versicolor-571957.mp3 00:00:03.124 41.33333
## 2 Strepera-versicolor-571957.mp3 00:00:03.379 49.33333
## 3 Strepera-versicolor-571957.mp3 00:00:03.624 33.33333
## 4 Strepera-versicolor-571957.mp3 00:00:03.659 36.00000
## 5 Strepera-versicolor-571957.mp3 00:00:03.700 28.00000
## 6 Strepera-versicolor-571957.mp3 00:00:03.904 53.33333
## 7 Strepera-versicolor-571957.mp3 00:00:20.078 45.33333
## 8 Strepera-versicolor-571957.mp3 00:00:20.344 44.00000
## 9 Strepera-versicolor-571957.mp3 00:00:20.586 98.66667
## 10 Strepera-versicolor-571957.mp3 00:00:20.886 81.33333
## 11 Strepera-versicolor-571957.mp3 00:00:32.543 41.33333
## 12 Strepera-versicolor-571957.mp3 00:00:33.462 32.00000
## 13 Strepera-versicolor-571957.mp3 00:00:33.511 36.00000
## 14 Strepera-versicolor-571957.mp3 00:00:50.312 60.00000
## 15 Strepera-versicolor-571957.mp3 00:00:50.511 117.33333
## 16 Strepera-versicolor-571957.mp3 00:00:50.675 46.66667
## 17 Strepera-versicolor-571957.mp3 00:00:57.259 48.00000
## 18 Strepera-versicolor-571957.mp3 00:00:57.767 85.33333
## 19 Strepera-versicolor-571957.mp3 00:00:58.082 28.00000
##
## attr(,"class")
## [1] "blob_detection"
The function returns an HTML file that shows all the audio events that fall within the parameters of the function, and a data frame containing information about the events extracted. This may require trial and error with the arguments to find the events that need to be extracted. As seen above more than 5 events are found. This is because there are changes in the frequency of the calls as well as calls that cannot be seen in the waveform(low in volume). If you play around with some of the parameters for the detection function, you can see the change in the number of blobs identified. Try comparing the blobs with the actual recording and the waveform, what is it picking up?
Threshold detection is another form of detection that identifies events based on the SNR, the signal-to-noise ratio. If the amplitude is above the SNR, an audio event is recorded. Mostly, it consists of the same arguments as blob detection with a few additions.
# Threshold_detection
?threshold_detection
TD <- threshold_detection(
STREPV,
threshold = 12, #sets the sensitivity of the event, compared to the SNR(in dB, Lower the threshold for a noisier environment(low SNR); a too low or too high sensitivity may collect too many signals or not enough)
time_exp = 1,
min_dur = 20,
max_dur = 2000,
min_TBE = 1,
max_TBE = 25000,
LPF = 6000,
HPF = 1000,
start_thr = 60, #start threshold for an audio event, 20dB is recommended for birds
end_thr = 80, #end threshold for an audio event, 30dB is recommended for birds
SNR_thr = 12, #SNR threshold (dB) at which the extraction of the audio event stops, 8dB is recommended for birds.
angle_thr = 170, #Angle threshold at which the audio event extraction stops.
duration_thr = 2000, #Maximum duration threshold in milliseconds (ms) after which the monitoring of the background noise is resumed.
NWS = 200, #Length of time used for background noise estimation in the recording (ms)
spectro_dir = file.path(getwd(), "Spectros"),
time_scale = 1, # Time resolution of the spectrogram in milliseconds (ms) per pixel (px).
ticks = c(1000, 20000, 1000), #Sets y-axis min, max, and interval width
acoustic_feat = FALSE
)
TD
## $data
## $data$event_data
## filename starting_time duration
## 1 Strepera-versicolor-571957.mp3 00:00:03.025 1228.667
## 2 Strepera-versicolor-571957.mp3 00:00:19.975 1378.667
## 3 Strepera-versicolor-571957.mp3 00:00:32.817 1274.667
## 4 Strepera-versicolor-571957.mp3 00:00:49.631 1320.667
## 5 Strepera-versicolor-571957.mp3 00:00:57.153 1386.000
##
## $data$event_start
## [1] 145249 958817 1575265 2382305 2743361
##
## $data$event_end
## [1] 204224 1024992 1636448 2445696 2809888
##
##
## attr(,"class")
## [1] "threshold_detection"
You can see that threshold_detection has identified fewer events compared to blob detection. Notice how the extracted data compares between the data frames, because of how each form of detection filters audio, one type may be better to use than the other.
Third, the auto_detec function automatically detects the start and end of vocalizations in sound files based on amplitude, duration, and frequency range attributes.
# Let's look at the parameters of the function first
?auto_detec
detection <- auto_detec(threshold = 12,
ssmooth = 300, # 'smoothens' spectrogram
bp = c(1, 6), # lower and upper limits of a frequency bandpass filter (in kHz)
wl = 300, # Window used internally for bandpass filtering
path = getwd())
The packages ‘ohun’ and ‘viridis’ allow us to visualize the autodetection points determined above plotted on a spectrogram, this time with labeled axis!
label_spectro(wave = STREPV, #Wave object
detection = detection[detection$sound.files == "Strepera-versicolor-571957.mp3", ], #Data frame or 'selection.table' with the detection
hop.size = 10, #Time window duration (in ms).
ovlp = 50, #% overlap between two consecutive windows
flim = c(1, 10), #Frequency limits
fastdisp=TRUE) #Faster display of model
Try this yourself first! When done, check out what we came up with by hitting the “Show” button.
wav <- read_audio(file.path("~/Desktop", "your_file_name.wav"))
temp_slice <- c(from = 10, to = 20) # in seconds
ticks <- c(from = 1000, to = 20000, by = 1000)
spectro(wav, tlim = temp_slice, FFT_size = 512, ticks_y = ticks) sets the y axis labeling, in Hz, so its 1 kHz to 100 kHz, in 1 kHz interval
detection <- auto_detec(threshold = 5, ssmooth = 300, bp = c(2, 9), wl = 300, path = getwd())
label_spectro(wave = wav, detection = detection[detection$sound.files == "mgr.wav", ], hop.size = 10, ovlp = 50, flim = c(1, 10), fastdisp=TRUE)