Reference documentation

Note

This reference documentation is auto-generated from the doc strings in the module. For a tutorial-like overview of the functionality of slab, please see the previous sections.

Sounds

Inherits from slab.Signal.

class slab.Sound(data, samplerate=None)

Class for working with sounds, including loading/saving, manipulating and playing. Inherits from the base class slab.Signal. Instances of Sound can be created by loading a file, by passing an array of values and a samplerate, or by using one of the sound-generating methods of the class (all of the @staticmethods).

Parameters:
  • data (str | pathlib.Path | numpy.ndarray | slab.Signal | list) – Given a string or Path pointing to a .wav file, the data and samplerate will be loaded from the file. Given an array, an instance of a Signal, or a list, the data will be passed to the super class (see documentation of slab.Signal).

  • samplerate (int | float) – must only be defined when creating a Sound from an array.

.data

the data-array of the Sound object which has the shape n_samples x n_channels.

.n_channels

the number of channels in data.

.n_samples

the number of samples in data. Equals duration * samplerate.

.duration

the duration of the sound in seconds. Equals n_samples / samplerate.

Examples:

import slab, numpy
# generate a Sound object from an array of random floats:
sig = slab.Sound(data=numpy.random.randn(10000), samplerate=41000)
# generate a Sound object using one of the class's sound-generating methods, like `tone`:
sig = slab.Sound.tone()  # generate a tone
sig.level = 80  # set the level to 80 dB
sig = sig.ramp(duration=0.05)  # add a 50 millisecond ramp
sig.spectrum(log_power=True)  # plot the spectrum
sig.waveform()  # plot the time courses
property level

Can be used to get or set the rms level of a sound in dB. For single-channel sounds, the level is a single dB value. For multichannel sounds, setting the level with a single dB value sets all channels to the same level, while a list/tuple/array of values sets each channel individually. Use slab.calibrate() to make the computed level reflect output intensity.

static read(filename)

Load a wav file and create an instance of Sound.

Parameters:

filename (str) – the full path to a (.wav) file.

Returns:

the sound generated with the data and samplerate from the file.

Return type:

(slab.Sound)

static tone(frequency=500, duration=1.0, phase=0, samplerate=None, level=None, n_channels=1)

Generate a pure tone.

Parameters:
  • frequency (int | float | list) – frequency of the tone. Given a list of length n_channels, one element of the list is used as frequency for each channel.

  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • phase (int | float | list) – phase of the sinusoid, defaults to 0. Given a list of length n_channels, one element of the list is used as phase for each channel.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one.

Returns:

the tone generated from the parameters.

Return type:

(slab.Sound)

static dynamic_tone(frequencies=None, times=None, phase=0, samplerate=None, level=None, n_channels=1)

Generate a sinusoid with time-varying frequency from a list of frequencies and times. If times is None, a sound with len(frequencies) samples is generated. Be careful when giving a list of times: integers are treated as samples and floats as seconds for each list element independently. So times=[0, 0.5, 1] is probably a mistake, because the last entry is treated as sample number 1 instead of the intended 1 second time point. This will raise an error because the time values are not monotonically ascending. Correct would be [0., .5, 1.] or [0, 4000, 8000] for the default samplerate.

Parameters:
  • frequencies (list | numpy.ndarray) – frequencies of the tone. Intermediate values are linearly interpolated.

  • times (list | numpy.ndarray | None) – list of time points corresponding to the given frequencies. Must have same length as frequencies if given. If None, frequencies are assumed to correspond to consecutive samples. Integer values specify times in samples, and floats specify times in seconds.

  • phase (int | float) – initial phase of the sinusoid, defaults to 0

  • samplerate (None | int) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float) – the sound's level in dB. If None, use the default level.

  • n_channels (int) – number of channels, defaults to one.

Returns:

the tone generated from the parameters.

Return type:

(slab.Sound)

static harmoniccomplex(f0=500, duration=1.0, amplitude=0, phase=0, samplerate=None, level=None, n_channels=1)

Generate a harmonic complex tone composed of pure tones at integer multiples of the fundamental frequency.

Parameters:
  • f0 (int) – the fundamental frequency. Harmonics will be generated at integer multiples of this value.

  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • amplitude (int | float | list) – Amplitude in dB, relative to the full scale (i.e. 0 corresponds to maximum intensity, -30 would be 30 dB softer). Given a single int or float, all harmonics are set to the same amplitude and harmonics up to 1/5th of the samplerate are generated. Given a list of values, the number of harmonics generated is equal to the length of the list with each element of the list setting the amplitude for one harmonic.

  • phase (int | float | string | list) – phase of the sinusoid, defaults to 0. Given a list (with the same length as the one given for the amplitude argument) every element will be used as the phase of one harmonic. Given a string, its value must be ‘schroeder’, in which case the harmonics are in Schroeder phase, producing a complex tone with minimal peak-to-peak amplitudes (Schroeder 1970).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one.

Returns:

the harmonic complex generated from the parameters.

Return type:

(slab.Sound)

Examples:

sig = slab.Sound.harmoniccomplex(f0=200, amplitude=[0,-10,-20,-30])  # generate the harmonic complex tone
_ = sig.spectrum()  # plot its spectrum
static whitenoise(duration=1.0, samplerate=None, level=None, n_channels=1)

Generate white noise.

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually.

  • n_channels (int) – number of channels, defaults to one. If channels > 1, several channels of uncorrelated noise are generated.

Returns:

the white noise generated from the parameters.

Return type:

(slab.Sound)

Examples:

noise = slab.Sound.whitenoise(1.0, n_channels=2)  # generate a 1 second white noise with two channels
static powerlawnoise(duration=1.0, alpha=1, samplerate=None, level=None, n_channels=1)

Generate a power-law noise where the spectral density per unit of bandwidth scales as 1/(f**alpha).

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • alpha (int) – power law exponent.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one. If channels > 1, several channels of uncorrelated noise are generated.

Returns:

the power law noise generated from the parameters.

Return type:

(slab.Sound)

Examples:

# Generate and plot power law noise with three different exponents
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
for alpha in [1, 2, 3]:
    noise = slab.Sound.powerlawnoise(0.2, alpha, samplerate=8000)
    noise.spectrum(axis=ax, show=False)
plt.show()
static pinknoise(duration=1.0, samplerate=None, level=None, n_channels=1)

Generate pink noise (power law noise with exponent alpha==1). This is simply a wrapper for calling the powerlawnoise method.

Parameters:

see slab.Sound.powerlawnoise.

Returns:

power law noise generated from the parameters with exponent alpha==1.

Return type:

(slab.Sound)

static irn(frequency=100, gain=1, n_iter=4, duration=1.0, samplerate=None, level=None, n_channels=1)

Generate iterated ripple noise (IRN). IRN is a broadband noise with temporal regularities, which can give rise to a perceptible pitch. Since the perceptual pitch to noise ratio of these stimuli can be altered without substantially altering their spectral content, they have been useful in exploring the role of temporal processing in pitch perception [Yost 1996, JASA].

Parameters:
  • frequency (int | float) – the frequency of the signal's perceived pitch in Hz.

  • gain (int | float) – multiplicative factor of the repeated additions. Smaller values reduce the temporal regularities in the resulting IRN.

  • n_iter (int) – number of iterations of additions. Higher values increase pitch saliency.

  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one. If channels > 1, several channels with copies of the noise are generated.

Returns:

ripple noise that has a perceived pitch at the given frequency.

Return type:

(slab.Sound)

static click(duration=0.0001, samplerate=None, level=None, n_channels=1)

Generate a click (a sequence of ones).

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one.

Returns:

click generated from the given parameters.

Return type:

(slab.Sound)

static clicktrain(duration=1.0, frequency=500, clickduration=0.0001, level=None, samplerate=None)

Generate a series of clicks (by calling the click method) with a perceived pitch at the given frequency.

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • frequency (float | int) – the frequency of the signal's perceived pitch in Hz.

  • clickduration (float | int) – duration of a single click in seconds (given a float) or in samples (given an int). The number of clicks in the train is given by duration / clickduration.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

Returns:

click train generated from the given parameters.

Return type:

(slab.Sound)

static chirp(duration=1.0, from_frequency=100, to_frequency=None, samplerate=None, level=None, kind='quadratic')

Returns a pure tone with increasing or decreasing frequency, generated using scipy.signal.chirp.

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • from_frequency (float | int) – the frequency of the tone in Hz at the start of the sound.

  • to_frequency (float | int | None) – the frequency of the tone in Hz at the end of the sound. If None, the Nyquist frequency (samplerate / 2) will be used.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • kind (str) – determines the type of frequency sweep (see scipy.signal.chirp() for options).

Returns:

chirp generated from the given parameters.

Return type:

(slab.Sound)

static silence(duration=1.0, samplerate=None, n_channels=1)

Generate silence (all samples equal zero).

Parameters:
  • duration (float | int) – duration of the sound in seconds (float) or samples (int).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • n_channels (int) – number of channels, defaults to one.

Returns:

silence generated from the given parameters.

Return type:

(slab.Sound)

static vowel(vowel='a', gender=None, glottal_pulse_time=12, formant_multiplier=1, duration=1.0, samplerate=None, level=None, n_channels=1)

Generate a sound resembling the human vocalization of a vowel.

Parameters:
  • vowel (str | None) – kind of vowel to generate, can be: ‘a’, ‘e’, ‘i’, ‘o’, ‘u’, ‘ae’, ‘oe’, or ‘ue’. For these vowels, the function has pre-set formant frequencies. If None, a vowel will be generated from random formant frequencies in the range of the existing vowel formants.

  • gender (str | None) – Setting the gender (‘male’, ‘female’) is a shortcut for setting the arguments glottal_pulse_time and formant_multiplier.

  • glottal_pulse_time (int | float) – the distance between glottal pulses in milliseconds (determines the voice pitch).

  • formant_multiplier (int | float) – multiplier for the pre-set formant frequencies (scales the perceived vocal tract length).

  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

  • n_channels (int) – number of channels, defaults to one.

Returns:

vowel generated from the given parameters.

Return type:

(slab.Sound)

static multitone_masker(duration=1.0, low_cutoff=125, high_cutoff=4000, bandwidth=0.3333333333333333, samplerate=None, level=None)

Generate noise made of ERB-spaced random-phase pure tones. This noise does not have random amplitude variations and is useful for testing cochlear implant (CI) patients [Oxenham 2014, Trends Hear].

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • low_cutoff (int | float) – the lower frequency limit of the noise in Hz

  • high_cutoff (int | float) – the upper frequency limit of the noise in Hz

  • bandwidth (float) – the signal's bandwidth in octaves.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

Returns:

multi tone masker noise, generated from the given parameters.

Return type:

(slab.Sound)

Examples:

sig = Sound.multitone_masker()
sig = sig.ramp()
sig.spectrum()
static equally_masking_noise(duration=1.0, low_cutoff=125, high_cutoff=4000, samplerate=None, level=None)

Generate an equally-masking noise (ERB noise) within a given frequency band.

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int).

  • low_cutoff (int | float) – the lower frequency limit of the noise in Hz

  • high_cutoff (int | float) – the upper frequency limit of the noise in Hz

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • level (None | int | float | list) – the sound's level in dB. For a multichannel sound, a list of values can be provided to set the level of each channel individually. If None, the level is set to the default.

Returns:

equally masking noise generated from the given parameters.

Return type:

(slab.Sound)

Examples:

sig = Sound.equally_masking_noise()
sig.spectrum()
static sequence(*sounds)

Join sounds into a new sound object.

Parameters:

*sounds (slab.Sound) – two or more sounds to combine.

Returns:

the input sounds combined in a single object.

Return type:

(slab.Sound)

write(filename, normalise=True, fmt='WAV')

Save the sound as a WAV.

Parameters:
  • filename (str | pathlib.Path) – path the file is written to.

  • normalise (bool) – if True, the maximal amplitude of the sound is normalised to 1.

  • fmt (str) – data format to write. See soundfile.available_formats().

ramp(when='both', duration=0.01, envelope=None)

Adds an on and/or off ramp to the sound.

Parameters:
  • when (str) – can take values ‘onset’, ‘offset’ or ‘both’

  • duration (float | int) – duration of the ramp in seconds (given a float) or in samples (given an int).

  • envelope (callable) – function to compute the samples of the ramp, defaults to a sinusoid

Returns:

copy of the sound with the added ramp(s)

Return type:

(slab.Sound)

repeat(n)

Repeat the sound n times.

Parameters:

n (int) – the number of repetitions.

Returns:

copy of the sound repeated n times.

Return type:

(slab.Sound)

static crossfade(*sounds, overlap=0.01)

Crossfade several sounds.

Parameters:
  • *sounds (instances of slab.Sound) – sounds to crossfade

  • overlap (float | int) – duration of the overlap between the cross-faded sounds in seconds (given a float) or in samples (given an int).

Returns:

A single sound that contains all input sounds cross-faded. The duration will be the sum of the input sounds’ durations minus the overlaps.

Return type:

(slab.Sound)

Examples:

noise = Sound.whitenoise(duration=1.0)
vowel = Sound.vowel()
noise2vowel = Sound.crossfade(vowel, noise, vowel, overlap=0.4)
noise2vowel.play()
pulse(frequency=4, duty=0.75, gate_time=0.005)

Apply a pulsed envelope to the sound.

Parameters:
  • frequency (float) – the frequency of pulses in Hz.

  • duty (float) – duty cycle, i.e. the ratio between the pulse duration and the pulse period; values must be between 0 (always low) and 1 (always high). When using values close to 0, gate_time may need to be decreased to avoid on and off ramps being longer than the pulse.

  • gate_time (float) – rise/fall time of each pulse in seconds

Returns:

pulsed copy of the instance.

Return type:

slab.Sound

am(frequency=10, depth=1, phase=0)

Apply an amplitude modulation to the sound by multiplication with a sine function.

Parameters:
  • frequency (int) – frequency of the modulating sine function in Hz

  • depth (int | float) – modulation depth/index of the modulating sine function

  • phase (int | float) – initial phase of the modulating sine function

Returns:

amplitude modulated copy of the instance.

Return type:

slab.Sound

filter(frequency=100, kind='hp')

Convenience wrapper around the Filter class for standard low-, high-, band-pass and band-stop filters.

Parameters:
  • frequency (int | tuple) – cutoff frequency in Hz. Integer for low- and highpass filters, tuple with lower and upper cutoff for bandpass and bandstop.

  • kind (str) – type of filter, can be “lp” (lowpass), “hp” (highpass), “bp” (bandpass) or “bs” (bandstop)

Returns:

filtered copy of the instance.

Return type:

slab.Sound

aweight()

Returns A-weighted version of the sound. A-weighting is applied to instrument-recorded sounds to account for the relative loudness of different frequencies perceived by the human ear. See: https://en.wikipedia.org/wiki/A-weighting.

static record(duration=1.0, samplerate=None)

Record from inbuilt microphone. Uses SoundCard module if installed [recommended], otherwise uses SoX.

Parameters:
  • duration (float | int) – duration of the sound in seconds (given a float) or in samples (given an int). Note that duration has to be in seconds when using SoX

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate. Note that most sound cards can only record at 44100 Hz samplerate.

Returns:

The recorded sound.

Return type:

(slab.Sound)

play()

Plays the sound through the default device. If the soundcard module is installed it is used to play the sound. Otherwise the sound is saved as .wav to a temporary directory and is played via the play_file method.

static play_file(filename)

Play a .wav file using the OS-specific mechanism for Windows, Linux or Mac.

Parameters:

filename (str | pathlib.Path) – full path to the .wav file to be played.

waveform(start=0, end=None, show=True, axis=None)

Plot the waveform of the sound.

Parameters:
  • start (int | float) – start of the plot in seconds (float) or samples (int), defaults to 0

  • end (int | float | None) – the end of the plot in seconds (float) or samples (int), defaults to None.

  • show (bool) – whether to show the plot right after drawing.

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

spectrogram(window_dur=0.005, dyn_range=120, upper_frequency=None, other=None, show=True, axis=None, **kwargs)

Plot a spectrogram of the sound or return the computed values.

Parameters:
  • window_dur – duration in samples (int) or seconds (float) of time window for short-term FFT, default 0.005 s.

  • dyn_range – dynamic range in dB to plot, defaults to 120.

  • upper_frequency (int | float | None) – The upper frequency limit of the plot. If None use the maximum.

  • other (slab.Sound) – if a sound object is given, subtract the waveform and plot the difference spectrogram.

  • show (bool) – whether to show the plot right after drawing. Note that if show is False and no axis is passed, no plot will be created.

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

  • **kwargs – keyword arguments for computing the spectrogram. See documentation for scipy.signal.spectrogram.

Returns:

If show == True or an axis was passed, a plot is drawn and nothing is returned. Else,

a tuple is returned which contains frequencies, time bins and a 2D array of powers.

Return type:

(None | tuple)

cochleagram(bandwidth=0.2, n_bands=None, show=True, axis=None)

Computes a cochleagram of the sound by filtering with a bank of cosine-shaped filters and applying a cube-root compression to the resulting envelopes. The number of bands is either calculated based on the desired bandwidth or specified by the n_bands argument.

Parameters:
  • bandwidth (float) – filter bandwidth in octaves.

  • n_bands (int | None) – number of bands in the cochleagram. If this is not None, the bandwidth argument is ignored.

  • show (bool) – whether to show the plot right after drawing. Note that if show is False and no axis is passed, no plot will be created

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

Returns:

If show == True or an axis was passed, a plot is drawn and nothing is returned.

Else, an array with the envelope is returned.

Return type:

(None | numpy.ndarray)

spectrum(low_cutoff=16, high_cutoff=None, log_power=True, axis=None, show=True)

Compute the spectrum (power spectral density, PSD) of the sound. The PSD is the squared amplitude (power) in each frequency bin returned by a Fourier transform of the signal, normalized by the width of the bins. This is typically more useful than the Fourier amplitudes because the values do not depend on the length of the signal.

Parameters:
  • low_cutoff (int | float) – the lower frequency limit of the plot in Hz.

  • high_cutoff (int | float | None) – the upper frequency limit of the plot in Hz. If both limits are left unspecified, the full spectrum is shown; otherwise only the range between low_cutoff and high_cutoff.

  • log_power (bool) – whether to compute the log of the power.

  • show (bool) – whether to show the plot right after drawing. Note that if show is False and no axis is passed, no plot will be created.

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

Returns:

If show=False, returns Z, freqs, where Z is a 1D array of powers

and freqs are the corresponding frequencies.

spectral_feature(feature='centroid', mean='rms', frame_duration=None, rolloff=0.85)

Computes one of several features of the spectrogram of a sound for each channel.

Parameters:
  • feature (str) – the kind of feature to compute. Options are: “centroid”, the center of mass of the short-term spectrum; “fwhm”, the width of a Gaussian of the same variance as the spectrum around the centroid; “flux”, a measure of how quickly the power spectrum of a sound is changing; “flatness”, which measures how tone-like a sound is, as opposed to noise-like; “rolloff”, the frequency at which the spectrum rolls off.

  • mean (str | None) – method of computing the mean of the feature value over all samples. Can be “rms”, “average” or None. If None, a new sound with the feature value at each sample is generated.

  • frame_duration (float) – duration of frames in samples (int) or seconds (float) in which to compute features, defaults to 0.05 s

  • rolloff (float) – only used if feature is “rolloff”, fraction of spectral power below the rolloff frequency

Returns:

Mean feature for each channel in a list or a new Signal of feature values.

Return type:

(list | slab.Signal)

vocode(bandwidth=0.3333333333333333)

Returns a noise vocoded version of the sound by computing the envelope in different frequency subbands, filling these envelopes with noise, and collapsing the subbands into one sound. This removes most spectral information but retains temporal information in a speech sound.

Parameters:

bandwidth (float) – width of the subbands in octaves.

Returns:

a vocoded copy of the sound.

Return type:

(slab.Sound)

crest_factor()

The crest factor is the ratio of the peak amplitude and the RMS value of a waveform and indicates how extreme the peaks in a waveform are. Returns the crest factor in dB. Numerically identical to the peak-to-average power ratio.

Returns:

the crest factor or NaN if there are no peaks in the sound.

Return type:

(float | numpy.nan)

onset_slope()

Compute the centroid of a histogram of onset slopes as a measure of how many quick intensity increases the sound has. These onset-like features make the sound easier to localize via envelope ITD.

Returns:

the histogram’s centroid or 0 if there are no onsets in the sound.

Return type:

(float)

frames(duration=1024)

A generator that steps through the sound in overlapping, windowed frames. Get the frame center times by calling Sound’s frametimes method.

Parameters:
  • duration (int | float) – half of the length of the returned frames in samples (int) or seconds (float); must be larger than 7 samples.

Returns:

the generator object that yields frames which are of the same type as the object.

Return type:

(generator)

Examples:

sound = slab.Sound.vowel()
windows = sound.frames()
for window in windows:  # get the flatness of each frame
    print(window.spectral_feature("flatness"))
frametimes(duration=1024)

Returns the time points at the frame centers constructed by the frames method.

Parameters:
  • duration (int | float) – half of the length of the returned frames in samples (int) or seconds (float); must be larger than 7 samples.

Returns:

the center of each frame in seconds.

Return type:

(numpy.ndarray)

slab.apply_to_path(path, method=None, kwargs=None, out_path=None)

Apply a function to all wav files in a given directory.

Parameters:
  • path (str | pathlib.Path) – path to the folder from which wav files are collected for processing.

  • method (callable) – function to be applied to each file.

  • kwargs (dict) – dictionary of keyword arguments and values passed to the function.

  • out_path (str | pathlib.Path) – if supplied, sounds are saved with their original file name in this directory.

Examples:

slab.apply_to_path('.', slab.Sound.spectral_feature, {'feature':'fwhm'})
slab.apply_to_path('.', slab.Sound.ramp, out_path='./modified')
slab.apply_to_path('.', slab.Sound.ramp, kwargs={'duration':0.3}, out_path='./test')

Signal

slab.Sound inherits from Signal, which provides basic methods to handle signals:

class slab.Signal(data, samplerate=None)

Base class for Signal data (from which the Sound and Filter class inherit). Provides arithmetic operations, slicing, and conversion between samples and times.

Parameters:
  • data (numpy.ndarray | slab.Signal | list) – samples of the sound. If it is an array, the first dimension should represent the number of samples and the second one the number of channels. If it’s an object, it must have a .data attribute containing an array. If it’s a list, the elements can be arrays or objects. The output will be a multi-channel sound with each channel corresponding to an element of the list.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

.duration

duration of the sound in seconds

.n_samples

duration of the sound in samples

.n_channels

number of channels in the sound

.times

list with the time point of each sample

Examples:

import slab, numpy
sig = slab.Signal(numpy.ones([10,2]),samplerate=10)  # create a sound
sig[:5] = 0  # set the first 5 samples to 0
sig[:,1]  # select the data from the second channel
sig2 = sig * 2  # multiply each sample by 2
sig_inv = -sig  # invert the phase
property n_samples

The number of samples in the Signal.

property duration

The length of the Signal in seconds.

property times

An array of times (in seconds) corresponding to each sample.

property n_channels

The number of channels in the Signal.

static in_samples(ctime, samplerate)

Converts time values in seconds to samples. This is used to enable input in either samples (integers) or seconds (floating point numbers) in the class.

Parameters:
  • ctime (int | float | list | numpy.ndarray) – the time value(s) to convert; integers are interpreted as samples and floats as seconds.

  • samplerate (int) – the samplerate used for the conversion.

Returns:

the time(s) in samples.

Return type:

(int | list | numpy.ndarray)

static set_default_samplerate(samplerate)

Sets the global default samplerate for Signal objects, by default 8000 Hz.

channel(n)

Get a single data channel.

Parameters:

n (int) – channel index

Returns:

a new instance of the class that contains the selected channel as data.

Return type:

(slab.Signal)

channels()

Returns generator that yields channel data as objects of the calling class.

resize(duration)

Change the duration by padding with zeros or cutting the data.

Parameters:

duration (float | int) – new duration of the sound in seconds (given a float) or in samples (given an int).

Returns:

a new instance of the same class with the specified duration.

Return type:

(slab.Signal)

trim(start=0, stop=None)

Trim the signal by returning the section between start and stop.

Parameters:
  • start (float | int) – start of the section in seconds (given a float) or in samples (given an int).

  • stop (float | int) – end of the section in seconds (given a float) or in samples (given an int).

Returns:

a new instance of the same class with the specified duration.

Return type:

(slab.Signal)

resample(samplerate)

Resample the sound.

Parameters:

samplerate (int) – the samplerate of the resampled sound.

Returns:

a new instance of the same class with the specified samplerate.

Return type:

(slab.Signal)

envelope(apply_envelope=None, times=None, kind='gain', cutoff=50)

Either apply an envelope to a sound or, if no apply_envelope was specified, compute the Hilbert envelope of the sound.

Parameters:
  • apply_envelope (None | numpy.ndarray) – data to multiply with the sound. The envelope is linearly interpolated to be the same length as the sound. If None, compute the sound’s Hilbert envelope

  • times (None | numpy.ndarray | list) – If None, a vector linearly spaced from 0 to the duration of the sound is used. If time points (in seconds, clamped to the sound duration) for the amplitude values in envelope are supplied, then the interpolation is piecewise linear between pairs of time and envelope values (both must have the same length).

  • kind (str) – determines the unit of the envelope value

  • cutoff (int) – Frequency of the lowpass filter that is applied to remove the temporal fine structure in Hz.

Returns:

Either a copy of the instance with the specified envelope applied, or the sound’s Hilbert envelope.

Return type:

(slab.Signal)
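The kind='gain' case can be sketched as piecewise-linear interpolation of the envelope points followed by multiplication (an illustrative reimplementation on a plain array, not slab's source):

```python
import numpy

# Interpolate the envelope to one gain value per sample, then multiply.
def apply_envelope(data, samplerate, envelope, times=None):
    duration = len(data) / samplerate
    if times is None:  # spread the envelope points evenly over the sound
        times = numpy.linspace(0, duration, len(envelope))
    sample_times = numpy.arange(len(data)) / samplerate
    gain = numpy.interp(sample_times, times, envelope)
    return data * gain

sound = numpy.ones(100)
ramped = apply_envelope(sound, samplerate=100, envelope=[0, 1, 0])
print(ramped[0], ramped[50])  # silent at the start, full gain in the middle
```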

delay(duration=1, channel=0, filter_length=2048)

Add a delay to one channel.

Parameters:
  • duration (int | float | array-like) – duration of the delay in seconds (given a float) or samples (given an int). Given an array with the same length as the sound, each sample is delayed by the corresponding number of seconds. This option is used by slab.Binaural.itd_ramp.

  • channel (int) – The index of the channel to add the delay to

  • filter_length (int) – Must be an even number. Determines the accuracy of the reconstruction when using fractional sample delays. Defaults to 2048, or the sound length for shorter signals.

Returns:

a copy of the instance with the specified delay.

Return type:

(slab.Signal)

plot_samples(show=True, axis=None)

Stem plot of the samples of the signal.

Parameters:
  • show (bool) – whether to show the plot right after drawing.

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

Binaural sounds

Binaural sounds inherit from Sound and provide methods for manipulating interaural parameters of two-channel sounds.

class slab.Binaural(data, samplerate=None)

Class for working with binaural sounds, including ITD and ILD manipulation. Binaural inherits all sound generation functions from the Sound class, but returns binaural signals. Recasting an object of class Sound, or a sound with 1 or 3+ channels, calls Sound.copychannel to return a binaural sound with two channels identical to the first channel of the original sound.

Parameters:
  • data (slab.Signal | numpy.ndarray | list | str) – see documentation of slab.Sound for details. The data must have either one or two channels. If it has one, that channel is duplicated.

  • samplerate (int) – samplerate in Hz, must only be specified when creating an instance from an array.

.left

the first data channel, containing the sound for the left ear.

.right

the second data channel, containing the sound for the right ear.

.data

the data-array of the Sound object which has the shape n_samples x n_channels.

.n_channels

the number of channels in data. Must be 2 for a binaural sound.

.n_samples

the number of samples in data. Equals duration * samplerate.

.duration

the duration of the sound in seconds. Equals n_samples / samplerate.

property left

The left channel for a stereo sound.

property right

The right channel for a stereo sound.

itd(duration=None, max_lag=0.001)

Either estimate the interaural time difference of the sound or generate a new sound with the specified interaural time difference. The resolution for computing the ITD is 1/samplerate seconds. A negative ITD value means that the right channel is delayed, meaning the sound source is to the left.

Parameters:
  • duration (None| int | float) – Given None, the instance’s ITD is computed. Given another value, a new sound with the desired interaural time difference in samples (given an integer) or seconds (given a float) is generated.

  • max_lag (float) – Maximum possible value for ITD estimation. Defaults to 1 millisecond which is barely outside the physiologically plausible range for humans. Is ignored if duration is specified.

Returns:

The interaural time difference in samples, or a copy of the instance with the specified interaural time difference.

Return type:

(int | slab.Binaural)

Examples:

sound = slab.Binaural.whitenoise()
lateral = sound.itd(duration=0.0005)  # generate a sound with 0.5 ms ITD
lateral.itd()  # estimate the ITD of the sound
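The estimation side of itd can be sketched as finding the peak of the cross-correlation between the two channels within +/- max_lag samples (an illustrative reimplementation; the sign convention and exact algorithm in slab may differ):

```python
import numpy

# Cross-correlate left and right and return the lag (in samples) of the
# correlation peak; the resolution is 1/samplerate seconds.
def estimate_itd(left, right, max_lag):
    lags = numpy.arange(-max_lag, max_lag + 1)
    xcorr = [numpy.sum(left * numpy.roll(right, -lag)) for lag in lags]
    return lags[numpy.argmax(xcorr)]

rng = numpy.random.default_rng(0)
left = rng.standard_normal(1000)
right = numpy.roll(left, 5)  # right channel delayed by 5 samples
print(estimate_itd(left, right, max_lag=44))  # recovers the 5-sample lag
```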
ild(dB=None)

Either estimate the interaural level difference of the sound or generate a new sound with the specified interaural level difference. Negative ILD value means that the left channel is louder than the right channel, meaning that the sound source is to the left. The level difference is achieved by adding half the ILD to one channel and subtracting half from the other channel, so that the mean intensity remains constant.

Parameters:

dB (None | int | float) – If None, estimate the sound’s ILD. Given a value, a new sound is generated with the desired interaural level difference in decibels.

Returns:

The sound’s interaural level difference, or a new sound with the specified ILD.

Return type:

(float | slab.Binaural)

Examples:

sig = slab.Binaural.whitenoise()
lateral_right = sig.ild(3) # attenuate left channel by 1.5 dB and amplify right channel by the same amount
lateral_left = sig.ild(-3) # vice-versa
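The constant-mean-intensity trick described above (add half the ILD to one channel, subtract half from the other) can be sketched directly (an illustrative computation on plain arrays, not slab's source):

```python
import numpy

# Apply an ILD in dB by scaling each channel by half the difference.
def apply_ild(left, right, dB):
    half_gain = 10 ** ((dB / 2) / 20)  # half the ILD as a linear gain factor
    return left / half_gain, right * half_gain

def rms(x):
    return numpy.sqrt(numpy.mean(x ** 2))

left, right = numpy.ones(100), numpy.ones(100)
l, r = apply_ild(left, right, 6)
level_diff = 20 * numpy.log10(rms(r) / rms(l))
print(round(level_diff, 1))  # 6.0 dB louder on the right
```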
itd_ramp(from_itd=-0.0006, to_itd=0.0006)

Generate a sound with a linearly increasing or decreasing interaural time difference. This is achieved by sinc interpolation of one channel with a dynamic delay. The resulting virtual sound source moves left or right.

Parameters:
  • from_itd (float) – interaural time difference in seconds at the start of the sound. Negative numbers correspond to sources to the left of the listener.

  • to_itd (float) – interaural time difference in seconds at the end of the sound.

Returns:

a copy of the instance with the desired ITD ramp.

Return type:

(slab.Binaural)

Examples:

sig = slab.Binaural.whitenoise()
moving = sig.itd_ramp(from_itd=-0.001, to_itd=0.001)
moving.play()
ild_ramp(from_ild=-50, to_ild=50)

Generate a sound with a linearly increasing or decreasing interaural level difference. The resulting virtual sound source moves to the left or right.

Parameters:
  • from_ild (int | float) – interaural level difference in decibels at the start of the sound. Negative numbers correspond to sources to the left of the listener.

  • to_ild (int | float) – interaural level difference in decibels at the end of the sound.

Returns:

a copy of the instance with the desired ILD ramp. Any previously existing level difference is removed.

Return type:

(slab.Binaural)

Examples:

sig = slab.Binaural.whitenoise()
moving = sig.ild_ramp(from_ild=-10, to_ild=10)
moving.play()
static azimuth_to_itd(azimuth, frequency=2000, head_radius=8.75)

Compute the ITD for a sound source at a given azimuth. For frequencies >= 2 kHz the Woodworth (1962) formula is used. For frequencies <= 500 Hz the low-frequency approximation mentioned in Aronson and Hartmann (2014) is used. For frequencies in between, we interpolate linearly between the two formulas.

Parameters:
  • azimuth (int | float) – The azimuth angle of the sound source, negative numbers refer to sources to the left.

  • frequency (int | float) – Frequency in Hz for which the ITD is estimated. Use the default for sounds with a broadband spectrum.

  • head_radius (int | float) – Radius of the head in centimeters. The bigger the head, the larger the ITD.

Returns:

The interaural time difference for a sound source at a given azimuth.

Return type:

(float)

Examples:

# compute the ITD for a sound source 90 degrees to the left for a large head
itd = slab.Binaural.azimuth_to_itd(-90, head_radius=10)
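The two formulas named in the description can be sketched as follows (a simplified reimplementation; the speed of sound of 343 m/s and the exact interpolation between 500 Hz and 2 kHz are assumptions, not slab's source):

```python
import math

def azimuth_to_itd(azimuth, frequency=2000, head_radius=8.75):
    theta = math.radians(azimuth)
    r = head_radius / 100  # centimeters -> meters
    c = 343.0              # assumed speed of sound in m/s
    itd_high = (r / c) * (theta + math.sin(theta))  # Woodworth (1962)
    itd_low = 3 * (r / c) * math.sin(theta)         # low-frequency limit
    if frequency >= 2000:
        return itd_high
    if frequency <= 500:
        return itd_low
    w = (frequency - 500) / 1500  # linear interpolation in between
    return w * itd_high + (1 - w) * itd_low

print(round(azimuth_to_itd(-90) * 1000, 2))  # about -0.66 ms for a source at the left
```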
static azimuth_to_ild(azimuth, frequency=2000, ils=None)

Get the interaural level difference corresponding to a sound source at a given azimuth and frequency.

Parameters:
  • azimuth (int | float) – The azimuth angle of the sound source, negative numbers refer to sources to the left.

  • frequency (int | float) – Frequency in Hz for which the ILD is estimated. Use the default for sounds with a broadband spectrum.

  • ils (dict | None) – interaural level spectrum from which the ILD is taken. If None, make_interaural_level_spectrum() is called. For repeated use, it is better to generate the ils once and keep it in a variable to avoid re-computing it.

Returns:

The interaural level difference for a sound source at a given azimuth in decibels.

Return type:

(float)

Examples:

ils = slab.Binaural.make_interaural_level_spectrum() # using default KEMAR HRTF
ild = slab.Binaural.azimuth_to_ild(-90, ils=ils) # ILD equivalent to 90 deg leftward source for KEMAR
at_azimuth(azimuth=0, ils=None)

Convenience function for adding ITD and ILD corresponding to the given azimuth to the sound source. Values are obtained from azimuth_to_itd and azimuth_to_ild. Frequency parameters for these functions are generated from the centroid frequency of the sound.

Parameters:
  • azimuth (int | float) – The azimuth angle of the sound source, negative numbers refer to sources to the left.

  • ils (dict | None) – interaural level spectrum from which the ILD is taken. If None, make_interaural_level_spectrum() is called. For repeated use, it is better to generate the ils once and keep it in a variable to avoid re-computing it.

Returns:

a sound with the appropriate ITD and ILD applied

Return type:

(slab.Binaural)

externalize(hrtf=None)

Convolve the sound with a smoothed HRTF to evoke the impression of an external sound source without adding directional information, see Kulkarni & Colburn (1998) for why that works.

Parameters:

hrtf (None | slab.HRTF) – The HRTF to use. If None use the one from the MIT KEMAR mannequin. The sound source at zero azimuth and elevation is used for convolution so it has to be present in the HRTF.

Returns:

externalized copy of the instance.

Return type:

(slab.Binaural)

static make_interaural_level_spectrum(hrtf=None)

Compute the frequency-band-specific interaural intensity differences for all sound source azimuths in a head-related transfer function. For every azimuth in the hrtf, the respective transfer function is applied to a sound. This sound is then divided into frequency sub-bands. The interaural level spectrum is the level difference between right and left for each of these sub-bands for each azimuth. When individual HRTFs are not available, the level spectrum of the KEMAR mannequin may be used (default). Note that the computation may take a few minutes. Save the level spectrum to avoid re-computation, for instance with pickle or numpy.save (see documentation on readthedocs for examples).

Parameters:

hrtf (None | slab.HRTF) – The head-related transfer function used to compute the level spectrum. If None, use the recordings from the KEMAR mannequin.

Returns:

A dictionary with keys samplerate, frequencies [n], azimuths [m], and level_diffs [n x m], where frequencies lists the centres of the sub-bands for which the level difference was computed, and azimuths lists the sound source azimuths in the hrtf. level_diffs is a matrix of the interaural level difference for each sub-band and azimuth.

Return type:

(dict)

Examples:

ils = slab.Binaural.make_interaural_level_spectrum()  # get the ils from the KEMAR recordings
ils['samplerate'] # the sampling rate
ils['frequencies'] # the sub-band frequencies
ils['azimuths']  # the sound source azimuth's for which the level difference was calculated
ils['level_diffs'][5, :]  # the level difference for each azimuth in the 5th sub-band
interaural_level_spectrum(azimuth, ils=None)

Apply an interaural level spectrum, corresponding to a sound source’s azimuth, to a binaural sound. The interaural level spectrum consists of frequency-specific interaural level differences which are computed from a head-related transfer function (see the make_interaural_level_spectrum() method). The binaural sound is divided into frequency sub-bands and the levels of each sub-band are set according to the respective level in the interaural level spectrum. Then, the sub-bands are summed up again into one binaural sound.

Parameters:
  • azimuth (int | float) – azimuth for which the interaural level spectrum is calculated.

  • ils (dict) – interaural level spectrum to apply. If None, make_interaural_level_spectrum() is called. For repeated use, it is better to generate the ils once and keep it in a variable to avoid re-computing it.

Returns:

A binaural sound with the interaural level spectrum corresponding to the given azimuth.

Return type:

(slab.Binaural)

Examples:

noise = slab.Binaural.pinknoise(kind='diotic')
ils = slab.Binaural.make_interaural_level_spectrum() # using default KEMAR HRTF
noise.interaural_level_spectrum(azimuth=-45, ils=ils).play()
drr(winlength=0.0025)

Calculate the direct-to-reverberant-ratio, DRR for the impulse input. This is calculated by DRR = 10 * log10( X(T0-C:T0+C)^2 / X(T0+C+1:end)^2 ), where X is the approximated integral of the impulse, T0 is the time of the direct impulse, and C=2.5ms (Zahorik, P., 2002: ‘Direct-to-reverberant energy ratio sensitivity’, The Journal of the Acoustical Society of America, 112, 2110-2117)

Parameters:
  • winlength (int | float) – specifies the length of the direct sound window. This window is used to calculate the energy of the impulse sound, starting from the position of the peak amplitude of the impulse.

Returns:

Direct-to-reverberation value of the input impulse measured in dB

Return type:

(float)
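The DRR formula quoted above can be sketched directly (an illustrative reimplementation on a plain array; T0 is located at the peak amplitude and C is derived from winlength):

```python
import numpy

# Energy in a window around the direct-sound peak, divided by the
# energy of the remaining reverberant tail, in dB.
def drr(impulse, samplerate, winlength=0.0025):
    c = int(winlength * samplerate)
    t0 = int(numpy.argmax(numpy.abs(impulse)))  # time of the direct impulse
    direct = numpy.sum(impulse[max(t0 - c, 0):t0 + c] ** 2)
    reverb = numpy.sum(impulse[t0 + c + 1:] ** 2)
    return 10 * numpy.log10(direct / reverb)

# synthetic impulse response: a strong direct spike plus a weak tail
samplerate = 8000
impulse = numpy.zeros(samplerate)
impulse[100] = 1.0     # direct sound
impulse[200:] = 0.001  # reverberant tail
print(round(drr(impulse, samplerate), 1))  # clearly positive: direct sound dominates
```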

static whitenoise(kind='diotic', **kwargs)

Generate binaural white noise. kind='diotic' produces the same noise samples in both channels, kind='dichotic' produces uncorrelated noise. The rest is identical to slab.Sound.whitenoise.

static pinknoise(kind='diotic', **kwargs)

Generate binaural pink noise. kind='diotic' produces the same noise samples in both channels, kind='dichotic' produces uncorrelated noise. The rest is identical to slab.Sound.pinknoise.

static powerlawnoise(kind='diotic', **kwargs)

Generate binaural power law noise. kind='diotic' produces the same noise samples in both channels, kind='dichotic' produces uncorrelated noise. The rest is identical to slab.Sound.powerlawnoise.

static irn(kind='diotic', **kwargs)

Generate iterated ripple noise (IRN). kind='diotic' produces the same noise samples in both channels, kind='dichotic' produces uncorrelated noise. The rest is identical to slab.Sound.irn.

static tone(**kwargs)

Identical to slab.Sound.tone, but with two channels.

static dynamic_tone(**kwargs)

Identical to slab.Sound.dynamic_tone, but with two channels.

static harmoniccomplex(**kwargs)

Identical to slab.Sound.harmoniccomplex, but with two channels.

static click(**kwargs)

Identical to slab.Sound.click, but with two channels.

static clicktrain(**kwargs)

Identical to slab.Sound.clicktrain, but with two channels.

static chirp(**kwargs)

Identical to slab.Sound.chirp, but with two channels.

static silence(**kwargs)

Identical to slab.Sound.silence, but with two channels.

static vowel(**kwargs)

Identical to slab.Sound.vowel, but with two channels.

static multitone_masker(**kwargs)

Identical to slab.Sound.multitone_masker, but with two channels.

static equally_masking_noise(**kwargs)

Identical to slab.Sound.equally_masking_noise, but with two channels.

Psychoacoustic procedures

class slab.Trialsequence(conditions=2, n_reps=1, trials=None, kind=None, deviant_freq=None, label='')

Randomized, non-adaptive trial sequences.

Parameters:
  • conditions (list | int | str) – defines the different stimuli appearing in the sequence. If given a list, every element is one condition. The elements can be anything – strings, dictionaries, objects etc. Note that, if the elements are not JSON serializable, the sequence can only be saved as a pickle file. If conditions is an integer i, the list of conditions is given by range(i). A string is treated as the filename of a previously saved trial sequence object, which is then loaded.

  • n_reps (int) – number of repetitions for each condition. The number of trials is given by len(conditions)*n_reps.

  • trials (None | list | numpy.ndarray) – The trials in the order in which they appear in the sequence. Defaults to None, because trials are usually generated by the class based on the other parameters. However, it is possible to pass a list or one-dimensional array. In that case the parameters for generating the sequence are ignored.

  • kind (str) – The kind of randomization used to generate the trial sequence. Possible options are: non_repeating (randomization without direct repetition of a condition, default if n_conditions > 2), random_permutation (complete randomization, default if n_conditions <= 2) or infinite (a sequence that resets when reaching the end, to generate an infinite number of trials; the randomization method is random_permutation if n_conditions <= 2 and non_repeating otherwise).

  • deviant_freq (float) – frequency with which deviants (encoded as 0) appear in the sequence. The minimum number of trials between two deviants is 3 if the deviant frequency is below 10%, 2 if it is below 20% and 1 if it is below 30%. A deviant frequency greater than 30% is not supported.

  • label (str) – a text label for the sequence.
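The non_repeating randomization can be sketched as rejection sampling: shuffle the repeated condition indices until no condition directly follows itself (an illustrative approach; slab's generator may differ):

```python
import random

def non_repeating_sequence(n_conditions, n_reps, seed=None):
    rng = random.Random(seed)
    trials = list(range(1, n_conditions + 1)) * n_reps  # indices start at 1
    while True:
        rng.shuffle(trials)
        if all(a != b for a, b in zip(trials, trials[1:])):
            return trials

print(non_repeating_sequence(n_conditions=3, n_reps=5, seed=1))
```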

.trials

the order in which the conditions are repeated in the sequence. The elements are integers referring to indices in conditions, starting from 1. 0 represents a deviant (only present if deviant_freq > 0)

.n_trials

the total number of trials in the sequence

.conditions

list of the different unique elements in the sequence

.n_conditions

number of conditions, is equal to len(conditions) or len(conditions)+1 if there are deviants

.n_remaining

the number of trials remaining, i.e. that have not been called yet when iterating through the sequence

.this_n

the current trial’s index in the entire sequence; equals the number of trials completed so far

.this_trial

a dictionary giving the parameters of the current trial

.finished

boolean signaling if all trials have been called

.kind

randomization kind of sequence (random_permutation, non_repeating, infinite)

.data

list with the same length as the one in the trials attribute. On sequence generation, data is a list of empty lists. Then, one can use the add_response method to append to the list belonging to the current trial

add_response(*response)

Append response(s) to the list in the data attribute belonging to the current trial (see Trialsequence doc).

response

data to append to the list. Can be anything, but the save_json method won’t be available if the content of response is not JSON serializable (for example, if it is an object).

Type:

any

print_trial_info()

Convenience method for printing current trial information.

get_future_trial(n=1)

Returns the condition of a trial n iterations into the future or past, without advancing the trials.

Parameters:

n (int) – number of iterations into the future or past (negative numbers).

Returns:

element of the list stored in the conditions attribute belonging to the trial n

iterations into the past/future. Returns None if attempting to go beyond the first/last trial

Return type:

(any)

transitions()

Count the number of transitions between conditions.

Returns:

table of shape n_conditions x n_conditions where the rows represent the condition transitioning from and the columns represent the condition transitioning to. For example [0, 2] shows the number of transitions from condition 1 to condition 3. If the kind of the sequence is “non_repeating”, the diagonal is 0 because no condition transitions into itself.

Return type:

(numpy.ndarray)
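The transition table can be sketched with a simple pairwise count (an illustrative reimplementation for a sequence without deviants):

```python
import numpy

# Rows: condition transitioned from; columns: condition transitioned to.
def transitions(trials, n_conditions):
    table = numpy.zeros((n_conditions, n_conditions), dtype=int)
    for a, b in zip(trials, trials[1:]):
        table[a - 1, b - 1] += 1  # condition indices start at 1
    return table

print(transitions([1, 2, 1, 3, 1, 2], n_conditions=3))
```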

condition_probabilities()

Return the frequency with which each condition appears in the sequence.

Returns:

list of floats, where every element represents the frequency of one condition. The first element is the frequency of the first condition and so on.

Return type:

(list)

response_summary()

Generate a summary of the responses for each condition. The function counts how often a specific response was given to a condition for all conditions and each possible response (including None).

Returns:

indices of the outer list represent the conditions in the sequence. Each inner list contains the number of responses per response key, with the response keys sorted in ascending order, the last element always represents None. If the sequence is not finished yet, None is returned.

Return type:

(list of lists | None)

Examples:

import slab
import random
sequence = slab.Trialsequence(conditions=3, n_reps=10)  # a sequence with three conditions
# iterate through the list and generate a random response. The response can be either yes (1), no (0) or
# there can be no response at all (None)
for trial in sequence:
    response = random.choice([0, 1, None])
    sequence.add_response(response)
sequence.response_summary()
# Out: [[1, 1, 7], [2, 5, 3], [4, 4, 2]]
# The first sublist shows that the subject responded to the first condition once with no (0),
# once with yes (1) and did not give a response seven times, the second and third list show
# prevalence of the same response keys for conditions two and three.
plot(axis=None, show=True)

Plot the trial sequence as scatter plot.

Parameters:
  • axis (matplotlib.pyplot.Axes) – plot axis to draw on, if none a new plot is generated

  • show (bool) – show the plot immediately, defaults to True

load_json(file_name)

Read JSON file and deserialize the object into self.__dict__.

file_name

name of the file to read.

Type:

str | pathlib.Path

load_pickle(file_name)

Read pickle file and deserialize the object into self.__dict__.

file_name

name of the file to read.

Type:

str | pathlib.Path

present_afc_trial(target, distractors, key_codes=range(49, 58), isi=0.25, print_info=True)

Present the reference and distractor sounds in random order and acquire a response keypress. The subject has to identify at which position the reference was played. The result (True if response was correct or False if response was wrong) is stored in the sequence via the add_response method.

Parameters:
  • target (instance of slab.Sound) – sound that ought to be identified in the trial

  • distractors (instance or list of slab.Sound) – distractor sound(s)

  • key_codes (list of int) – ascii codes for the response keys (get code for button ‘1’: ord(‘1’) –> 49) pressing the second button in the list is equivalent to the response “the reference was the second sound played in this trial”. Defaults to the key codes for buttons ‘1’ to ‘9’

  • isi (int | float) – inter-stimulus interval, i.e. the pause between the end of one sound and the start of the next one.

  • print_info (bool) – If true, call the print_trial_info method afterwards

present_tone_trial(stimulus, correct_key_idx=0, key_codes=range(49, 58), print_info=True)

Present the reference and distractor sounds in random order and acquire a response keypress. The result (True if response was correct or False if response was wrong) is stored in the sequence via the add_response method.

Parameters:
  • stimulus (slab.Sound) – sound played in the trial.

  • correct_key_idx (int | list of int) – index of the key in key_codes that represents a correct response. Response is correct if response == key_codes[correct_key_idx]. Can be a list of ints if several keys are counted as correct response.

  • key_codes (list of int) – ascii codes for the response keys (get code for button ‘1’: ord(‘1’) –> 49).

  • print_info (bool) – If true, call the print_trial_info method afterwards.

save_json(file_name=None, clobber=False)

Save the object as JSON file. The object’s __dict__ is serialized and saved in standard JSON format, so that it can be easily reconstituted (see load_json method). Raises FileExistsError if the file exists, unless clobber is True. When file_name is None (default), the method returns the JSON string, in case you want to inspect it. Note that Numpy arrays are not serializable and are converted to Python int. This works because the Trialsequence and Staircase classes use arrays of indices. If your instances of these classes contain arrays of float, use save_pickle instead.

Parameters:
  • file_name (str | pathlib.Path) – name of the file to create. If None or ‘stdout’, return a JSON object.

  • clobber (bool) – overwrite existing file with the same name, defaults to False.

Returns:

True if writing was successful.

Return type:

(bool)

save_pickle(file_name, clobber=False)

Save the object as pickle file.

Parameters:
  • file_name (str | pathlib.Path) – name of the file to create.

  • clobber (bool) – overwrite existing file with the same name, defaults to False.

Returns:

True if writing was successful.

Return type:

(bool)

simulate_response(threshold=None, transition_width=2, intervals=1, hitrates=None)

Return a simulated response to the current condition index value by calculating the hitrate from a psychometric (logistic) function. This is only sensible if trials is numeric and an interval scale representing a continuous stimulus value.

Parameters:
  • threshold (None | int | float) – Midpoint of the psychometric function for adaptive testing. When the intensity of the current trial is equal to the threshold the hitrate is 50 percent.

  • transition_width (int | float) – range of stimulus intensities over which the hitrate increases from 0.25 to 0.75.

  • intervals (int) – use 1 (default) to indicate a yes/no trial, 2 or more to indicate an alternative forced choice trial. The number of choices determines the probability for a correct response by chance.

  • hitrates (None | list | numpy.ndarray) – list or numpy array of hitrates for the different conditions, to allow custom rates instead of simulation. If given, threshold and transition_width are not used. If a single value is given, this value is used.
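The logistic hitrate described above can be sketched as follows (an illustrative reimplementation; the slope is chosen so the hitrate rises from 0.25 to 0.75 over transition_width, and the exact scaling for multi-interval trials is an assumption):

```python
import math

def hitrate(intensity, threshold, transition_width=2, intervals=1):
    slope = 2 * math.log(3) / transition_width  # 0.25 -> 0.75 over the width
    p = 1 / (1 + math.exp(-slope * (intensity - threshold)))
    if intervals > 1:  # floor the curve at the chance level for n-AFC trials
        chance = 1 / intervals
        p = chance + (1 - chance) * p
    return p

print(hitrate(30, threshold=30))            # 0.5 at the threshold
print(round(hitrate(31, threshold=30), 2))  # 0.75 half a transition width above
```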

class slab.Staircase(start_val, n_reversals=None, step_sizes=1, step_up_factor=1, n_pretrials=0, n_up=1, n_down=2, step_type='lin', min_val=-inf, max_val=inf, label='')

Class to handle adaptive testing, which means smoothly selecting the next trial, reporting current values and so on. The sequence will terminate after a certain number of reversals have been exceeded.

Parameters:
  • start_val (int | float) – initial stimulus value for the staircase

  • n_reversals (int) – number of reversals needed to terminate the staircase

  • step_sizes (int | list) – Size of steps in the staircase. Given an integer, the step size is constant. Given a list, the step size will progress to the next entry at each reversal. If the list is exceeded before the sequence was finished, it will continue with the last entry of the list as constant step size.

  • step_up_factor – allows different sizes for up and down steps to implement a Kaernbach1991 weighted up-down method. step_sizes sets down steps, which are multiplied by step_up_factor to obtain up step sizes. The default is 1, i.e. same size for up and down steps.

  • n_pretrials (int) – number of trials presented at the initial stimulus value before the start of the staircase

  • n_up (int) – number of incorrect (or 0) responses before the staircase level increases. Is 1, regardless of the specified value, until the first reversal. Levitt (1971) gives the up-down values for different threshold points on the psychometric function: 1-1 (0.5), 1-2 (0.707), 1-3 (0.794), 1-4 (0.841), 1-5 (0.891).

  • n_down (int) – number of correct (or 1) responses before the staircase level decreases (see n_up).

  • step_type (str) – defines the change of stimulus intensity at each step of the staircase. Possible inputs are ‘lin’ (adds or subtracts a certain amount), ‘db’, and ‘log’ (prevents the intensity from reaching zero).

  • min_val (int or float) – smallest stimulus value permitted, or -Inf for staircase without lower limit

  • max_val (int or float) – largest stimulus value permitted, or Inf for staircase without upper limit

  • label (str) – text label for the sequence, defaults to an empty string
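The up-down stepping rule can be sketched for a single decision (an illustrative simplification; the real class tracks response counters across trials and switches step sizes at reversals):

```python
# Weighted up-down: step down after n_down consecutive correct (1)
# responses, step up after n_up consecutive incorrect (0) responses;
# up steps are the down step scaled by step_up_factor (Kaernbach, 1991).
def next_intensity(intensity, responses, step_size, step_up_factor=1,
                   n_up=1, n_down=2):
    if responses[-n_down:] == [1] * n_down:
        return intensity - step_size
    if responses[-n_up:] == [0] * n_up:
        return intensity + step_size * step_up_factor
    return intensity  # otherwise stay and wait for more responses

print(next_intensity(50, [1, 1], step_size=4))  # two correct -> down to 46
print(next_intensity(50, [1, 0], step_size=4))  # a miss -> up to 54
```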

.this_trial_n

number of completed trials

.intensities

presented stimulus values

.current_direction

‘up’ or ‘down’

.data

list of responses

.reversal_points

indices of reversal trials

.reversal_intensities

stimulus values at the reversals (used to compute threshold)

.finished

True/False: have we finished yet?

Examples:

stairs = Staircase(start_val=50, n_reversals=10, step_type='lin',
    step_sizes=[4,2], min_val=10, max_val=60, n_up=1, n_down=1)
print(stairs)
for trial in stairs:
    response = stairs.simulate_response(30)
    stairs.add_response(response)
print(f'reversals: {stairs.reversal_intensities}')
print(f'mean of final 6 reversals: {stairs.threshold()}')
add_response(result, intensity=None)

Add a True or 1 to indicate a correct/detected trial or False or 0 to indicate an incorrect/missed trial. This is essential to advance the staircase to a new intensity level. Supplying an intensity value indicates that you did not use the recommended intensity in your last trial and the staircase will replace its recorded value with the one supplied.

calculate_next_intensity()

Calculate the next stimulus intensity based on the current intensity, the counter of correct responses, and the current direction.

threshold(n=0)

Returns the average of the last n reversals.

Parameters:

n (int) – number of reversals to average over, if 0 use n_reversals - 1.

Returns:

the arithmetic (if step_type='lin') or geometric mean of the reversal_intensities.

print_trial_info()

Convenience method for printing current trial information.

save_csv(filename)

Write a csv text file with the stimulus values in the 1st line and the corresponding responses in the 2nd.

Parameters:

filename (str) – the name under which the csv file is saved.

Returns:

True if saving was successful, False if there are no trials to save.

Return type:

(bool)

plot(axis=None, show=True)

Plot the staircase. If called after each trial, one plot is created and updated.

Parameters:
  • axis (matplotlib.pyplot.Axes) – plot axis to draw on, if none a new plot is generated

  • show (bool) – whether to show the plot right after drawing.

static close_plot()

Closes a staircase plot (if not drawn into a specified axis) - used for plotting after each trial.

load_json(file_name)

Read JSON file and deserialize the object into self.__dict__.

file_name

name of the file to read.

Type:

str | pathlib.Path

load_pickle(file_name)

Read pickle file and deserialize the object into self.__dict__.

file_name

name of the file to read.

Type:

str | pathlib.Path

present_afc_trial(target, distractors, key_codes=range(49, 58), isi=0.25, print_info=True)

Present the reference and distractor sounds in random order and acquire a response keypress. The subject has to identify at which position the reference was played. The result (True if response was correct or False if response was wrong) is stored in the sequence via the add_response method.

Parameters:
  • target (instance of slab.Sound) – sound that ought to be identified in the trial

  • distractors (instance or list of slab.Sound) – distractor sound(s)

  • key_codes (list of int) – ascii codes for the response keys (get code for button ‘1’: ord(‘1’) –> 49) pressing the second button in the list is equivalent to the response “the reference was the second sound played in this trial”. Defaults to the key codes for buttons ‘1’ to ‘9’

  • isi (int | float) – inter-stimulus interval, i.e. the pause between the end of one sound and the start of the next one.

  • print_info (bool) – If True, call the print_trial_info method afterwards.

present_tone_trial(stimulus, correct_key_idx=0, key_codes=range(49, 58), print_info=True)

Present the reference and distractor sounds in random order and acquire a response keypress. The result (True if response was correct or False if response was wrong) is stored in the sequence via the add_response method.

Parameters:
  • stimulus (slab.Sound) – sound played in the trial.

  • correct_key_idx (int | list of int) – index of the key in key_codes that represents a correct response. Response is correct if response == key_codes[correct_key_idx]. Can be a list of ints if several keys are counted as correct response.

  • key_codes (list of int) – ascii codes for the response keys (get code for button ‘1’: ord(‘1’) –> 49).

  • print_info (bool) – If True, call the print_trial_info method afterwards.

save_json(file_name=None, clobber=False)

Save the object as JSON file. The object’s __dict__ is serialized and saved in standard JSON format, so that it can be easily reconstituted (see the load_json method). Raises FileExistsError if the file exists, unless clobber is True. When file_name is None (default), the method returns the JSON string, in case you want to inspect it. Note that Numpy arrays are not serializable and are converted to Python int. This works because the Trialsequence and Staircase classes use arrays of indices. If your instances of these classes contain arrays of float, use save_pickle instead.

Parameters:
  • file_name (str | pathlib.Path) – name of the file to create. If None or ‘stdout’, return a JSON object.

  • clobber (bool) – overwrite existing file with the same name, defaults to False.

Returns:

True if writing was successful.

Return type:

(bool)

save_pickle(file_name, clobber=False)

Save the object as pickle file.

Parameters:
  • file_name (str | pathlib.Path) – name of the file to create.

  • clobber (bool) – overwrite existing file with the same name, defaults to False.

Returns:

True if writing was successful.

Return type:

(bool)

simulate_response(threshold=None, transition_width=2, intervals=1, hitrates=None)

Return a simulated response to the current condition index value by calculating the hitrate from a psychometric (logistic) function. This is only sensible if the trials are numeric and form an interval scale representing a continuous stimulus value.

Parameters:
  • threshold (None | int | float) – Midpoint of the psychometric function for adaptive testing. When the intensity of the current trial is equal to the threshold the hitrate is 50 percent.

  • transition_width (int | float) – range of stimulus intensities over which the hitrate increases from 0.25 to 0.75.

  • intervals (int) – use 1 (default) to indicate a yes/no trial, 2 or more to indicate an alternative forced choice trial. The number of choices determines the probability for a correct response by chance.

  • hitrates (None | list | numpy.ndarray) – list or numpy array of hitrates for the different conditions, to allow custom rates instead of simulation. If given, threshold and transition_width are not used. If a single value is given, this value is used.

class slab.Precomputed(sounds, n=10)

This class is a list of pre-computed sound stimuli which simplifies their generation and presentation. It is typically used when stimulus generation takes too long to happen in each trial. In this case, a list of stimuli is precomputed and a random stimulus from the list is presented in each trial, ideally without direct repetition. The Precomputed list has a play method which automatically selects an element other than the previous one for playing, and can be used like an slab.Sound object.

Parameters:
  • sounds (list | callable | iterator) – sequence of Sound objects (each must have a play method).

  • n – only used if sounds is a callable, calls it n times to make the stimuli.

.sequence

a list of all the elements that have been played already.

Examples:

stims = slab.Precomputed(sound_list) # using a pre-made list
# using a lambda function to make 10 examples of pink noise
stims = slab.Precomputed(lambda: slab.Sound.pinknoise(), n=10)
stims = slab.Precomputed( (slab.Sound.vowel(vowel=v) for v in ['a','e','i']) ) # using a generator
stims.play() # playing a sound from the list
play()

Play a random, but never the previous, stimulus from the list.

random_choice(n=1)

Pick (without replacement) random elements from the list.

Parameters:

n (int) – number of elements to pick.

Returns:

list of n random elements.

Return type:

(list)

write(filename)

Save the Precomputed object as a zip file containing all sounds as wav files.

Parameters:

filename (str | pathlib.Path) – full path under which the file is saved.

static read(filename)

Read a zip file containing wav files.

Parameters:

filename (str | pathlib.Path) – full path to the file to be read.

Returns:

the file content.

Return type:

(slab.Precomputed)

class slab.ResultsFile(subject='test', folder=None, filename=None)

A class for simplifying the typical use cases of results files, including generating the name, creating the folders, and writing to the file after each trial. Writes a JSON Lines file, in which each line is a valid self-contained JSON string (see http://jsonlines.org).

Parameters:
  • subject (str) – determines the name of the sub-folder and files.

  • folder (None | str) – folder in which all results are saved, if None use the global variable results_folder.

.path

full path to the results file.

.subject

the subject’s name.

Example:

ResultsFile.results_folder = 'MyResults'
file = ResultsFile(subject='MS')
print(file.name)
property name

The name of the results file.

write(data, tag=None)

Safely write data to the file which is opened just before writing and closed immediately after to avoid data loss. Call this method at the end of each trial to save the response and trial state.

Parameters:
  • data (any) – data to save must be JSON serializable [string, list, dict, …]). If data is an object, the __dict__ is extracted and saved.

  • tag (str) – The tag is prepended as a key. If None is provided, the current time is used.

static read_file(filename, tag=None)

Read a results file and return the content.

Parameters:
  • filename (str | pathlib.Path) – full path to the file to read.

  • tag (str | None) – if given, return only the dictionaries saved under that tag.

Returns:

The content of the file. If tag is None, the whole file is returned, else only the dictionaries with that tag as a key are returned. The content will be a list of dictionaries, or a dictionary if there is only a single element.

Return type:

(list | dict)

read(tag=None)

Wrapper for the read_file method.

static previous_file(subject=None)

Returns the name of the most recently used results file for a given subject. Intended for extracting information from a previous file when running partial experiments.

Parameters:

subject (str) – the subject name under which the file is stored.

Returns:

full path to the most recent results file.

Return type:

(pathlib.Path)

clear()

Clears the file by erasing all content.

psychoacoustics.key()

Wrapper for curses module to simplify getting a single keypress from the terminal (default), a buttonbox, or a figure. Set slab.psychoacoustics.input_method = ‘buttonbox’ to use a custom USB buttonbox, to ‘figure’ to open a figure called ‘stairs’ (if not already opened by the slab.Staircase.plot method), or to ‘prompt’ for a simple Python prompt (requires pressing Enter/Return after the response). Optionally takes a string argument which is printed in the terminal for conveying instructions to the participant.

Example:

with slab.key('Waiting for buttons 1 (yes) or 2 (no).') as key:
    response = key.getch()
psychoacoustics.load_config()

Reads a text file with variable assignments. This is a simple convenience method that allows easy writing and loading of configuration text files. Experiments sometimes use configuration files when experimenters (who might not be Python programmers) need to set parameters without changing the code. The format is a plain text file with a variable assignment on each line, because it is meant to be written and changed by humans. These variables and their values are then accessible as a namedtuple.

Parameters:

filename (str | pathlib.Path) – path to the file to be read.

Returns:

a tuple containing the variables and values defined in the text file.

Return type:

(collections.namedtuple)

Example:

# assuming there is a file named 'example.txt' with the following content:
samplerate = 32000
pause_duration = 30
speeds = [60,120,180]
# call load_config to parse the file into a named tuple:
conf = load_config('example.txt')
conf.speeds
# Out: [60, 120, 180]

Filters

class slab.Filter(data, samplerate=None, fir='FIR')

Class for generating and manipulating filter banks and transfer functions. Filters can be finite impulse response (‘FIR’), impulse response (‘IR’), or transfer functions (‘TF’). FIR filters are applied using two-way filtering (see scipy.signal.filtfilt), which avoids adding group delays and is intended for high- and lowpass filters and the like. IR filters are applied using convolution (scipy.signal.fftconvolve) and are intended for long room impulse responses, binaural impulse responses and the like, where group delays are intended.

Parameters:
  • data (numpy.ndarray | slab.Signal | list) – samples of the filter. If it is an array, the first dimension should represent the number of samples and the second one the number of channels. If it’s an object, it must have a .data attribute containing an array. If it’s a list, the elements can be arrays or objects. The output will be a multi-channel sound with each channel corresponding to an element of the list.

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • fir (str) – the kind of filter; options are ‘FIR’, ‘IR’, or ‘TF’.

.n_filters

number of filters in the object (overloads n_channels attribute in the parent Signal class)

.n_taps

the number of taps in a finite impulse response filter. Analogous to n_samples in a Signal.

.n_frequencies

the number of frequency bins in a Fourier filter. Analogous to n_samples in a Signal.

.frequencies

the frequency axis of a Fourier filter.

property n_filters

The number of filters in the bank.

property n_taps

The number of filter taps.

property n_frequencies

The number of frequency bins.

property frequencies

The frequency axis of the filter.

static band(kind='hp', frequency=100, gain=None, samplerate=None, length=1000, fir='FIR')

Generate simple passband or stopband filters, or filters with a transfer function defined by pairs of frequency and gain values.

Parameters:
  • kind – The type of filter to generate. Can be ‘lp’ (lowpass), ‘hp’ (highpass), ‘bp’ (bandpass) or ‘bs’ (bandstop/notch). If gain is specified the kind argument is ignored.

  • frequency (int | float | tuple | list) – For a low- or highpass filter, a single integer or float value must be given, which is the filter’s edge frequency in Hz. A bandpass or bandstop filter takes a tuple of two values, which are the filter’s lower and upper edge frequencies. Given a list of values and a gain list of equal length, the resulting filter will have the specified gain at each frequency.

  • gain (None | list) – Must be None when generating a lowpass, highpass, bandpass or bandstop filter. For generating a custom filter, define a list of the same length as frequency with values between 1.0 (no suppression at that frequency) and 0.0 (maximal suppression at that frequency).

  • samplerate (int | None) – the samplerate of the sound. If None, use the default samplerate.

  • length (int) – The number of samples in the filter

  • fir (str) – If ‘FIR’ or ‘IR’ generate a finite impulse response filter, else generate a transfer function.

Returns:

a filter with the specified properties

Return type:

(slab.Filter)

Examples:

filt = slab.Filter.band(frequency=3000, kind='lp')  # lowpass filter
filt = slab.Filter.band(frequency=(100, 2000), kind='bs')  # bandstop filter
filt = slab.Filter.band(frequency=[100, 1000, 3000, 6000], gain=[0., 1., 0., 1.])  # custom filter
apply(sig)

Apply the filter to a sound. If sound and filter have the same number of channels, each filter channel will be applied to the corresponding channel in the sound. If the filter has multiple channels and the sound only 1, each filter is applied to the same sound. In that case the filtered sound will contain the same number of channels as the filter with every channel being a copy of the original sound with one filter channel applied. If the filter has only one channel and the sound has multiple channels, the same filter is applied to each sound channel. FIR filters are applied using two-way filtering (see scipy.signal.filtfilt), which avoids adding group delays and is intended for high- and lowpass filters and the like. IR filters are applied using convolution (scipy.signal.fftconvolve) and are intended for long room impulse responses, binaural impulse responses and the like, where group delays are intended.

Parameters:

sig (slab.Signal | slab.Sound) – The sound to be filtered.

Returns:

a filtered copy of sig.

Return type:

(slab.Signal | slab.Sound)

Examples:

filt = slab.Filter.band(frequency=(100, 1500), kind='bp')  # bandpass filter
sound = slab.Sound.whitenoise()  # generate sound
filtered_sound = filt.apply(sound)  # apply the filter to the sound
tf(channels='all', n_bins=None, show=True, axis=None)

Compute a filter’s transfer function (magnitude over frequency) and optionally plot it.

Parameters:
  • channels (str | list | int) – the filter channels to compute the transfer function for. Defaults to the string “all” which includes all channels. To compute the transfer function for multiple channels, pass a list of channel integers. For the transfer function of a single channel, pass its index as an integer.

  • n_bins (int | None) – number of bins in the transfer function (determines frequency resolution). If None, use the maximum number of bins.

  • show (bool) – whether to show the plot right after drawing.

  • axis (matplotlib.axes.Axes | None) – axis to plot to. If None create a new plot.

Returns:

numpy.ndarray: the frequency bins in the range from 0 Hz to the Nyquist frequency. numpy.ndarray: the magnitude of each frequency bin. None: if show is True or an axis was specified, a plot is drawn and nothing is returned.

Return type:

(numpy.ndarray)

Examples:

filt = slab.Filter.band(frequency=(100, 1500), kind='bp')  # bandpass filter
filt.tf(show=True)  # compute and plot the transfer functions
w, h = filt.tf(show=False)  # compute and return the transfer functions
static cos_filterbank(length=5000, bandwidth=0.3333333333333333, low_cutoff=0, high_cutoff=None, pass_bands=False, n_filters=None, samplerate=None)

Generate a set of Fourier filters. Each filter’s transfer function is given by the positive phase of a cosine wave, and the amplitude of the cosine peaks at that filter’s center frequency. Following the organization of the cochlea, the width of each filter increases in proportion to its center frequency. This increase is defined by Moore & Glasberg’s formula for the equivalent rectangular bandwidth (ERB) of auditory filters. The number of filters is either determined by the n_filters argument or calculated based on the desired bandwidth. This function is used, for example, to divide a sound into sub-bands for equalization.

length

The number of bins in each filter, determines the frequency resolution.

Type:

int

bandwidth

Width of the sub-filters in octaves. The smaller the bandwidth, the more filters will be generated.

Type:

float

low_cutoff

The lower limit of frequency range in Hz.

Type:

int | float

high_cutoff

The upper limit of frequency range in Hz. If None, use the Nyquist frequency.

Type:

int | float

pass_bands

Whether to include a half cosine at the filter bank’s lower and upper edge frequency. If True, allows reconstruction of original bandwidth when collapsing subbands.

Type:

bool

n_filters

Number of filters. When this is not None, the bandwidth argument is ignored.

Type:

int | None

samplerate

the samplerate of the sound that the filter shall be applied to. If None, use the default samplerate.

Type:

int | None

Examples:

sig = slab.Sound.pinknoise(samplerate=44100)
fbank = slab.Filter.cos_filterbank(length=sig.n_samples, bandwidth=1/10, low_cutoff=100,
                              samplerate=sig.samplerate)
fbank.tf()
# apply the filter bank to the data. The filtered sound will contain as many channels as there are
# filters in the bank. Every channel is a copy of the original sound with one filter applied.
# In this context, the channels are the signals sub-bands:
sig_filt = fbank.apply(sig)
static collapse_subbands(subbands, filter_bank=None)

Sum a sound that has been filtered with a filterbank and whose channels represent the sub-bands of the original sound. For each sound channel, the Fourier transform is calculated and the result is multiplied with the corresponding filter in the filter bank. An inverse Fourier transform is then performed on the resulting spectrum. The resulting sound is summed over all channels.

Parameters:
  • subbands (slab.Signal) – The sound which is divided into subbands by filtering. The number of channels in the sound must be equal to the number of filters in the filter bank.

  • filter_bank (None | slab.Filter) – The filter bank applied to the sound’s subbands. The number of filters must be equal to the number of channels in the sound. If None a filter bank with the default parameters is generated. Note that the filters must have a number of frequency bins equal to the number of samples in the sound.

Returns:

A sound generated from summing the spectra of the subbands.

Return type:

(slab.Signal)

Examples:

sig = slab.Sound.whitenoise()  # generate a sound
fbank = slab.Filter.cos_filterbank(length=sig.n_samples)  # generate a filter bank
subbands = fbank.apply(sig)  # divide the sound into subbands by applying the filter
# by collapsing the subbands, a new sound is generated that is (almost) equal to the original sound:
collapsed = fbank.collapse_subbands(subbands, fbank)
filter_bank_center_freqs()

Get the maximum of each filter in a filter bank. For filter banks generated with the cos_filterbank method this corresponds to the filter’s center frequency.

Returns:

array with length equal to the number of filters in the bank, containing each filter’s center frequency.

Return type:

(numpy.ndarray)

static equalizing_filterbank(reference, sound, length=1000, bandwidth=0.125, low_cutoff=200, high_cutoff=None, alpha=1.0, filt_meth='filtfilt')

Generate an equalizing filter from the spectral difference between a sound and a reference. Both are divided into sub-bands using the cos_filterbank and the level difference per sub-band is calculated. The sub-band frequencies and level differences are then used to generate an equalizing filter that makes the spectrum of the sound more equal to the one of the reference. The main use case is equalizing the differences between transfer functions of individual loudspeakers.

Parameters:
  • reference (slab.Sound) – The reference for equalization, i.e. what the sound should look like after applying the equalization. Must have exactly one channel.

  • sound (slab.Sound) – The sound to equalize. Can have multiple channels.

  • length (int) – Number of frequency bins in the filter.

  • bandwidth (float) – Width of the filters, used to divide the signal into subbands, in octaves. A small bandwidth results in a fine tuned transfer function which is useful for equalizing small notches.

  • low_cutoff (int | float) – The lower limit of frequency range in Hz.

  • high_cutoff (int | float) – The upper limit of frequency range in Hz. If None, use the Nyquist frequency.

  • alpha (float) – Filter regularization parameter. Values below 1.0 reduce the filter’s effect, values above amplify it. WARNING: large filter gains may result in temporal distortions of the sound

  • filt_meth (str) – filtering method, must be one of ‘filtfilt’, ‘fft’, ‘slab’, or ‘causal’. ‘filtfilt’ and ‘slab’ apply the filter twice (forward and backward) so the filtered signal is not time-shifted; in this case a filter with half of the desired amplitude is needed so that the end result has the correct modification. For ‘fft’ and ‘causal’, the filter with the desired amplitude is returned.

Returns:

An equalizing filter bank with a number of filters equal to the number of channels in the equalized sound.

Return type:

(slab.Filter)

Example:

# generate a sound and apply some arbitrary filter to it
sound = slab.Sound.pinknoise(samplerate=44100)
filt = slab.Filter.band(frequency=[100., 800., 2000., 4300., 8000., 14500., 18000.],
                        gain=[0., 1., 0., 1., 0., 1., 0.], samplerate=sound.samplerate)
filtered = filt.apply(sound)
# make an equalizing filter and apply it to the filtered signal. The result looks more like the original
fbank = slab.Filter.equalizing_filterbank(sound, filtered, low_cutoff=200, high_cutoff=16000)
equalized = fbank.apply(filtered)
save(filename)

Save the filter in Numpy’s .npy format to a file.

Parameters:

filename (str | pathlib.Path) – Full path to which the data is saved.

static load(filename)

Load a filter from a .npy file.

Parameters:

filename (str | pathlib.Path) – Full path to the file to load.

Returns:

The Filter loaded from the file.

Return type:

(slab.Filter)

HRTFs

class slab.HRTF(data, datatype=None, samplerate=None, sources=None, listener=None, verbose=False)

Class for reading and manipulating head-related transfer functions with attributes and functions to manage them.

Parameters:
  • data (str | Filter | numpy.ndarray) – Typically, this is the path to a file in the .sofa format. The file is then loaded and the data of each source for which the transfer function was recorded is stored as a Filter object in the data attribute. Instead of a file name, the data can be passed directly as Filter or numpy array. Given a Filter, every filter channel in the instance is taken as a source (this does not result in a typical HRTF object and is only intended for equalization filter banks). Given a 3D array, the first dimension represents the sources, the second the number of taps per filter and the last the number of filter channels per filter (should always be 2, for left and right ear).

  • datatype (None | string) – type of the HRTF filter bank, can be ‘FIR’ for finite impulse response filters or ‘TF’ for Fourier filters.

  • samplerate (None | float) – rate at which the data was acquired, only relevant when not loading from .sofa file

  • sources (None | array) – positions of the recorded sources, only relevant when not loading from .sofa file

  • listener (None | list | dict) – position of the listener, only relevant when not loading from .sofa file

  • verbose (bool) – print out items when loading .sofa files, defaults to False

.data

The HRTF data. The elements of the list are instances of slab.Filter.

Type:

list

.datatype

Type of the HRTF filter bank.

Type:

string

.samplerate

sampling rate at which the HRTF data was acquired.

Type:

int

.sources

Cartesian coordinates (x, y, z), vertical-polar and interaural-polar coordinates (azimuth, elevation, distance) of all sources.

Type:

named tuple

.n_sources

The number of sources in the HRTF.

Type:

int

.n_elevations

The number of elevations in the HRTF.

Type:

int

.listener

A dictionary containing the position of the listener (“pos”), the point which the listener is fixating (“view”), the point 90° above the listener (“up”) and vectors from the listener to those points.

Type:

dict

Example

import slab
hrtf = slab.HRTF.kemar()  # use inbuilt KEMAR data
sourceidx = hrtf.cone_sources(20)
hrtf.plot_sources(sourceidx)
hrtf.plot_tf(sourceidx, ear='left')

property n_sources

The number of sources in the HRTF.

property n_elevations

The number of elevations in the HRTF.

apply(source, sound, allow_resampling=True)

Apply a filter from the HRTF set to a sound. The sound will be recast as slab.Binaural. If the samplerates of the sound and the HRTF are unequal and allow_resampling is True, then the sound will be resampled to the filter rate, filtered, and then resampled to the original rate. The filtering is done with scipy.signal.fftconvolve.

Parameters:
  • source (int) – index of the source position whose filter should be applied.

  • sound (slab.Signal | slab.Sound) – the sound to be spatialized.

  • allow_resampling (bool) – whether to allow resampling when the samplerates of sound and HRTF differ.

Returns:

a spatialized copy of sound.

Return type:

(slab.Binaural)

elevations()

Get all different elevations at which sources were recorded. Note: This currently only works as intended for HRTFs recorded in horizontal rings.

Returns:

a sorted list of source elevations.

Return type:

(list)

plot_tf(sourceidx, ear='left', xlim=(1000, 18000), n_bins=None, kind='waterfall', linesep=20, xscale='linear', show=True, axis=None)

Plot transfer functions at a list of source indices.

Parameters:
  • ear (str) – the ear from which data is plotted. Can be ‘left’, ‘right’, or ‘both’.

  • sourceidx (list of int) – sources to plot. Typically generated using the hrtf.cone_sources method.

  • xlim (tuple of int) – frequency range of the plot

  • n_bins (int) – passed to slab.Filter.tf() and determines frequency resolution

  • kind (str) – type of plot to draw. Can be ‘waterfall’ (as in Wightman and Kistler, 1989), ‘image’ (as in Hofman, 1998) or ‘surface’ (as in Schnupp and Nelken, 2011).

  • linesep (int) – vertical distance between transfer functions in the waterfall plot

  • xscale (str) – sets x-axis scaling (‘linear’, ‘log’)

  • show (bool) – If True, show the plot immediately

  • axis (matplotlib.axes._subplots.AxesSubplot) – Axis to draw the plot on

diffuse_field_avg()

Compute the diffuse field average transfer function, i.e. the constant non-spatial portion of a set of HRTFs. The filters for all sources are averaged, which yields an unbiased average only if the sources are uniformly distributed around the head.

Returns:

the diffuse field average as a Fourier (‘TF’) filter object.

Return type:

(Filter)

diffuse_field_equalization(dfa=None)

Equalize the HRTF by dividing each filter by the diffuse field average. The resulting filters have a mean close to 0 and are Fourier filters.

Parameters:

dfa (None) – Filter object containing the diffuse field average transfer function of the HRTF. If none is provided, the diffuse_field_avg method is called to obtain it.

Returns:

diffuse field equalized version of the HRTF.

Return type:

(HRTF)

cone_sources(cone=0, full_cone=False)

Get all sources of the HRTF that lie on a “cone of confusion”. The cone is a vertical off-axis sphere slice. All sources that lie on the cone have the same interaural level and time difference. Note: This currently only works as intended for HRTFs recorded in horizontal rings.

Parameters:
  • cone (int | float) – azimuth of the cone center in degree.

  • full_cone (bool) – If True, return all sources that lie on the cone, otherwise, return only sources in front of the listener.

Returns:

elements of the list are the indices of sound sources on the frontal half of the cone.

Return type:

(list)

Examples:

import slab
hrtf = slab.HRTF.kemar()
sourceidx = hrtf.cone_sources(20)  # get the source indices
print(hrtf.sources[sourceidx])  # print the coordinates of the source indices
hrtf.plot_sources(sourceidx)  # show the sources in a 3D plot
elevation_sources(elevation=0)

Get the indices of sources along a horizontal sphere slice at the given elevation.

Parameters:

elevation (int | float) – The elevation of the sources in degree. The default returns sources along the frontal horizon.

Returns:

indices of the sound sources. If the hrtf does not contain the specified elevation, an empty list is returned.

Return type:

(list)

tfs_from_sources(sources, n_bins=96)

Get the transfer function from sources in the hrtf.

Parameters:
  • sources (list) – Indices of the sources (as generated for instance with the HRTF.cone_sources method), for which the transfer function is extracted.

  • n_bins (int) – The number of frequency bins for each transfer function.

Returns:

2-dimensional array where the first dimension represents the frequency bins and the second dimension represents the sources.

Return type:

(numpy.ndarray)

interpolate(azimuth=0, elevation=0, method='nearest', plot_tri=False)

Interpolate a filter at a given azimuth and elevation from the neighboring HRTFs. A weighted average of the 3 closest HRTFs in the set is computed in the spectral domain with barycentric weights. The resulting filter values vary smoothly with changes in azimuth and elevation. The fidelity of the interpolated filter decreases with increasing distance of the closest sources and should only be regarded as an appropriate approximation when the contributing filters are less than 20˚ away.

Parameters:
  • azimuth (float) – the azimuth component of the direction of the interpolated filter

  • elevation (float) – the elevation component of the direction of the interpolated filter

  • method (str) – interpolation method, ‘nearest’ returns the filter of the nearest direction. Any other string returns a barycentric interpolation.

  • plot_tri (bool) – plot the triangulation of source positions used for interpolation. Useful for checking areas where the interpolation may not be accurate (look for irregular or elongated triangles).

Returns:

an IR-type binaural Filter

Return type:

(slab.Filter)

vsi(sources=None, equalize=True)

Compute the “vertical spectral information” (VSI), a measure of the dissimilarity of spectral profiles at different elevations. The VSI relates to behavioral localization accuracy in the vertical dimension (Trapeau and Schönwiesner, 2016). It is computed as one minus the average of the correlation coefficients between all combinations of directional transfer functions of the specified sources. A set of identical transfer functions results in a VSI of 0, whereas highly different transfer functions result in a high VSI (the empirical maximum is ~1.07; KEMAR has a VSI of 0.82).

Parameters:
  • sources (None | list) – indices of sources for which to compute the VSI. If None use the vertical midline.

  • equalize (bool) – If True, apply the diffuse_field_equalization method (set to False if the hrtf object is already diffuse-field equalized).

Returns:

the vertical spectral information between the specified sources.

Return type:

(float)

plot_sources(idx=None, show=True, label=False, axis=None)

Plot source locations in 3D.

Parameters:
  • idx (list of int) – indices to highlight in the plot

  • show (bool) – whether to show plot (set to False if plotting into an axis and you want to add other elements)

  • label (bool) – if True, show the index of each source in self.sources as a text label; if idx is also given, only these sources are labeled

  • axis (mpl_toolkits.mplot3d.axes3d.Axes3D) – axis to draw the plot on

static kemar()

Provides HRTF data from the KEMAR recording (normal pinna) conducted by Gardner and Martin at MIT in 1994 (MIT Media Lab Perceptual Computing - Technical Report #280) and converted to the SOFA Format. Slab includes a compressed copy of the data. This function reads it and returns the corresponding HRTF object. The object is cached in the class variable _kemar, and repeated calls return the cached object instead of reading the file from disk again.

Returns:

the KEMAR HRTF data.

Return type:

(slab.HRTF)

static estimate_hrtf(recordings, signal, sources, listener=None)

Compute a set of transfer functions from binaural recordings and an input (reference) signal. For each sound source, compute the DFT of left- and right-ear recordings and divide by the Fourier transform of the input signal to obtain the head related transfer function.

Parameters:
  • recordings (list) – in-ear recordings stored in a list of slab.Binaural objects.

  • signal (slab.Signal | slab.Sound) – the signal used to produce the in-ear recordings.

  • sources (numpy.array) – interaural polar coordinates (azimuth, elevation, distance) of all sources; number and order of sources must match the recordings.

Returns:

an HRTF object with the dimensions specified by the recordings and the source locations.

Return type:

(slab.HRTF)
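The deconvolution step described above can be sketched in plain numpy (a simplified illustration, not slab's implementation; a real implementation would regularize the division where the signal spectrum is near zero):

```python
import numpy as np

def transfer_function(recording, signal):
    # DFT of the in-ear recording divided by the DFT of the played signal
    return np.fft.rfft(recording) / np.fft.rfft(signal)

# played signal: a unit impulse (flat spectrum of ones);
# recording: the same impulse at half amplitude
signal = np.zeros(64)
signal[0] = 1.0
recording = 0.5 * signal
tf = transfer_function(recording, signal)  # flat transfer function of 0.5
```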

write_sofa(filename)

Save the HRTF data to a SOFA file.

Parameters:

filename (str | pathlib.Path) – path, the file is written to.

class slab.Room(size=[4, 5, 3], listener=[2, 2.5, 1.5], source=[0, 0, 1.4], order=2, max_echos=50, absorption=[0.1], wall_filters=None)

Class for acoustic room simulations, including binaural room impulse responses for given source and listener positions, echo locations, reverberation, and wall filters. Initializing a Room object immediately computes the echo image locations. Reverberation times, a reverb filter, and the binaural room impulse response can then be obtained with the respective methods.

Parameters:
  • size (list | numpy.ndarray) – width, length, and height of room in meters.

  • listener (list | numpy.ndarray) – cartesian coordinates of listener in the room (x,y,z)

  • source (list | numpy.ndarray) – source location relative to listener in spherical coordinates (azimuth, elevation, distance). The source location is immediately transformed into cartesian coordinates relative to the room.

  • order (int) – Number of reflections to simulate (default 2).

  • max_echos (int) – maximum number of source images to keep (default 50; around 50 is perceptually sufficient).

  • absorption (int | list) – absorption coefficients [wall, [floor, [ceiling]]] in 1/m^2

  • wall_filters (int | list) – [wall, [floor, [ceiling]]] filter indices into the database !!NOT IMPLEMENTED YET!!

.image_locs

Echo locations in spherical coordinates (azimuth, elevation, distance).

Type:

numpy.ndarray

.orders

Reflection order for each echo in image_locs.

Type:

list

.floor_orders

Number of reflections from the floor for each echo.

Type:

list

.ceil_orders

Number of reflections from the ceiling for each echo.

Type:

list
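The echo computation rests on the image-source idea: a first-order reflection off a wall behaves like a copy of the source mirrored across that wall. A minimal sketch of that mirroring step (an illustration only, with a hypothetical helper, not slab's implementation):

```python
import numpy as np

def mirror_across_wall(source, wall_axis, wall_position):
    # reflect the source's coordinate on wall_axis across the wall plane
    image = np.array(source, dtype=float)
    image[wall_axis] = 2 * wall_position - image[wall_axis]
    return image

# a source at x = 1 m mirrored across the wall at x = 0 appears at x = -1
image = mirror_across_wall([1.0, 2.5, 1.4], wall_axis=0, wall_position=0.0)
```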

Example

import slab
room = slab.Room()  # simulate the default 4 x 5 x 3 m room

set_source(source)

Set the source attribute, convert source to cartesian coordinates, and recompute image_locs and orders.

Parameters:

source (list | numpy.ndarray) – source location relative to listener in spherical coordinates (azimuth, elevation, distance).

reverb_time(size=None, absorption=None)

Use the Sabine formula to calculate the reverberation time for the given room. Can be called as a static method with the arguments size and absorption; otherwise the current room parameters are used.

Parameters:
  • size (list) – optional, room dimensions in meters (width, length, height)

  • absorption (list) – optional, absorption coefficients [wall, [floor, [ceiling]]] in 1/m^2

Returns:

reverberation time in s

Return type:

(float)
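For reference, the Sabine formula is RT60 = 0.161 · V / A, with room volume V in m³ and total absorption A (surface area times absorption coefficient). A simplified standalone sketch, assuming a single absorption coefficient for all surfaces rather than the per-surface list above:

```python
def sabine_rt(size, absorption):
    # RT60 = 0.161 * V / A, with V the room volume in m^3 and
    # A the total absorption (surface area * absorption coefficient)
    width, length, height = size
    volume = width * length * height
    surface = 2 * (width * length + width * height + length * height)
    return 0.161 * volume / (surface * absorption)

rt = sabine_rt([4, 5, 3], 0.1)  # about 1.03 s for the default room
```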

reverb(t_reverb=None, low_factor=0.8, trim=0.25, samplerate=44100)

Generate an exponentially decaying reverberation tail impulse response.

Parameters:
  • t_reverb (int | float) – optional, reverberation time in seconds (default: use current room parameters)

  • low_factor (float) – t_reverb * low_factor is the reverb time used for the low frequencies (default 0.8)

  • trim (int | float) – shorten the (usually unnecessarily long) reverb tail; if int: trim to that number of samples, if float < 1: trim to that fraction (default is 0.25).

  • samplerate (int | float) – samplerate of the resulting noise (default 44100)

Returns:

reverberation tail impulse response

Return type:

(slab.Filter)
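The shape of such a tail can be sketched as white noise under an exponential envelope that decays by 60 dB (a factor of 1000) over the reverberation time (an illustrative sketch, not slab's implementation, which additionally uses a slower decay for low frequencies via low_factor):

```python
import numpy as np

def reverb_tail(t_reverb=1.0, samplerate=44100):
    # white noise shaped by an envelope that reaches -60 dB at t = t_reverb
    n = int(t_reverb * samplerate)
    t = np.arange(n) / samplerate
    envelope = 10 ** (-3 * t / t_reverb)  # 10**-3 == -60 dB
    return envelope * np.random.randn(n)

tail = reverb_tail(t_reverb=0.5, samplerate=8000)  # 4000 samples
```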

hrir(reverb=None, hrtf=None, trim=0.5)

Compute the binaural room response. The resulting filter has the same samplerate as the HRTF object.

Parameters:
  • reverb (float | slab.Filter) – if None, generate the decaying reverberation tail from current room parameters (default), if float, generate the reverberation tail with a time constant of ‘reverb’, if slab.Filter, use the provided binaural filter

  • hrtf (slab.HRTF) – HRTF dataset (default: slab.HRTF.kemar)

  • trim (int | float) – shorten the (usually unnecessarily long) reverb tail; if int: trim to that number of samples, if float < 1: trim to that fraction (default is 0.5).

Returns:

binaural room response for given source, room, and listener position

Return type:

(slab.Filter)
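Applying such a binaural room response to a mono sound amounts to convolving the sound with the left- and right-ear impulse responses. A minimal numpy sketch with a hypothetical helper (slab's Filter class handles this internally):

```python
import numpy as np

def apply_brir(sound, brir_left, brir_right):
    # spatialize a mono sound by per-ear convolution with the
    # binaural room impulse response
    left = np.convolve(sound, brir_left)
    right = np.convolve(sound, brir_right)
    return np.stack([left, right], axis=1)  # shape: n_samples x 2

binaural = apply_brir(np.array([1.0, 0.5, 0.25]), [1.0], [0.5])
```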