On Sep 11, 2009, at 4:40 PM, Raymond Stanley wrote:
Hi Raymond. I'll forward this mail to the psychtoolbox forum, which is
the proper place to ask general questions about Psychtoolbox.

> Hello Mario,
>
> Thank you for taking the time to read this email, and taking the
> time to be such an integral part of a tool that is essential for so
> much research.
>
> If and when you have the time, I have a rather relatively simple
> issue about using the PsychPortAudio interface for recording that
> I'd like to query you on:
>
> I'd like to use the record function to record the stimuli output
> and a voice response from a participant, save that to a wave file,
> and compute the reaction time post-experiment. I see that you've
> created some elegant online voice detection schemes, but I'd rather
> be able to listen to the responses in the data analysis stage.

There are two demos to show voicetriggers. SimpleVoiceTriggerDemo
just waits for the trigger and returns the timestamp.
BasicSoundInputDemo additionally records the wave data and saves it
to a file. However, BasicSoundInputDemo doesn't use low-latency
mode, so if you really wanted to use it for reliable voice triggers,
you should modify the 'Open' call to use low-latency mode, just as in
SimpleVoiceTriggerDemo. The difference exists because voice triggers
were added to BasicSoundInputDemo later on, and I didn't want
to use low-latency mode by default in that demo -- it would cause
failures on MS-Windows with non-ASIO hardware.

Another limitation of the demos is that they only use a simple
intensity threshold for detecting response onset, no clever filtering
schemes which would be better suited for vocal responses. The point
of the demos is to demonstrate how to get very precise
(sample-accurate) timestamps for events in the recorded sound stream,
not to be the "mother of all voice-triggers". It would be very simple
to build on those demos though and add more fancy stuff like a little
integration window in time or some filtering for more robustness.
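
To illustrate the "little integration window" idea, here is a minimal offline sketch in Python. It is not the code of either demo: the function name, the parameters and the choice of an RMS level over a short sliding window are all assumptions made for illustration. It reports the start of the first window whose RMS level reaches the threshold, so a single loud sample (a click) does not trigger, and the reported time can precede the true onset by up to one window length.

```python
import math


def detect_onset(samples, sample_rate, threshold, window_ms=5.0):
    """Return the onset time in seconds relative to the first sample, or None.

    Hypothetical helper: the onset is the start of the first sliding window
    whose RMS level is at or above `threshold`.
    """
    window = max(1, int(round(sample_rate * window_ms / 1000.0)))
    if len(samples) < window:
        return None

    # Sum of squares of the current window, updated incrementally.
    energy = sum(s * s for s in samples[:window])
    for start in range(len(samples) - window + 1):
        rms = math.sqrt(max(energy, 0.0) / window)
        if rms >= threshold:
            return start / sample_rate
        if start + window < len(samples):
            # Slide the window by one sample.
            energy += samples[start + window] ** 2 - samples[start] ** 2
    return None


if __name__ == "__main__":
    rate = 1000
    recording = [0.0] * 100 + [0.5] * 100  # silence, then a sustained sound
    print(detect_onset(recording, rate, threshold=0.25))
```

To turn the returned offset into an absolute time, you would add it to the timestamp of the first recorded sample as reported by the capture driver.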
> I've found conflicting evidence in different places about whether
> the timing of the record function has been found to be stable. So,
> I have two questions:
>
> 1) is the record function have timing as stable as the play function?

What kind of conflicting evidence? I never got any feedback about
that part, so you must know more than me.
In theory it should have timing as stable as the play function, i.e.,
the reported audio onset/capture timestamps should be as accurate -
usually down to sub-millisecond accuracy. The timestamping mechanism
behind capture timestamps is the same as the one for playback
timestamps.

In practice the mechanisms rely on the operating system and on the
drivers and hardware of your sound system. A misconfiguration of the
sound system, or bugs in the OS / drivers / hardware, could lead to
wrong timestamps and timing. As opposed to visual stimulus onset,
where we have a lot of consistency checks in place to spot such bugs,
there is no automated way to do the same for sound. Our driver has to
fully rely on the hardware doing the right thing. This is why you
must verify the timing at least once with some test procedure to make
sure it works for your specific setup.

In practice the timing mechanisms work well on the tested Linux and
MacOS/X systems, as well as on Windows machines with a soundcard that
has native ASIO support, e.g., many Creative Labs cards, M-Audio
cards, RME cards... Native support does not mean "Asio4All", which
is a software emulation for cards that don't have native support.
"Asio4All" may work very well for some cards, and not well for
others. It is hit-and-miss.

I tested the playback timing extensively on all operating systems,
but I only tested the capture timing indirectly on MacOS and Linux as
follows:
There is a script called KeyboardLatencyTest. It can be used to test
the timing accuracy/latency/variability of response devices.
Initially it could only test keyboards, hence the name. The current
version also tests mouse button responses, the Cedrus USB based
response boxes, the RTBox response box, the CMU button box and the
PST button box. The script runs ten test trials. In each trial the
user has to hit a button on the response device hard while the script
uses PsychPortAudio's "Voicetrigger" feature to detect the timestamp
of the noise made by hitting the response button. It compares that
audio timestamp of button press with the button press timestamp
collected from the response device, prints the difference and
computes the mean deviation and standard deviation.
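
The summary statistics described above are simple to reproduce on your own data. The following Python sketch is not the KeyboardLatencyTest code; the function name and the sign convention (device timestamp minus audio timestamp) are assumptions for illustration. It takes one audio timestamp and one response-device timestamp per trial, both in seconds on the same clock.

```python
import statistics


def summarize_latency(audio_timestamps, device_timestamps):
    """Return (per-trial differences, mean, sample standard deviation).

    Hypothetical helper: each difference is device timestamp minus audio
    timestamp, in seconds.
    """
    if len(audio_timestamps) != len(device_timestamps):
        raise ValueError("need one audio and one device timestamp per trial")
    if not audio_timestamps:
        raise ValueError("need at least one trial")

    diffs = [d - a for a, d in zip(audio_timestamps, device_timestamps)]
    mean = statistics.mean(diffs)
    # The sample standard deviation needs at least two trials.
    sd = statistics.stdev(diffs) if len(diffs) > 1 else 0.0
    return diffs, mean, sd


if __name__ == "__main__":
    audio = [1.000, 2.000, 3.000]
    device = [1.001, 2.002, 3.003]
    diffs, mean, sd = summarize_latency(audio, device)
    print("mean %.3f msec, sd %.3f msec" % (mean * 1000.0, sd * 1000.0))
```

The mean reflects the systematic offset between the two timestamping paths, while the standard deviation reflects their combined trial-to-trial variability.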
If you have, e.g., a MacBook Pro, you can try it yourself. The
microphone is inside the left speaker, so you should place external
response boxes or mice close to the left side of the computer and
then run the script in a silent room. On MS-Windows you'd need an
external microphone properly placed and an ASIO capable soundcard.

The script basically measures the combined error and variability in
timestamping of both the response device and the PsychPortAudio
capture timestamps. I tested this setup on both OS/X and Linux with
the MacBook Pro's built-in soundchip and microphone with the Cedrus
response box, the RTBox and the PST button box, as well as with mice
and keyboards. While mice and keyboards showed the expected huge
latency and variability, the response boxes all showed almost perfect
timestamps with an accuracy of better than 1 msec and variability of
at most 1 msec.
From that I conclude that the accuracy of both our response box
drivers and of the audio capture timestamping must be better than
1 msec. Theoretically it could also happen that both the response
devices and the audio driver have huge errors, but that by pure
chance those errors are exactly the same in magnitude but of opposite
sign, so they cancel each other out perfectly. But given that the
same audio driver and hardware were tested over many trials with many
different response devices (each with its own type of connection and
driver/timestamping algorithm), that both modules (audio driver vs.
response box drivers) are completely independent in their operation
and that I always got the same consistently low error, it is much
more likely that everything worked correctly and our drivers are
trustworthy.

So I believe it works correctly and is very accurate if your sound
hardware/drivers are free of bugs and properly configured, which my
test setups apparently are.
In the end you'll need to repeat such tests on your hardware. It could
be that some device drivers out there are buggy or misconfigured,
which would explain the conflicting evidence. If you have a supported
response box you can test with the KeyboardLatencyTest script. If you
only have a mouse or keyboard, you can't test that way, as they are
expected to have enormous errors.

Another practical test would be to get a loopback cable that feeds
sound back from the line-out to the line-in and then write a test
script in the spirit of KeyboardLatencyTest to output some test tone,
record it again and compare the sound output onset and sound capture
onset timestamps for consistency.
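
The comparison step of such a loopback test could look roughly like the Python sketch below. It does not call PsychPortAudio; all names are hypothetical. It assumes you already have, on the same clock and in seconds, the playback onset timestamp and the timestamp of the first captured sample, plus the recorded samples themselves. The plain per-sample threshold is only reasonable here because a test tone over a cable has a clean onset.

```python
def first_sample_above(samples, threshold):
    """Index of the first sample whose magnitude reaches threshold, or None."""
    for index, value in enumerate(samples):
        if abs(value) >= threshold:
            return index
    return None


def capture_onset_time(capture_start_time, onset_index, sample_rate):
    """Absolute time of a recorded sample, given the time of sample 0."""
    return capture_start_time + onset_index / sample_rate


def loopback_offset(playback_onset_time, capture_start_time,
                    samples, sample_rate, threshold):
    """Capture onset minus playback onset in seconds, or None if no onset."""
    onset_index = first_sample_above(samples, threshold)
    if onset_index is None:
        return None
    captured = capture_onset_time(capture_start_time, onset_index, sample_rate)
    return captured - playback_onset_time


if __name__ == "__main__":
    rate = 48000
    recording = [0.0] * 4800 + [0.8] * 480  # tone arrives 100 msec into capture
    offset = loopback_offset(10.0995, 10.0, recording, rate, threshold=0.1)
    print("offset: %.3f msec" % (offset * 1000.0))
```

Repeating this over many trials and looking at the spread of the offsets tells you whether the playback and capture timestamps are consistent with each other on your hardware.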
> 2) Would any record "timing" issues manifest itself in distortion
> of time between two points in a saved file, or would it only show
> up in problems with onset latency of the recording?

I don't understand this question. The timestamps would be wrong.
Maybe you'd also hear audible artifacts and glitches/dropouts/distortions.
And the structure returned by PsychPortAudio('GetStatus', ...)
contains a subfield 'xruns' which would show a non-zero value. If the
driver detects overflows or underruns of its audio buffers, it will
increment the xruns count, but the driver can't detect all types of
malfunctions. If xruns is greater than zero then something certainly
went wrong. If xruns is zero then either everything went fine, or
some malfunction slipped through undetected.
best,
-mario
>
> Thanks very much-
> Ray
>
> --
> Raymond M. Stanley, Ph.D.
> Postdoctoral Fellow
> Memory And Cognition Lab
> Volen National Center for Complex Systems
> Brandeis University

*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany
e-mail: mario.kleiner@...
office: +49 (0)7071/601-1623
fax: +49 (0)7071/601-616
www: http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)