I spent almost a whole work day on this in total, and the result is that this is definitely not a Psychtoolbox bug, but a bug or limitation in the underlying 3rd party Portaudio library, or even in the Windows Wasapi sound system. Looking at the Portaudio source code for Wasapi shows that Windows Wasapi has quirks (to put it in the most friendly way) on the sound input side, which make this challenging for Portaudio to handle. There are tons of special cases, depending on the sound hardware connected, the audio driver used, and the specific sound settings requested. The amount of complexity needed is pretty awful, so i’m not that surprised that bugs could creep in easily on the capture side for various sound cards and go unnoticed for a long time.
Diagnosing any further whether this is a Portaudio problem or an MS-Windows problem would take a lot of time and effort, as would fixing it (if it is fixable from our side at all, i.e. not an MS-Windows bug), so i will almost certainly not work on this anytime soon, and not without being contracted and paid to do so. If i spend a day investigating something and at the end i’m more surprised that it works at all than that it has bugs, that usually doesn’t mean it will be an easy or quick fix.
What i can say is that there are fundamentally different processing paths in Portaudio depending on whether you use reqlatencyclass 1 or > 1, and the low-latency settings you’d need (lower than 20 msecs input latency) seem to be even more broken, so at the moment you probably can’t get precise timestamps for voice onset and low latency at the same time. There are also different processing paths for full-duplex vs. half-duplex operation, and various special cases depending on hardware properties.
I guess the best you can do is tinker. As far as i understand your paradigm, you don’t need precise timestamps for voice onset, only a quick, low-latency way to detect that voice onset happened, so you can quickly call PsychPortAudio('Start') for the playback part. So it may work for you to just push input latency down by tinkering with the bufferSize parameter and setting reqlatencyclass to 4.
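A rough sketch of what i mean, untested on your setup and loosely following the polling approach of SimpleVoiceTriggerDemo. The sample rate, the bufferSize of 128, the voiceThreshold of 0.1, the 10 second timeout, the use of the default devices, the use of reqlatencyclass 4 for the playback device as well, and the beep used as stimulus are all assumptions for illustration, so expect to tinker with every one of them:

```matlab
% Sketch only: all numeric values below are assumptions to tinker with.
InitializePsychSound(1);              % ask for low-latency operation

fs = 48000;                           % assumed sample rate in Hz
bufferSize = 128;                     % assumed buffer size; try other values
voiceThreshold = 0.1;                 % assumed trigger level (0 to 1 amplitude)

% Two half-duplex devices: capture only (mode 2) and playback only (mode 1),
% both opened with reqlatencyclass 4 on the default device.
paIn  = PsychPortAudio('Open', [], 2, 4, fs, 1, bufferSize);
paOut = PsychPortAudio('Open', [], 1, 4, fs, 2, bufferSize);

% Preload a placeholder stimulus: 0.5 s, 500 Hz beep on both channels.
beep = MakeBeep(500, 0.5, fs);
PsychPortAudio('FillBuffer', paOut, [beep; beep]);

% Allocate a 10 s capture buffer, then start capture and wait for start.
PsychPortAudio('GetAudioData', paIn, 10);
PsychPortAudio('Start', paIn, 0, 0, 1);

% Poll captured audio until the level crosses the threshold or we time out.
onsetDetected = false;
timeout = GetSecs + 10;
while ~onsetDetected && GetSecs < timeout
    audiodata = PsychPortAudio('GetAudioData', paIn);
    if ~isempty(audiodata) && max(abs(audiodata(:))) > voiceThreshold
        onsetDetected = true;
    else
        WaitSecs(0.0001);             % short sleep so the loop does not hog the CPU
    end
end

if onsetDetected
    PsychPortAudio('Start', paOut, 1, 0, 0);  % start playback as soon as possible
    PsychPortAudio('Stop', paOut, 1);         % wait until playback has finished
end

PsychPortAudio('Stop', paIn);
PsychPortAudio('Close');
```

This does not use the capture timestamps at all, it only looks at the signal level of whatever samples have arrived, so the quality of the onset timestamps does not matter here. How quickly it reacts depends on how low you can actually push the input latency on your hardware.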
Or switch to Linux, where this stuff should work better, especially with timing requirements like yours, as long as you use a reasonably well supported sound card. Onboard HDA sound usually works well if the sound chip actually sticks to the HDA standard and is not super brand-new or exotic, as do UAC compliant USB sound cards and the stuff that is officially supported by ALSA.