Quick fix for Mac Tahoe timing issues

Hi all, I just wanted to report a trick that fixed the timing for me. I’ve now upgraded to macOS 26.5 beta; I don’t know whether this works on previous versions. I tested with Octave 11.1 and PTB 3.0.22. My initial test on this OS confirmed the VSync error: the effective frame rate was halved. But when I changed my monitor refresh rate to 59.94 Hz, the problem went away, and it was still fine when I switched back to 60 Hz. Something about changing the system video refresh rate seems to have fixed the problem for me, at least temporarily. I haven’t done any testing to see how stable it is. VBLSyncTest reports 1 out of 600 stimulus presentation deadlines missed, as it should.

The computer is a Mac Studio with M2 Ultra chip and Pro Display XDR.

I wanted to try this on my laptop, but my MacBook Air (M4) does not allow changes to the refresh rate; the frame-skipping error was initially still present with OS 26.5 beta.

HOWEVER, starting in the 1710 x 1107 (Default) spatial resolution, I get the frame-skipping error. But when I switch to a higher (1920 x 1243) or lower (1440 x 932) resolution, the error goes away and VSync works fine. The first time I switched back to the default resolution, the VSync error returned, but the second time I switched resolutions and back (I may be a bit fuzzy on the details here), the VSync error was also fixed at the default resolution…

How odd.

I don’t know what combination of OS, Octave, PTB fixed it, but the problem is *almost* resolved.

Presumably once the Vsync error goes away, it will stay fixed until you reboot, and then you just have to switch spatial or temporal resolutions upon restart to fix it.

Keith

-----
Keith Schneider
Professor, Department of Psychological & Brain Sciences
University of Delaware

Confirmed this behavior on M1 iMac and Matlab 2026a. This computer also doesn’t allow the frame rate to be changed, but changing the screen resolution, running a Demo (e.g. DotDemo), then changing it back to the default resolution, solves the timing issue. I actually had to change the resolution back and forth twice to get it to work.

Keith

Hi Keith. You are a hero!

It is weird though, I did test different scaled resolutions to absolutely no effect on macOS 26.1, I think. I couldn’t test with external displays / dual-displays, or refresh rate other than 60 Hz, as I only had a M1 MacBookAir available for testing on its internal display at that time. Maybe this is some new behavior since 26.5 beta, or at least some version after 26.1?

My assumption based on a week of low level testing was that this was an intentional change to the compositor deadline, shifting it from close to end of frame to almost beginning of frame, most likely due to the liquid glass UI being more taxing to the gpu. That would have been a “cheap” engineering hack to hide performance issues. That would have been a hopeless scenario, as if Apple would have done this by design, there would have been zero chance of it ever getting fixed.

I guess now we are back to the more “uplifting” explanation of it being simply an especially absurd macOS 26 software bug due to severe incompetence and lack of care in testing on Apple’s side.

I guess it would be useful for others to also report their results. I can’t test this myself, as I can currently only use macOS 15.7.4. I also found the new liquid glass design quite annoying, so I’m not eager to upgrade. Maybe we can update guidance a bit, based on your and others’ results.

I had a quick play yesterday and experienced the same thing. I have not done anything detailed about order of doing things etc. But the basic observation I could replicate. Changes in refresh rate and resolution can have these effects.

I am running macOS 26.4. So, it seems the most recent beta is not needed.

As for the bigger picture, I am not involved in PTB code development, so could not comment.

I just use macOS to develop code. Nothing more. As it has been for a long time.

P

For whatever reason, I’ve found that it takes two changes of spatial resolution to get the timing to work. For example, change from default to higher resolution, and then back again. On the other hand, only one change of frame rate is necessary, e.g. from 60 Hz default to 59.94 Hz to fix the timing errors.

I can’t explain this bizarre behavior, but I’ve found this to be the case on both my laptop (M4) and iMac (M1).

Keith

p.s. the PsychVulkanCore-ERROR: vkWaitForPresentKHR(1): Failed due to timeout! error appears in the PTB output every single time the timing problem occurs, and never when timing is working correctly. So whatever is causing the Tahoe timing bug, it is consistently manifesting as a vkWaitForPresentKHR timeout.

I can also replicate the problem consistently. When I start Octave or Matlab, the timing of the first program I run (e.g. DotDemo) is a bit wonky. The timing on the second run is always bad, and the timing on the third and subsequent runs is fine.

Keith

p.p.s. Now I don’t think it had anything to do with changing the resolutions at all; it just depends on how many times I run a demo since starting Matlab or Octave. The first two times are bad, and the third time and thereafter are fine.

You mean performance goes back to normal at the 3rd run, e.g., 60 fps on a 60 Hz display? Even if you reboot and don’t do any resolution / refresh rate switching on macOS 26? I never observed that on macOS 26.1, so that would be a change for the better.

The pattern of vkWaitForPresentKHR errors on some runs, often the 1st one, and then less so on later runs, is something I also observed on macOS 15. The reason is completely unclear after lots of investigation. But others can reproduce these problems, e.g., in video games that use that function, unrelated to Psychtoolbox, so we know it is not a bug on our side. See the pull request “WIP: Fix semantics of VK_KHR_present_wait.” by kleinerm (Pull Request #2693, KhronosGroup/MoltenVK on GitHub). Right now, the MoltenVK Vulkan driver for macOS ships an intentionally broken version of that function that violates the Vulkan specification, because that works “better” for some video games, and they couldn’t figure out a way to make it work properly. My fix for the brokenness, which just restores their original correct implementation, was rejected until somebody figures out how to deal with macOS Metal brokenness in the timing domain.

Psychtoolbox ships my fixed version of that function atm., because for PTB it is a small step up, but still far from great. The only thing we know is that macOS underlying Metal functionality is broken and unreliable since the day it was introduced in macOS 10.15.

Another problem due to macOS Metal bugs is that any flip more than 1 second into the future will show similar malfunctions. It is not clear if this is a bug, or if Metal restricts presents to less than 1 second apart but doesn’t document this limitation anywhere.

Apple Silicon macOS is not yet at the level of reliability of Intel macOS wherever timing is concerned.

Unfortunately Apple removed all the mechanisms that allowed me to work around brokenness on Intel Macs, both in macOS on the software side, and with the proprietary Apple display hardware itself, so any new solution - if there is any - will come with fun new tradeoffs.

Yes, that’s right, this is very reliable. It happens every time I start a new Octave or Matlab session:

  1. Run 1 after starting a new Octave/Matlab session: partial failure, ~10% of frames missed. Also shows the CAUTION message: “Completing flips too early should never ever happen” — so the Metal timing is unreliable in both directions on run 1, not just late but also early.
  2. Run 2: near-total failure, 599/600 frames missed. Crucially, the output shows exactly ONE vkWaitForPresentKHR timeout, right at startup during calibration: PsychVulkanCore-ERROR: vkWaitForPresentKHR(1): Failed due to timeout! That single timeout during calibration appears to be enough to corrupt PTB’s timing model for the entire run.
  3. Run 3 and subsequent runs, no frames missed, accurate 60 fps on 60 Hz display.

This exact pattern occurs on all three of the computers I tested (M1, M2 Ultra, M4), and on both Octave and Matlab. It resets when Octave or Matlab is restarted, confirming that some persistent PTB state is carrying over between runs within the same session.

My earlier observations that rebooting and changing the resolutions fixed it were just a coincidence based on my testing sequence. It’s actually all tied to the initiation of a new Octave or Matlab session.

Keith

Ok, so macOS 26.4 did change something, both for the better and the worse, compared to macOS 26.1/2.

Run 1 after starting a new Octave/Matlab session: partial failure, ~10% of frames missed. Also shows the CAUTION message: “Completing flips too early should never ever happen” — so the Metal timing is unreliable in both directions on run 1, not just late but also early.

That’s a new one. Haven’t seen that at all so far.

Run 2: near-total failure, 599/600 frames missed. Crucially, the output shows exactly ONE vkWaitForPresentKHR timeout, right at startup during calibration: PsychVulkanCore-ERROR: vkWaitForPresentKHR(1): Failed due to timeout! That single timeout during calibration appears to be enough to corrupt PTB’s timing model for the entire run.

That’s what I always got on macOS 26.1 without exception, and nothing helped. That said, the one-time (or sometimes a few times on macOS 13 and 15), vkWaitForPresentKHR timeout is an unrelated problem caused by macOS Metal bugs already present in macOS 13, 14, 15, and possibly earlier versions.

Run 3 and subsequent runs, no frames missed, accurate 60 fps on 60 Hz display.

That’s new. Does VBLSyncTest([], w) provide happy results for w = 1, 2, 3, 60, 120?

On macOS 15 and earlier, w = 0,1,2,3,…,59 is fine, and for w>=60 it falls apart, iow. when requesting flips at least 1 second apart. On macOS 26.1 it was the same, except w=0,1 were failures with 599/600 missed.
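To make the link between w and the 1 second limit concrete, here is a small arithmetic sketch in Python (not PTB code). It assumes that w is the number of video refresh intervals to wait between flips and that w = 0 behaves like w = 1, i.e. flip at the next refresh; the function names are made up for illustration.

```python
def flip_interval_seconds(w, refresh_hz=60.0):
    """Requested time between flips when waiting w refresh intervals.

    Assumption of this sketch: w = 0 is treated like w = 1
    (flip at the next possible refresh).
    """
    return max(w, 1) / refresh_hz


def reaches_one_second(w, refresh_hz=60.0):
    """True if consecutive flips are requested at least 1 second apart."""
    return flip_interval_seconds(w, refresh_hz) >= 1.0


if __name__ == "__main__":
    for w in (0, 1, 2, 3, 59, 60, 120):
        interval = flip_interval_seconds(w)
        print(f"w={w:3d}  interval={interval:.4f} s  >=1s: {reaches_one_second(w)}")
```

On a 60 Hz display this puts w = 59 just below the limit and w = 60 exactly on it, which matches the boundary described above.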

There is some persistent state throughout a running Matlab/Octave session, and I’ve seen this “first run after Matlab/Octave launch bad, successive runs fine” behavior already on earlier macOS versions, only sometimes and randomly though. The PsychVulkanCore mex file prevents itself and the Vulkan runtime from being completely shut down and cleared out of memory on macOS, as PTB mex files would usually do during clear all, clear mex etc., to work around other macOS bugs. On other operating systems it shuts down completely on a clear, but on macOS it can’t. How this can cause the observed behavior is completely unclear though, as all the bits and state that interact with the windowing system and relate to the timing problems do get shut down completely whenever an onscreen window closes. And how re-running the same script three times in a row can cause a reproducible sequence of “healing itself” is also quite magic. Random memory corruption or state bleeding bugs normally cause things to get worse on successive runs, not to get better from a broken state. Ofc. we don’t know what the underlying Apple macOS frameworks like Metal and CoreAnimation do internally: While PsychVulkanCore is loaded, the MoltenVK Vulkan driver doesn’t shut down completely, and therefore the Apple macOS frameworks don’t shut down completely. But a behavior change like this across macOS versions suggests the cause for the bugs or “miracle repairs” is in macOS and its frameworks and window server, not in PTB or the open-source MoltenVK Vulkan driver.

That’s new. Does VBLSyncTest([], w) provide happy results for w = 1, 2, 3, 60, 120?

On macOS 15 and earlier, w = 0,1,2,3,…,59 is fine, and for w>=60 it falls apart, iow. when requesting flips at least 1 second apart. On macOS 26.1 it was the same, except w=0,1 were failures with 599/600 missed.

No, 0, 1 and 2 work fine, 0/600 flips missed. However, for 3, 60 and 120, 15/600 flips are missed and the error below occurs for each flip missed:

PsychVulkanCore-ERROR: vkWaitForPresentKHR(1): Failed due to timeout!

PsychVulkanCore-ERROR: PsychPresent(1): Failed to retrieve visual stimulus onset

Keith

But the misses for w=3 are not consistent each time? Or usually only at the beginning of a run?

They are consistent and occur systematically except at the beginning of the run.

Here are some plots (showing both figures from VBLSyncTest only for the larger W’s).

(Attachments w0.pdf, w1.pdf, w2.pdf, w3a.pdf, w3b.pdf, w60a.pdf, w60b.pdf, w120a.pdf and w120b.pdf are missing)

Hi Mario, it won’t let me post the images, but no, the missed frames are regularly spaced out, but not at the beginning of the run. There are 15 missed frames each time, for the higher w’s. The w=3 case intermittently has either no missed frames or the same 15 as the higher runs.

Keith

This is so weird! Regularly distributed over a run, not all or most at the beginning, and each time a vkWait… timeout error. This goes counter to pretty much any half way plausible mental model or prediction one can form about how certain types of more likely bugs can cause problems. And, again, different from how older macOS versions behave.

The vkWait… error happens when the macOS Metal+CoreAnimation framework, or the macOS WindowServer aka Quartz display server, “scrambles” the order in which feedback about completed image presentations is delivered. Essentially PTB + the MoltenVK Vulkan driver tags each to-be-presented image with a unique serial number n and a target presentation time and hands it over to the proprietary closed-source macOS Metal + CoreAnimation frameworks. Then vkWait… waits for feedback from macOS/Metal that the image with this serial number n has been presented, along with a timestamp of when that image was presented. This feedback is run back through MoltenVK/Vulkan’s accounting and bookkeeping, then to PTB with its own accounting, bookkeeping, checking etc., and returned as the Screen('Flip', ...) stimulus onset timestamp.

A properly working graphics system will receive frame with tag n, process and display it at the proper time, or a bit later in case of overload or too late submission, then immediately after successful presentation will return the feedback to the Vulkan driver which finishes the vkWait… for tag n.

What every single macOS version before macOS 26 so far randomly did is scramble this sequence at random times for random unclear reasons, e.g.,

  1. Submit frame n → vkWait for frame n → macOS does or does not actually present frame n, but in any case does not signal completion back. vkWait times out after one second, PTB gives up, prints a warning/error message, fakes a half-way plausible but still wrong timestamp, script goes on based on that.

  2. Submit frame n+1 → vkWait for frame n+1, same sequence of failure as 1.

  3. Submit frame n+2 → vkWait for frame n+2, macOS suddenly reports back completion feedback for frames n, n+1, and n+2, so now vkWait for n+2 completes successfully, and Vulkan/PTB throw away the stale feedback for frames n and n+1, process frame n+2 and successfully report proper timestamps etc. for frame n+2.

  4. Now suddenly the process is unjammed, and everything works for frames n+3, n+4, …

  5. So the user observes weird multi-second stutter and printed warnings and errors for the first few Screen('Flip') calls and then suddenly things work smoothly…

  6. …until they don’t, e.g., because there was a pause in stimulus presentation of at least 1 second and the system starts to malfunction and jam up again for a few frames, until it unjams, or forever because it never recovers at all.
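The jam/unjam sequence in the steps above can be sketched as a toy simulation in Python. This is only a model of the described symptom, not MoltenVK or PTB code: the function name, the `jammed_frames` parameter and the way feedback is "held back" are all assumptions made for illustration.

```python
def simulate_present_wait(num_frames, jammed_frames, refresh_s=1 / 60, timeout_s=1.0):
    """Toy model of scrambled present-completion feedback.

    Feedback for the first `jammed_frames` frames is withheld by the modelled
    display server and only delivered in one burst, together with the feedback
    for the next frame. Returns a list of (serial, status, time) tuples.
    """
    pending = {}    # feedback the modelled display server is holding back
    delivered = {}  # feedback that has reached the waiting application
    log = []
    now = 0.0
    for n in range(num_frames):
        pending[n] = now + refresh_s  # frame n would be shown at the next refresh
        if n >= jammed_frames:
            # Unjam: everything held back so far arrives at once.
            delivered.update(pending)
            pending.clear()
        if n in delivered:
            # Stale feedback for older serial numbers is thrown away.
            for stale in [k for k in delivered if k < n]:
                del delivered[stale]
            now = delivered.pop(n)
            log.append((n, "ok", now))
        else:
            # The wait gives up after the timeout; no valid timestamp exists.
            now += timeout_s
            log.append((n, "timeout", now))
    return log


if __name__ == "__main__":
    for serial, status, t in simulate_present_wait(num_frames=5, jammed_frames=2):
        print(f"frame n+{serial}: {status:7s} at t={t:.3f} s")
```

With `jammed_frames=2` the first two waits time out after one second each, and the third wait succeeds and discards the stale feedback for the first two frames, mirroring steps 1 to 4.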

None of this makes sense, except it being 100% a severe bug in macOS graphics and display system, which Apple never ever fixed, not since the mechanism was introduced in macOS 10.15 over six years ago. The error pattern varies between macOS versions, and sometimes between Intel Macs (where we don’t care much, as we have other and much more reliable “PTB unique” ways of fixing this - only all other toolkits are broken on macOS Intel) and Apple Silicon Macs - where there is no known simple, elegant, efficient, or even somewhat hacky way around this mess.

All these problems also happen with other applications, essentially with any applications that require any kind of half-way precise and reliable feedback about how image presentation went.

So “things jam up at the beginning and after long pauses in stimulus presentation, with vkWait… timeout errors” makes sense from the mental model of the brokenness of macOS.

macOS 26.0 added an additional problem: The system expects a new stimulus image targeting video refresh cycle m+1 to be submitted within fractions of a millisecond after the start of video refresh cycle m, because the composition cutoff deadline of the WindowServer has been shifted from close to the end of a cycle to very close to the beginning of a cycle. So whenever an application waits for feedback about present completion of a given frame for video refresh cycle m, it will only get that feedback at the beginning of cycle m in the best case scenario, but then it will only have << 1 millisecond to prepare, render and submit the next stimulus frame targeting cycle m+1. Less than 1 millisecond is almost never enough time for all the needed processing, and so the composition deadline is missed and the frame targeting m+1 will not display at m+1 but only one cycle later at m+2, and you have a missed Flip deadline → Now each Flip takes at least two video refresh cycles and the maximum achievable presentation rate is cut to half the video refresh rate of your display, iow. maximum 30 fps on a 60 Hz display.

These were my results of testing macOS 26.0 and 26.1 on an M1 MacBookAir for over a week in December with every diagnostic tool I had available, and comparing behavior to macOS 14 and macOS 15 on the M2 Pro MacBookPro and macOS 13 on the 2017 Intel MacBookPro with AMD graphics.
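The arithmetic behind the halved frame rate can be sketched in Python as follows. This is a simplified model under stated assumptions: feedback arrives exactly at the start of a cycle, the application then needs `work_s` seconds to render and submit, and the compositor cutoff sits `cutoff_s` seconds after the start of each cycle. The concrete numbers (3 ms of work, cutoffs of 0.5 ms and 15 ms) are hypothetical, chosen only to contrast an early cutoff with a late one.

```python
import math


def cycles_per_flip(work_s, cutoff_s, refresh_hz=60.0):
    """Refresh cycles consumed per flip in this simplified model.

    work_s:   time from receiving feedback (start of cycle m) to submitting
              the next frame.
    cutoff_s: composition cutoff, measured from the start of each cycle.
    """
    period = 1.0 / refresh_hz
    if work_s <= cutoff_s:
        return 1  # submitted before the cutoff: frame shows at cycle m+1
    # Each further cutoff arrives one period later; the frame shows one
    # cycle after the first cutoff that it makes.
    return 1 + math.ceil((work_s - cutoff_s) / period)


def max_fps(work_s, cutoff_s, refresh_hz=60.0):
    """Maximum achievable flip rate under the model."""
    return refresh_hz / cycles_per_flip(work_s, cutoff_s, refresh_hz)


if __name__ == "__main__":
    work = 0.003  # hypothetical 3 ms of render + submit time
    print("cutoff near end of cycle (15 ms):   ", max_fps(work, 0.015), "fps")
    print("cutoff near start of cycle (0.5 ms):", max_fps(work, 0.0005), "fps")
```

Under these assumptions a late cutoff allows one flip per refresh cycle, while an early cutoff pushes every flip to the following cycle and caps the rate at half the refresh rate.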

I assumed this shifting of the composition deadline was intentionally done by Apple, not a bug in the strict sense, to hide or reduce severe performance problems caused by macOS Tahoe’s new (and well hated by many, including myself) “Liquid Glass” GUI. The Liquid Glass UI puts substantially more load and stress on the gpu and graphics drivers, and can more easily overwhelm a relatively weak integrated gpu like the ones used in Apple Silicon Macs, causing slowdowns, sluggish UI behavior, stuttering animations and reduced runtime on battery. Shifting the composition deadline to the beginning of a cycle would hide or reduce the effects of these performance problems, while at the same time only really punishing applications that need precisely controlled visual presentation timing, or exact audio-video sync in some specific scenarios. Other run of the mill interactive apps would only be affected in a way that is not too perceptible by Joe average user. If this assumption is true then it would be a no-win scenario for any properly working vision science toolkit, as you can only have either precise timing control, or performance, but not both.

So your results now suggest that Apple changed or “fixed” something sometime between macOS 26.2 and macOS 26.4 (according to Peter’s results) or macOS 26.5-beta (according to your results), and the composition deadline may no longer be nailed to the beginning of a frame, which is somewhat better for us (a glimmer of hope), but at the same time they might have broken yet another thing, because the new error pattern doesn’t match any of the previous patterns, or diagnostic results, or my mental models of brokenness.

Having skipped frames in the middle of runs is usually a sign of temporary load changes or resource shortages, and having them on slower running animations (for w > 1 cases in your tests) can be a sign of gpu dynamic power management interfering in very unwelcome and suboptimal ways. But skipped frames in the middle of otherwise continuous animations, accompanied by vkWait… 1 second timeouts, would be something entirely new with no good explanation at all.

It’s a jungle out there. Not sure if more feedback or testing helps right now. This testing was useful though, as just something additional to take into account in this zoo of macOS display absurdities. Keep it coming if you learn something new with a future macOS release, much appreciated!

My own dedicated “high risk work” machine for Tahoe testing is literally an island away right now with my girlfriend, out of reach for direct hands on testing probably for a couple of months, and I can’t risk upgrading the M2 Pro MBP from macOS 15 atm., until I’ve found a solution to the pre-Tahoe macOS problems that establishes a new baseline to reliably work from. There is one possibly effective idea left for dealing with macOS timing problems, but it is complex and time consuming to implement and test, and it is based on other pieces of work from Linux land that first need to fall into place, and it needs macOS 13 and 15 as baselines first. So not all hope is lost, but it will take considerable time until that is ready for testing. And it may or may not work on macOS 26 even if it would work on earlier versions of macOS.

Atm. and for the last 8+ months everyone and everything, from Apple, to NVidia and AMD on MS-Windows, to “the Linux ecosystem in general”, to our online shop / reseller FastSpring, is throwing software-, process-, policy- and business obstacles in my way, I really feel like being locked into a jump and run game. So expect a lot of slow or even negative technical progress and depressing outcomes in the next months, because apparently nothing can ever go right ever for more than a month or two per year, just long enough to create a false illusion of cautious optimism…

Yes, it is quite odd. Once the steady state has been reached (after several runs), the output of VBLSyncTest([], x) for x=60,120 consistently gives 15 missed flips per run, with a swap delay graph like the attached. The x=3 case is intermittent, but the x=0,1,2 cases always have 0/600 flips missed.

As you can see, there’s some random delay for the first ~125 flips, and then after that it enters a periodic regime with 15 flips missed total, every 30 flips or so.
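One way to check for such a periodic regime from recorded flip timestamps is sketched below in Python. It is a minimal sketch, not part of VBLSyncTest: the function names are invented, a flip counts as "missed" when it arrives more than half a refresh period later than requested, and the timestamps are synthetic (one extra cycle inserted every 30 flips), not measured data.

```python
def missed_flip_indices(timestamps, refresh_hz=60.0, w=1):
    """Indices of flips that arrived at least one refresh cycle late.

    Assumes flips were requested max(w, 1) refresh intervals apart.
    """
    period = 1.0 / refresh_hz
    expected = max(w, 1) * period
    missed = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > expected + 0.5 * period:
            missed.append(i)
    return missed


def miss_spacing(missed):
    """Number of flips between consecutive misses."""
    return [b - a for a, b in zip(missed, missed[1:])]


if __name__ == "__main__":
    # Synthetic example only: one skipped cycle every 30 flips.
    period = 1.0 / 60.0
    cycle = 0
    synthetic = []
    for i in range(200):
        cycle += 2 if (i > 0 and i % 30 == 0) else 1
        synthetic.append(cycle * period)

    missed = missed_flip_indices(synthetic)
    print("missed flips at:", missed)
    print("spacing:", miss_spacing(missed))
```

A constant spacing in the output would indicate a periodic cause, whereas load spikes would be expected to give irregular spacing.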

This is with the latest 26.5 beta and is consistent for M1, M2 and M4 chips using Octave or Matlab.

Keith

I would be fine to add to any testing, but it seems that would not ATM be super helpful.

I can however confirm that I get the same basic result. Just run things 3+ times and it seems to work. By “work” I have not done any digging or systematic investigation (but would be happy to help if I can). I am running 26.4.1

On the positive side: I think it is great that PTB continues to be the most rigorously tested system out there for stimulus presentation. For my part, I have not trusted macOS for real experiments for years. But it is brilliant to continue to develop code on such a system. The cross-platform nature of PTB remains outstanding.

Sorry to hear of your hassles @mariokleiner. If it is any help, working at a university is the same or worse. The sheer inefficiency to get even basic things done is mind-blowing.

P

For what it is worth, I have very quickly tested with Matlab 2026a. I obviously did not expect any differences, but thought “why not”.

I have attached images of run 1, 2, and 3 of VBLSyncTest.

Note: I quite regularly get a “blocky” version as seen in 2. Sometimes there are more “blocks” than this, but it seems to toggle between two values on the second run consistently.

Please also note, I ran this very quickly with a dual monitor setup between meetings. The PTB window was the native laptop window.

Happy to do further testing, but obviously if there is no point, no worries.

P

Oh, fun new mysteries! No real time to look into this right now, but

The blue flip delta plots from run 1 and 2 look as smooth as I’d expect on a correctly working Apple Silicon macOS: The blue line with the timing measurements is rather horizontal / non-wiggly with only noise in the low microsecond range, suggesting properly working time stamping. But performance sucks → One Flip only every 2nd refresh cycle, just like on older macOS 26 versions.

Run 3 looks like performance problem is “fixed”, but the timestamp noise is wayyyyyy too high, looks like almost +/- 1 msec! This suggests broken and possibly no longer trustworthy / wrong time stamping. So now I think it probably doesn’t fix itself at 3rd run, but instead it is even more broken → Some new failure of some timing related mechanism collaborates with another existing bug to provide the appearance of improved performance. This is very bad if that is what happens. Psychtoolbox has some fallbacks for timestamping failure, but these usually come with warning messages at least, so if it stays silent, then that’s a new type of unrecognized failure.
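The difference between "low microsecond" noise and "almost +/- 1 msec" noise can be quantified from the flip timestamps with a few lines of Python. This is only a sketch: the 100 microsecond threshold is an arbitrary assumption for illustration, not a PTB criterion, and the timestamps are synthetic.

```python
import statistics


def flip_delta_stats(timestamps):
    """Mean, standard deviation and peak-to-peak spread of flip-to-flip deltas."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    spread = max(deltas) - min(deltas)
    return statistics.mean(deltas), statistics.pstdev(deltas), spread


def looks_trustworthy(timestamps, max_spread_s=100e-6):
    """Heuristic: flag runs whose delta spread exceeds an assumed threshold."""
    _, _, spread = flip_delta_stats(timestamps)
    return spread <= max_spread_s


def synthetic_timestamps(noise_s, n=100, refresh_hz=60.0):
    """Synthetic timestamps with alternating +/- noise_s jitter (illustration only)."""
    period = 1.0 / refresh_hz
    t, out = 0.0, []
    for i in range(n):
        t += period + (noise_s if i % 2 == 0 else -noise_s)
        out.append(t)
    return out


if __name__ == "__main__":
    for label, noise in (("microsecond-level jitter", 2e-6),
                         ("millisecond-level jitter", 0.9e-3)):
        ts = synthetic_timestamps(noise)
        mean, sd, spread = flip_delta_stats(ts)
        print(f"{label}: mean={mean * 1e3:.3f} ms  sd={sd * 1e6:.1f} us  "
              f"spread={spread * 1e6:.1f} us  ok={looks_trustworthy(ts)}")
```

Applied to the timestamps from runs 1 to 3, a check like this would put a number on how much noisier run 3 is than the earlier runs.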

Do you observe any new warning or error messages, or any PsychVulkan status messages in run 3 vs. 2/1?

Calling PsychVulkan('Verbosity',9) before a run would give more debug output, flooding the screen with debug messages. That would be especially interesting for run 3 with the noisy timestamps. Also, PTB is not prepared to deal with ProMotion mode or VRR mode on internal/external displays.