How to perform "Flip" on multiple displays without reducing the frame rate?

We are trying to play video/animation on a haploscope. The left-eye and right-eye displays are on separate XWindow screens. Running Screen('Flip') on both displays reduces the frame rate by half (e.g., from 30 to 15 or from 60 to 30). Rendering is not the bottleneck (it runs at >120 fps).

Here is what I have tried so far:

A) Multiflip:
When I do:

 when = 0;
 dontsync = 0;
 multiflip = 1;
 dontclear = 2;
 Screen( 'Flip', win_left, when, dontclear, dontsync, multiflip );

only one screen is flipped, the other one is blank. If I add

 Screen( 'Flip', win_right, when, dontclear, dontsync, multiflip );   

both screens flicker (horribly) between blank and rendered content.

B) Async flip:
When I do:

 when = 0;
 dontsync = 1;
 multiflip = 0;
 dontclear = 2;
 Screen( 'Flip', win_left, when, dontclear, dontsync, multiflip );   

I get 60 Hz, but only one side is rendered. When I add:

 Screen( 'Flip', win_right, when, dontclear, dontsync, multiflip );   

the frame rate drops to 30 Hz.

Note that there is almost no overhead from rendering to the second display: I just draw the same texture into win_left and win_right.

We see the same problem on Ubuntu (specs below) and a Windows machine.

How do I correctly perform Flip on a system with two separate screens so that the frame rate does not drop?

NVIDIA-SMI 470.256.02   Driver Version: 470.256.02   CUDA Version: 11.4

Linux version 5.15.0-122-generic (buildd@lcy02-amd64-106) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #132~20.04.1-Ubuntu SMP Fri Aug 30 15:50:07 UTC 2024

PsychtoolboxVersion
ans =
    '3.0.19 - Flavor: Manual Install, 11-Aug-2024 21:22:37'

On Linux you would use one X-Screen with both displays connected. So either use only a standard dual-display setup connected to the standard X-Screen / PTB screen 0, or, if you also have a separate experimenter display for Matlab/Octave and the desktop GUI, attach that to X-Screen 0 and attach both haploscope displays to X-Screen 1. XOrgConfCreator + XOrgConfSelector + logout + login will set that up conveniently for you on all recommended gpus with their open-source drivers. I think it may also work with the generally not recommended NVidia gpus + their proprietary graphics drivers, or you could use the nvidia-settings GUI utility to create such a triple-display setup.

Then you open a single onscreen window with stereomode 4 and PsychImaging; just follow the code of ImagingStereoDemo(4).
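
The skeleton would look roughly like this (just a sketch; it assumes the haploscope displays form X-Screen 1, and texLeft/texRight are placeholders for whatever textures you want to show):

 % Sketch of dual-display stereo via stereomode 4, following ImagingStereoDemo(4).
 PsychDefaultSetup(2);
 PsychImaging('PrepareConfiguration');
 win = PsychImaging('OpenWindow', 1, 0, [], [], [], 4); % screen 1, stereomode 4

 % Per frame: draw the left-eye and right-eye images into their stereo buffers...
 Screen('SelectStereoDrawBuffer', win, 0); % 0 = left-eye buffer
 Screen('DrawTexture', win, texLeft);
 Screen('SelectStereoDrawBuffer', win, 1); % 1 = right-eye buffer
 Screen('DrawTexture', win, texRight);

 % ...then a single Flip presents both displays together:
 Screen('Flip', win);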

On modern AMD graphics cards with suitable displays connected, on modern Linux versions, e.g., two displays of the same model from the same vendor with the same display settings, both connected via the same connector type, e.g., DisplayPort, I think the display driver will automatically synchronize the video refresh cycles of both displays. There is proper hardware auto-sync setup code implemented inside the amdgpu kernel driver to enable this at the hardware level, but I could never properly test it due to the lack of two identical monitors. On older AMD cards, Psychtoolbox itself has built-in low-level tricks to perform such a sync reasonably well. GraphicsDisplaySyncAcrossDualHeadsTestLinux has all the info to test dual-display sync on Linux. PerceptualVBLSyncTest with the optional testdualheadsync parameter is another, visual, way to test proper sync between displays.
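
To run these checks, something like the following at the Matlab/Octave prompt should do (a sketch; see the help texts of both scripts for the optional parameters):

 % Check if the refresh cycles of both displays are actually synchronized.
 GraphicsDisplaySyncAcrossDualHeadsTestLinux;      % plots scanout positions + their difference
 PerceptualVBLSyncTest([], [], [], [], 120, 0, 1); % visual check with testdualheadsync enabled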

I can't remember or don't know if or how current NVidia GeForce consumer gpus handle this; for the more expensive Quadro gpus there are some mechanisms.

Thank you for your response. But let me clarify.

We have two haploscope setups, and we have the same issue with both:

(1) On Ubuntu

We have 3 GPUs driving 9 displays. We cannot put left- and right-side displays on the same screen because they are on separate GPUs.

(2) On Windows

We have a single GPU driving 4 displays. Windows puts each monitor on a separate screen. We could potentially use screen “0” spanning all displays (as done in ‘ImagingStereoDemo’), though it would be rather inconvenient (we use one monitor for debugging).

I do not mind a small asynchrony due to using different GPUs, but we cannot get our experiment working if the frame rate is halved.

I tried all combinations of parameters of the Flip command but I could not get the native frame rate.

I started going through PsychFlipWindowBuffersIndirect, but the code is rather complicated.

Is there a way to perform a low-level glXSwapBuffers without executing the extra code in PsychFlipWindowBuffersIndirect?

Ok, I have resolved the issue on Ubuntu, and I must admit it was my mistake.

I called Screen('Flip') for each of the two windows opened on the same screen (four times in total). When I call Screen('Flip') once per screen, I get the native frame rate.
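
For reference, the per-frame loop now looks roughly like this (sketch; winLeft and winRight are the fullscreen windows on the two screens, tex is the texture shown on both):

 % One fullscreen window per screen, and one Flip per screen per frame.
 Screen('DrawTexture', winLeft, tex);
 Screen('DrawTexture', winRight, tex);
 Screen('Flip', winLeft, [], 2, 1); % dontclear = 2, dontsync = 1: schedule, don't block here
 Screen('Flip', winRight, [], 2);   % this flip waits for the second display's refresh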

I will check the Windows setup next and post an update below.

What?! What kind of haploscope is this? For spiders or aliens with 8 - 9 eyes?

I think I’ll need a very detailed description of what the experimental setup and purpose is, what exact graphics cards models you have, how they are connected to which displays etc. to even think about if this is solvable at all, and under which conditions.

The normal way to do this is X-Screen 0 attached to the single “experimenter display” or a set of them, with the regular desktop GUI, Matlab/Octave, other stuff etc.

Then another X-Screen for each set of displays that you want to operate in sync, without tearing or other desynchronization artifacts, e.g., one X-Screen per experiment subject you want to stimulate in a multi-subject paradigm. E.g., X-Screen 1 connected to 2 displays for binocular stimulation, what I would understand as a typical haploscope for two-eyed humans or animals, then stereomode 4 for properly addressing the left/right eye stimuli. Or some home-grown rendering to the proper areas of a window in standard mono mode, spanning all displays of an X-Screen, for triple/quad/… display setups.

Linux allows for some optimizations not available on other operating systems, but only with graphics cards that have open-source drivers, in your case realistically AMD or Intel.

Way more information is needed here.

Using screen 0 only works for dual-display stimulation, as that always combines screen 1 and screen 2 into one virtual monitor, with some optimizations for that specific case. It won't work with a triple-display stimulation setup.

Again way more detail about your setup, paradigm, used graphics hardware etc. is needed.

Update: Wrt. glXSwapBuffers or its equivalent on MS-Windows, you essentially get that if in `Screen('Flip', window, when, dontclear, vblsynclevel);` you set when == 0 or [] to request a flip asap, and vblsynclevel = 1 to not block until the stimulus presentation is completed, maybe dontclear = 2 as well. But you'd lose any time-stamping to check your timing, and it won't speed things up in the way you may hope for. In the end your code could run ahead unthrottled for at most one frame duration before it would block again, throttled down to the refresh cycle of the specific display. Only running with vblsynclevel = 2 would avoid that by disabling vsync, but then all your stimuli would be just undefined awful mush. So there isn't a simple way like that to solve your problem. More information about your setup may point to a way though.
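
In code that boils down to roughly this (sketch):

 % Sketch: request the flip asap and return immediately, without waiting for stimulus onset.
 % when = 0: flip at the next possible vblank; dontclear = 2: keep the backbuffer;
 % vblsynclevel/dontsync = 1: flip is still vsynced, but Flip returns without valid timestamps.
 Screen('Flip', window, 0, 2, 1);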

This may take serious thinking on my side, which will require at least one paid support token paying for 30 minutes of my time, possibly/likely more paid time.

But let's first find out what we are dealing with in this unusual-sounding setup. Proper performance will likely be challenging, depending on the very specific situation.

Mario, thank you for all the help.

To follow up on this, the issue was resolved for our first haploscope, which we control from Linux/Ubuntu. As you wrote, running Screen( 'Flip', ... ) with dontsync = 1 gave the expected frame rate. Previously, I was running Flip twice for two different windows on the same screen, and that caused the halving of the frame rate.

I could also achieve the right frame rate on my Windows "development" machine, but, strangely, the frame rate is still half of what is expected (30 fps instead of 60 fps) on the "experiment" machine connected to the other haploscope. This one has two Eizo Prominence displays, both running at 10 bits per channel, 4K resolution and 60 Hz. We will try running everything from Linux/Ubuntu and putting the left- and right-eye windows on the same XWindow screen, as you suggested.

I could not get multiflip = 1 to work. I am unsure whether ‘multi’ refers to windows or screens.

We use Nvidia cards (4090, 3080 and 3090). The unusual 8-display haploscope is our ultra-realistic high-dynamic-range multi-focal stereo display (here are some photos - Computer Laboratory – Projects: Reproducing Reality with a High-Dynamic-Range Multi-Focal Stereo Display).

The best way to thank me is to pay me with a support membership → help PsychPaidSupportAndServices, otherwise the time I can spend on trying to help you is very limited - technically non-existent, but your questions tickle my inner geek, so I’m a bit more generous than I would normally be for free.

That is not a good approach if you care about proper timing or any certainty that the display timing is correct. Flips will be vsync'ed and tear-free with dontsync=1, but all timing, timestamping and timing checks will be toast. You definitely want either only the two displays of your haploscope connected, or the haploscope displays on a separate X-Screen 1. If the displays are identical models at identical settings on identical video outputs, e.g., both DisplayPort, then, depending on your graphics card, driver and configuration, the video refresh cycles should be synced and things should work with regular flip settings and proper timing. GraphicsDisplaySyncAcrossDualHeadsTestLinux allows you to test for proper sync, although it might need additional setup when running on NVidia's proprietary graphics driver. It plots the sampled scanout positions of the two displays under test, so one should see well-aligned saw-tooth plots for display 1 and 2, and a difference plot that is mostly a horizontal line close to zero, with possible infrequent spikes. PerceptualVBLSyncTest([],[],[],[],120, 0, 1) allows for a visual check of sync across displays as well. You should see an intentionally created tear line on both displays; the vertical position of the horizontal tear should be the same on both displays if they are synchronized, as opposed to being vertically offset, or one of the displays having a vertically moving/drifting tear line.

Without synced refresh, this is pretty hopeless, even at dontsync=1, as one or the other display will always throttle flips in a variable/shifting beat pattern, depending on how and how much the display refresh cycles drift, e.g., something like running at proper 60 fps for a while, then suddenly falling down to 30 fps or stuttering at in-between rates, then going back to 60 fps, etc. Depending on luck, and on how long you run a test, you may or may not notice this problem always and immediately, but it would always be there.
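
A simple way to check whether you are hit by this is to log the returned flip timestamps over a few thousand frames and look at the inter-flip intervals, roughly like this (sketch; assumes a 60 Hz display, with win and tex already set up):

 % Sketch: record flip timestamps and look for a beating pattern between ~16.7 and ~33.3 ms.
 n = 3000;
 t = zeros(1, n);
 for i = 1:n
     Screen('DrawTexture', win, tex);
     t(i) = Screen('Flip', win); % vbl timestamp of this flip
 end
 dt = diff(t) * 1000; % inter-flip intervals in msecs
 fprintf('median %.2f ms, max %.2f ms, frames longer than 20 ms: %d\n', ...
     median(dt), max(dt), sum(dt > 20));
 plot(dt); xlabel('frame'); ylabel('flip-to-flip interval [ms]');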

The way I understand it from your postings, and also the other thread, you use NVidia GeForce consumer-class graphics cards with the proprietary drivers installed on both Ubuntu Linux and MS-Windows, right? With the NVidia proprietary driver, some of PTB's Linux goodies won't work, so you are pretty much at the whim of NVidia's proprietary graphics driver as to if or when it syncs.

Using our test scripts mentioned above would be the way to figure out if things are working, or if you are somewhat fooling yourself.

That said, I tried yesterday with my only NVidia card, a GeForce GTX 1650, and if I chose a meta-mode with identical resolution and refresh rate in nvidia-settings, my test scripts mentioned above did indicate that both displays on the X-Screen were synchronized, so probably it would be the same for you on Linux if all display settings etc. were correct. Then one wouldn't expect a regular Screen('Flip', window) to run at a throttled 30 fps. Standard timing tests like PerceptualVBLSyncTest or VBLSyncTest, or ImagingStereoDemo with stereomode 4, should confirm that.

Oh, also have a look at AsyncFlipTest or MultiWindowLockstepTest for how to use async flips to do flipping + timestamping in the background on a separate thread, while the Matlab script already draws the next stimulus image. It requires more complex coding, but allows you to use the time your script would normally spend waiting for a vsynced flip to complete for something else, e.g., drawing the next stimulus on the gpu, doing keyboard/mouse/hardware i/o, or other logic that executes on the cpu. It does come with a bit of its own performance overhead though, and the gains for graphics workloads will vary, as not all stimulus post-processing can be parallelized with pending flips iirc, only some of it.
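
The basic pattern is something like this (sketch; prepareNextStimulus is just a placeholder for whatever cpu-side work your script does per frame):

 % Sketch of async flips, cf. AsyncFlipTest / MultiWindowLockstepTest:
 % the flip of the current frame completes on a background thread while Matlab keeps working.
 Screen('AsyncFlipBegin', win);        % schedule flip of the drawn frame, returns immediately
 nextStim = prepareNextStimulus();     % placeholder: compute the next stimulus on the cpu
 vbl = Screen('AsyncFlipEnd', win);    % block until the pending flip has really completed
 Screen('DrawTexture', win, nextStim); % draw the next frame, then repeat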

Btw., another thing to mention wrt. dual-display haploscope use cases is that our Vulkan based HDR display modes are currently supported on Linux only with AMD graphics cards, where a special hack exists for dual-display stereo HDR. On MS-Windows, stereo dual-display HDR, at least in my testing in late 2020, exposed various HDR and deep-color 10 bpc related proprietary driver bugs, so I don't know if that would work nowadays. And the before-mentioned async flip machinery does not yet work in HDR mode at all, only conventional Screen('Flip').
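
For completeness, enabling that HDR mode is just one extra PsychImaging task before opening the window, roughly (sketch; as said, for the dual-display stereo case this is only expected to work on Linux + AMD):

 % Sketch: Vulkan based HDR output combined with dual-display stereo (stereomode 4).
 PsychImaging('PrepareConfiguration');
 PsychImaging('AddTask', 'General', 'EnableHDR');
 win = PsychImaging('OpenWindow', 1, 0, [], [], [], 4); % haploscope X-Screen, stereomode 4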

That’s what basically never works. One screen == One fullscreen window is the rule, or things will really suck in many different ways wrt. timing, sync etc.

The way to maybe make this work on Windows + NVidia would be to use "NVidia Surround" mode to combine the two haploscope displays into one virtual monitor (PTB screen 2, with stereomode 4 or 5) and leave the experimenter display as PTB screen 1. And, of course, never touch the keyboard or mouse while running an experiment in a MS-Windows multi-display configuration, as that will destroy timing due to MS-Windows system limitations as soon as a window other than PTB's onscreen window gets input focus, e.g., due to a mouse click into the Matlab window or another window, or due to ALT-Tabbing there. Windows is very fragile on multi-display.

Also, surround mode only supports at most 3 displays in sync, an arbitrary restriction, so that NVidia can sell you way more expensive pro class gpus for such use cases.

Or only have the two haploscope monitors connected to the machine and use PTB screen 0 with stereomode 4, to be more click-resistant. However, Surround does not work with dual-display setups. And check video sync with PerceptualVBLSyncTest with proper parameters, or maybe via GraphicsDisplaySyncAcrossDualHeadsTest.

For Surround mode setup and limitations see:
https://nvidia.custhelp.com/app/answers/detail/a_id/5335/~/gettings-started-with-nvidia-surround

That is trying to flip all open onscreen windows in sync: either without any tearing for multiflip=1, but with a chance of not flipping in sync and of throttling the framerate, or with some chance of tearing on all but one window for multiflip=2, in exchange for no throttling. Both modes can still cause desync of flips, as it is a best-effort method, not a guarantee. Returned timestamps always refer to the onscreen window whose window handle was specified in the Screen('Flip', window, ..., multiflip) call. No guarantee is made that the other windows really flipped in sync, only that it tried. On Linux with open-source drivers, i.e., not NVidia gpus with proprietary drivers, the chances of success are probably higher, but still not guaranteed.
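
In other words, you only call Flip once per frame, on one master window, roughly like this (sketch):

 % Sketch: one Flip call tries to flip all open onscreen windows.
 % multiflip = 1: all flips vsynced and tear-free, but framerate may throttle without synced refresh.
 % multiflip = 2: only win_left is vsynced, the other windows swap immediately and may tear.
 multiflip = 1;
 vbl = Screen('Flip', win_left, [], 2, 0, multiflip); % timestamp refers to win_left only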

Given that for proper timing and performance etc. an onscreen window must always fully cover a Psychtoolbox screen, flipping all windows is effectively synonymous with flipping all Psychtoolbox screens.

I guess the multiflip stuff would be the cheap but hacky way to maybe get your 8-display super-haploscope to behave well enough timing-wise. For dual-display haploscopes you'd want to connect both displays to one gpu and use one of the other strategies mentioned.

That is a very cool setup, combining a lot of bleeding-edge research into one system. I have some memories from my time as a computer graphics student of lightfield rendering and the lumigraph when it was a brand-new concept, and of multi-focal-plane displays. Squeezing stereo, multi-focal and lightfield rendering and HDR into one setup is quite impressive! I really have to try to visit you the next time I'm in Cambridge; I would love to see it. And to read that article when I hopefully find the time.

A challenge, though, to get that working with proper sync at non-slideshow framerates, and your specific graphics hardware may prove not a great choice. But more on that later; I have other work to do now that actually pays the bills.

The proper, hopefully solid and reliable technical solution to get 8 separate display devices on two separate graphics cards to flip and display in sync would be to throw money at the problem. In the case of NVidia, you would have to buy somewhat expensive pro-class graphics cards from the NVidia Quadro series, which are often now branded "RTX something" instead of Quadro, because marketing likes to invent new names for old things, and which are specifically designed for display sync across gpus. NVidia currently calls this "Mosaic technology". There is a lot of confusing information about this on the internet, also lots of outdated information, because NVidia's management and marketing department like to change marketing names, and also the conditions under which a given feature lets you do what, with which driver release, in which year, on which operating system, in an attempt to squeeze more money out of customers and get them to buy new expensive pro hardware. E.g., there is "Base Mosaic" and "SLI Mosaic", and then yet another Mosaic configuration requiring either just a suitable computer motherboard and two SLI-capable pro gpus, or also some kind of additional SLI link or NVLink interface or board, or some Quadro Sync board…

This guide seems somewhat recent:

And here is a somewhat confusing setup guide in German for Linux:

These suggest that if you had two pro-class SLI gpus in an SLI-capable machine, you could get your 8 displays properly synced via SLI Mosaic. There is also Base Mosaic for 2-4 identical gpus, which does not seem to require SLI or other additional hardware, but it is somewhat unclear what the downsides are; maybe they are the need for identical gpus, or less reliable or precise sync across displays.

The idea of all these modes would be to present everything in one single huge onscreen window that spans all displays, draw the different stimuli into the proper rectangular subregions of that window, flip, and let the driver + hardware take care of synchronizing everything everywhere.
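
In Psychtoolbox terms that boils down to roughly this (sketch; assumes the Mosaic setup shows up as one wide PTB screen, with the 8 displays side by side and the per-display stimuli in a cell array of textures):

 % Sketch: one onscreen window spanning all synced displays, stimuli in per-display subregions.
 screenid = max(Screen('Screens'));
 PsychImaging('PrepareConfiguration');
 win = PsychImaging('OpenWindow', screenid, 0);
 [w, h] = Screen('WindowSize', win);
 ndisplays = 8;
 dw = w / ndisplays; % assuming the displays are arranged in one horizontal row
 for i = 1:ndisplays
     dstRect = [(i-1)*dw, 0, i*dw, h];
     Screen('DrawTexture', win, tex{i}, [], dstRect);
 end
 Screen('Flip', win); % driver + hardware keep all displays in sync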