Dear community,
My main objective is to play movies on a VPixx PROPixx projector with accurate color rendering. This projector has its own unique color gamut, linear gamma, and high bit-depth modes. I’m still a beginner with PTB; I’ve used it for a few years, but mostly to make minor modifications to existing code. Digging through the documentation and code today, I found some interesting features, including some directly related to the PROPixx. But I have a few questions:
Has anybody gone through the effort of doing color correction for displaying “regular” content (e.g. BT.709 or sRGB color spaces) accurately on the PROPixx, and is willing to share tips or code? This would entail gamma decoding and some sort of color space conversion / gamut mapping. I started working on this, and I’m also using the C48 mode to work at 12 bpc and avoid quantization issues at 8 bit.
I know GStreamer is used to play movies. However, I wanted to confirm how this is done, and what component does the actual video decoding, e.g. ffmpeg? I started looking in the C code, but I’m hoping someone can help me find it quicker. In particular, I want to make sure the color range (limited vs. full) is dealt with properly, either by the decoder, or, if necessary, by my gamma decoder.
I saw PsychImaging('DisplayColorCorrection', …, 'MatrixMultiply4'), which I could perhaps use for color space conversion, but I still need to compute the matrix to go from BT.709 to PROPixx. There isn’t already code for this, right, e.g., where you’d provide the source/target color primaries and it would do the conversion? The PROPixx has a generally wider gamut, though the BT.709 blue primary is slightly outside it. I’m struggling a bit to find the appropriate way to do this. I’ve tried converting through CIE XYZ, but I’m still getting strange results. So again, if anybody has tips or experience with this, it would be appreciated.
Don’t know about the PROPixx specifically. PTB itself does colorspace conversion in HDR display mode, in that it assumes the window's content is drawn in BT2020 color space and then converts to whatever output colorspace the HDR display device and operating system expect: sometimes scRGB (on macOS, and sometimes on MS-Windows), sometimes BT2020 (Linux, and sometimes MS-Windows).
In standard SDR mode, it doesn’t happen automatically, but the various PsychColorCorrection() functions can be assembled together to build a suitable color correction sequence, e.g., the MatrixMultiply4 method you found for color space conversion, or some gamma mapping, or specifying stimuli in XYZ space and converting to RGB output with the help of some calibration data, iirc. See also RenderDemo.m for examples of usage.
GStreamer uses a wide range of decoder plugins, depending on content type and system: various plugins written by the GStreamer team, some operating-system-specific plugins, e.g., for hardware-accelerated video decoding on modern graphics cards, but also various plugins provided by the ffmpeg project.
You can use Screen('HookFunction', win, 'WindowColorGamut', [], dstGamut); to define the color gamut used for an onscreen window. This affects movie playback with the optional Screen('OpenMovie', ...) pixelFormat parameter set to 11: video frames will then be automatically converted from whatever color gamut and range (limited or full) they are in to full range in the color gamut set via that Screen('HookFunction', win, 'WindowColorGamut', [], dstGamut); call.
The default pixelFormat 4 for movie playback and others will not do any color conversion, but just pass on pixels “as is”.
There’s MCSC = ConvertRGBSourceToRGBTargetColorSpace(srcGamut, dstGamut) as helper to compute a 3x3 conversion matrix to convert from source to target gamut. You could extend that 3x3 matrix to a 4x4 matrix and use that for PsychImaging('DisplayColorCorrection', …, 'MatrixMultiply4'). The imaging pipeline would then efficiently convert any stimulus from srcGamut to dstGamut for output.
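To illustrate the math such a helper performs, here is a minimal Python sketch, illustrative only and not PTB's actual implementation (the function names are mine): it derives the 3x3 source-to-target conversion matrix from CIE xy primaries and a shared white point, going through XYZ, and embeds it into a 4x4 matrix suitable for a 'MatrixMultiply4'-style stage.

```python
import numpy as np

def rgb_to_xyz_matrix(primaries_xy, white_xy):
    """Build the 3x3 linear-RGB -> CIE XYZ matrix from xy chromaticities.

    primaries_xy: list of (x, y) pairs for the R, G, B primaries.
    white_xy: (x, y) of the white point.
    """
    # xyY -> XYZ with Y = 1 for each primary; primaries become columns of P.
    P = np.array([[x / y, 1.0, (1.0 - x - y) / y]
                  for x, y in primaries_xy]).T
    xw, yw = white_xy
    W = np.array([xw / yw, 1.0, (1.0 - xw - yw) / yw])
    # Scale each primary so that RGB = (1,1,1) reproduces the white point.
    S = np.linalg.solve(P, W)
    return P @ np.diag(S)

def rgb_to_rgb_matrix(src_primaries, dst_primaries, white_xy):
    """3x3 matrix converting linear RGB in the source gamut to the target gamut."""
    M_src = rgb_to_xyz_matrix(src_primaries, white_xy)
    M_dst = rgb_to_xyz_matrix(dst_primaries, white_xy)
    return np.linalg.solve(M_dst, M_src)  # M_dst^-1 * M_src

# BT.709 primaries and the PROPixx primaries quoted earlier in this thread.
bt709   = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]
propixx = [(0.665, 0.321), (0.172, 0.726), (0.163, 0.039)]
d65     = (0.3127, 0.3290)

M = rgb_to_rgb_matrix(bt709, propixx, d65)
# Embed into a 4x4 matrix, as needed for a 'MatrixMultiply4' stage.
M4 = np.eye(4)
M4[:3, :3] = M
print(M4)
```

Since both gamuts share the D65 white point, this matrix maps (1, 1, 1) to (1, 1, 1); note it must be applied to linear (gamma-decoded) RGB, not to gamma-encoded values.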
The way this is used in HDR mode is basically by defining the window itself as BT2020 linear; the GStreamer pixelFormat 11 path will then convert movie content to BT2020, and you use drawing commands to draw already in BT2020 (images can be converted in software via [~, img] = ConvertRGBSourceToRGBTargetColorSpace(srcGamut, win, srcImg) as a convenient helper).
Then PTB will automatically convert from BT2020 to the colorspace needed by the HDR display.
So in general PsychColorCorrection() has the building blocks for fast and efficient conversion to PROPixx, and pixelFormat 11 as explained above has builtin conversion for videos.
If you only wanted to play back color-correct videos, you could use just that; otherwise you’d probably follow a two-step approach like we do for HDR displays.
Thanks Mario, I’m immensely grateful for your detailed answer.
ConvertRGBSourceToRGBTargetColorSpace confirmed my math was correct. Sadly, it didn’t produce the expected result when I tested it with my own shader implementation yesterday. Maybe I’ll have better results with PsychImaging and PsychColorCorrection. I got my hardware (NVidia & ProPixx) to somewhat pass the preliminary tests: the animation was smooth, but only in intervals, with random pauses in between. In any case, I’ll look at the color results tomorrow.
I’m including a solution below for reference. But I have a follow-up question. When using multiple stages of processing as in the solution below, how is this implemented, as a single combined shader or as a series of shaders and frame buffers? If it’s the latter, I’d probably be better off combining them into one to improve efficiency.
Here’s the solution I have now that gives good color rendering on the ProPixx:
Note that the color range is expanded from “limited/tv” to “full/pc” by the video decoder, so I’m ignoring that for now, though I suspect it might be better done at high bit depth like the rest of the processing below. But I’m not sure how to keep limited range in the decoded output from GStreamer.
Gamma = 2.4; % for "BT.709" movie gamma decoding (see wikipedia for why it's not inverse OETF)
% Color space conversion matrix, BT.709 to ProPixx
sRGB = [0.64, 0.33; 0.3, 0.6; 0.15, 0.06; 0.3127, 0.329]';
ProPixx = [0.665, 0.321; 0.172, 0.726; 0.163, 0.039; 0.3127, 0.329]';
ColorConv = eye(4);
ColorConv(1:3, 1:3) = ConvertRGBSourceToRGBTargetColorSpace(sRGB, ProPixx);
% Initialize Datapixx
Datapixx('Open'); % Open the Datapixx connection
% Enable full pixel mode and set the video mode
Datapixx('EnablePixelMode', 0); % Full pixel mode to track movie progression
Datapixx('SetVideoMode', MovieVidMode); % Set to C48 video mode
Datapixx('RegWrRd'); % Write and read back the register cache to apply changes
Datapixx('Close');
% Open a fullscreen window
PsychImaging('PrepareConfiguration');
PsychImaging('AddTask', 'FinalFormatting', 'DisplayColorCorrection', 'SimpleGamma');
PsychImaging('AddTask', 'FinalFormatting', 'DisplayColorCorrection', 'MatrixMultiply4');
PsychImaging('AddTask', 'General', 'EnableDataPixxC48Output', 2); % mode 2: average even-odd pixels.
win = PsychImaging('OpenWindow', screenNumber, 0); % Black background
PsychColorCorrection('SetEncodingGamma', win, Gamma)
PsychColorCorrection('SetColorClampingRange', win, 0.0, 1.0)
PsychColorCorrection('SetMultMatrix4', win, ColorConv)
% Overwrite PTB default of hiding pixel sync raster line.
% doDatapixx is a subfunction of PTB's PsychImaging.m; it must be copied into this script for the call below to work.
doDatapixx('SetVideoPixelSyncLine', 0, 1, 0); % line 0, single line, do not blank line
Depends on the specifics of the output method and other factors. A single DisplayColorCorrection task gets folded into the C48 output conversion shader for VPixx devices, but if you use more than one, it switches to a sequence of shader passes. I doubt it would make much of a difference here though, with fast GPUs. That said, I think there’s probably room to make this simpler and more efficient if you mostly just play back movies and only need efficient color correction for those.
I don’t understand what you mean. If you use movie playback with pixelFormat 11, as recommended before, then movie content will be decoded at the maximum precision of the movie content (8/10/12/…/16 bpc), then processed in the shader and stored in all buffers with 32 bit float precision: more than enough to not lose precision anywhere, and output with the highest precision possible on a Propixx.
If you use pixelFormat 11, then gamma decoding should not be needed, as the movie shader will already do the proper gamma decoding of a given movie in a given standard color space, from the movie’s non-linear “gamma” encoding to linear RGB, plus range extension limited → full if needed, at full 32 bit float (~ effective 23 bpc linear precision), into the onscreen window’s framebuffer. By default, on a standard onscreen window in SDR mode instead of HDR display mode, without an explicit color gamut assigned, the color gamut would be converted from the movie’s color gamut to BT.709 (~ sRGB) by applying a suitable color space conversion matrix. For HDR windows, the default conversion is to BT2020.
If you increase the verbosity level of Screen() slightly to 4, it will print some info about what GStreamer reported wrt. movie properties and how the decoding and color conversion etc. will be done by the movie playback engine.
You could also use Screen('HookFunction', win, 'WindowColorGamut', [], dstGamut); as described before to directly assign the color gamut of the ProPixx, i.e. your ProPixx variable as the dstGamut of the onscreen window. This way, during Screen('DrawTexture', ...) of your movie video frame, the shader would not only do the degamma from movie to linear, but also convert directly from the movie colorspace/gamut to the “ProPixx” color gamut. Iow., you would not need any PsychColorCorrection() tasks. Obviously you’d need to manually take care of selecting proper colors for regular Screen drawing commands.
This would do all processing needed for movies in the movie decoding shader for pixelFormat 11 in one pass.
Most of the above doesn’t make sense, as SetVideoMode gets called by the 'EnableDataPixxC48Output' task. You’d only need Datapixx('EnablePixelMode'). However, it would be better to implement proper support for that in PTB’s own setup code for a future PTB release. In general it makes sense to wrap setup or execution of VPixx-related functionality into PsychImaging() or PsychDataPixx(), to make sure PTB’s builtin VPixx support and low-level Datapixx calls do not step on each other’s feet. Also maybe have a look at the PsychDataPixx() timestamping functions, if you need additional timestamping from VPixx devices on top of PTB’s own high-precision timestamping.
Why not hide that top-most scanline? Is this somehow a requirement of the ‘EnablePixelMode’ function for the pixel to get recognized? It seems only useful for debugging to me, but as I said, I don’t have working experience with that function.
That said, you would just call Datapixx('SetVideoPixelSyncLine', 0, 1, 0); Datapixx('RegWrRd'); rather than oddly use PTB-internal code taken out of context, as doDatapixx is just an internal wrapper to allow some basic error checking, plus some very bare-bones device emulation that only makes sense for PTB development.
Thanks again Mario for your thorough reply. I read again your original message as well to better understand.
Just to make sure we’re on the same page, the processing steps I need are: 1. color range conversion (tv 16-235 to pc 0-255), 2. gamma decoding of 2.4 because PROPixx is linear, 3. gamut conversion, 4. C48 high bit depth mode output.
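For clarity, steps 1 to 3 can be sketched per pixel as follows (illustrative Python, not the actual PTB pipeline; the function names are mine, and the identity matrix is a placeholder for a real BT.709 → PROPixx conversion matrix). Step 4, the C48 output, is handled by PsychImaging and has no software equivalent here.

```python
import numpy as np

def limited_to_full(v8):
    """Step 1: expand 8 bpc limited/tv range (16-235) to full range [0, 1]."""
    return np.clip((np.asarray(v8, dtype=float) - 16.0) / (235.0 - 16.0), 0.0, 1.0)

def decode_gamma(v, gamma=2.4):
    """Step 2: gamma-decode to linear light (pure power law, BT.1886-style)."""
    return np.power(v, gamma)

def convert_gamut(rgb_linear, M=np.eye(3)):
    """Step 3: apply a 3x3 gamut conversion matrix to a linear RGB triple.
    np.eye(3) is only a placeholder; substitute a real BT.709 -> PROPixx matrix."""
    return M @ np.asarray(rgb_linear, dtype=float)

# Example: near-mid-gray in limited range, code value 126 on all channels.
v   = limited_to_full([126, 126, 126])  # -> ~0.502 per channel
lin = decode_gamma(v)                   # -> linear light
out = convert_gamut(lin)                # -> target linear RGB (identity here)
```

The ordering matters: range expansion and gamma decoding must happen before the matrix, because gamut conversion is only valid on linear RGB.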
From what I understood, the first step was being done by the movie decoder, and so at the movie content bit depth which is 8 bpc. I’m still unsure which library GStreamer uses on my Windows system to decode H.264 mp4, perhaps ffmpeg, which I’m pretty sure by default would expand the range. I’d like to verify if this is the case. If I just play the movie without any PsychImaging and with pixelFormat 4, would PTB do any processing or just display what the decoder returns?
EnableDataPixxC48Output implies the use of a 32 bit framebuffer, so I assumed this would be redundant with pixelFormat 11. That’s not the case? If we go with your approach and add the processing steps with Screen('HookFunction', ...), how is that different (in particular in terms of performance) from the PsychImaging/PsychColorCorrection processing pipeline?
In your approach, you also did not indicate how the image would be gamma-decoded to linear. Typically this would not be needed, as LCD displays do it. Here it’s an unusual device, which is why we have to do it in software. So where would I specify this, with another Screen('HookFunction', ...) call?
It’s true that Datapixx('SetVideoMode', ...) is redundant with EnableDataPixxC48Output.
EnablePixelMode basically replaces a photodiode typically used to measure visual stimulus onset, and can act as a 24 bit digital port to send event markers to be recorded with our electrophysiology (MEG) data. The projector itself sends the color values of the top left pixel one frame (8.3 ms) ahead of it appearing on screen. This hardware solution is much safer than any software timing measure.
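As a side note for other readers, the “24 bit digital port” idea boils down to packing an integer event marker into the top-left pixel’s R, G, B bytes. A hedged sketch of that packing (the byte order shown, R as the high byte, is my assumption; consult the VPixx documentation for the actual layout):

```python
def marker_to_rgb(marker):
    """Pack a 24-bit event marker into 8 bpc R, G, B values (R = high byte, assumed)."""
    if not 0 <= marker <= 0xFFFFFF:
        raise ValueError("marker must fit in 24 bits")
    return (marker >> 16) & 0xFF, (marker >> 8) & 0xFF, marker & 0xFF

def rgb_to_marker(r, g, b):
    """Recover the 24-bit marker from the recorded R, G, B byte values."""
    return (r << 16) | (g << 8) | b

# Round-trip example with an arbitrary marker value.
r, g, b = marker_to_rgb(0x12AB34)
assert rgb_to_marker(r, g, b) == 0x12AB34
```

One would draw that RGB triple into the top-left pixel of the stimulus frame, and the hardware then reports it on the digital output alongside the recording.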
Regarding SetVideoPixelSyncLine, now it’s my turn not being familiar with what that’s about. I simply have no reason to hide the first line of pixels, unless I misunderstood. And yes, I liked the error checking, which is why I copied doDatapixx, but maybe not the best idea here.
I did a bit more investigation, looking at debug info, hook chains and attentive viewing.
Playing the movie without pixelFormat=11 (so 4 by default), it reports this:
PTB-DEBUG: Video colorimetry is 1:1:5:1.
PTB-DEBUG: Video range 1, colormatrix 1, color primaries 1, eotf 5.
PTB-DEBUG: Video format BGRA. Depth 8 bpc.
With pixelFormat=11, it reports instead:
PTB-DEBUG: Video colorimetry is bt709.
PTB-DEBUG: Video range 2, colormatrix 3, color primaries 1, eotf 5.
PTB-DEBUG: Video format NV12. Depth 8 bpc.
colormatrix: 1 = identity, 3 = BT.709
range: 1 = full, 2 = limited
eotf: 5 = BT.709 (gamma 2.2 with linear segment?)
I’m guessing those represent what GStreamer outputs to PTB, since it’s not the original content metadata. This means in the first case, GStreamer has already expanded the color range, but not in the second case.
I don’t see the initial frame decoding by PTB in the hook chain, but it is reported here:
PTB-DEBUG: Using movie video frame decoding from YUV-I420/P0xx -> RGB with 8 bpc precision. Limited range input. SDR/LDR footage, eotf 5. HDR mapping to [0.0 ; 1.000000].
From our discussion I expect it’s done at high bit depth, though it’s not super clear since it mentions 8 bpc precision for the first step.
Looking closely at the movie, it does appear there is gamma decoding actually happening here. I find that very strange, unless it is expected that we would again re-encode gamma before sending it to a display? I thought linear displays like the ProPixx were the exception, but maybe not for PTB users?
In any case, I’ll have to do even more tests to figure out if this is equivalent to my PsychImaging solution without pixelFormat=11, and if the EOTF used here is correct (BT.1886 and not inverse BT.709 OETF).
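For reference while testing: the two candidate decodings differ quite a bit at mid-gray, so the difference should be measurable. A small Python comparison, my own sketch of the standard formulas (BT.1886 with zero black level reduces to a pure 2.4 power law):

```python
def eotf_bt1886(v):
    """BT.1886 reference EOTF with zero black level: a pure 2.4 power law."""
    return v ** 2.4

def inverse_bt709_oetf(v):
    """Mathematical inverse of the BT.709 camera OETF (piecewise, with linear toe)."""
    if v < 4.5 * 0.018:  # below the encoded knee point (0.081)
        return v / 4.5
    return ((v + 0.099) / 1.099) ** (1.0 / 0.45)

# The two disagree substantially at mid-gray:
print(eotf_bt1886(0.5))         # ~0.189
print(inverse_bt709_oetf(0.5))  # ~0.260
```

Both map 0 to 0 and 1 to 1, but at code value 0.5 the outputs differ by roughly 0.07 in linear light, which is why it matters which one the decoder applies.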
Edit: I found the GStreamer constants here: video color, and some of my previous interpretations were wrong, now corrected above.
My thorough reply was a courtesy, because I took a bit of interest in your questions and thought it could be useful to others as well. However, answering such questions in this detail is normally reserved to paid support, and this conversation by now takes up a significant amount of time, so I’d appreciate you supporting us → help PsychPaidSupportAndServices. In the meantime, I’ll answer with shorter answers inline, no unpaid time to go into further details…
Yes, that’s what pixelFormat 11 in OpenMovie should do, combined with the PsychImaging C48 output mode. Except gamma decoding will be from whatever gamma function the movie uses to linear, if that is 2.4 gamma or something else, e.g., HLG or PQ or sRGB or whatever, depends on what GStreamer determines the movie is encoded in, iow. what eotf is detected.
Higher Screen verbosity levels will also print the type of video decoders in use. And PsychTweak() allows access to even more GStreamer debugging/logging facilities, although some of this detailed logging works poorly on MS-Windows or in Matlab, and often a bit better in Octave, or Octave in a terminal, and on Unix operating systems like Linux and macOS.
While raw video content enters the shader at whatever bit depth, typical 8,10,12,16 bpc, as soon as it is in the shader it will be processed with 32 bit float precision, then go into the 32 bit float framebuffer and only converted into 16 bpc fixed point before send-off to the projector.
The debug output about “Video …” is what the GStreamer pipeline detects as the properties of the movie; “Sink …” is what PTB wants from GStreamer, so GStreamer itself would transform from Video → Sink in software on a mismatch. With a pixelFormat other than 11, PTB would just return exactly that. With pixelFormat 11, it would run its own shader to go from that via its own x bpc → float precision conversion, YUV → RGB, range extension as needed, EOTF decoding, and CSC. With pixelFormat 11, Video and Sink properties will normally match, so GStreamer itself doesn’t do conversion, and our own shader is in control. This shader was originally designed specifically for HDR / WCG movie playback and HDR / WCG display needs, e.g., to not lose precision anywhere, as software processing was way too slow for 4k HDR-10 content. When opted into, it also gives control and a substantial decoding speedup, especially for high-res, WCG, high bit depth content.
No, one is about drawing, post-processing and displaying any kind of stimuli with high precision; the other is about a specific way of decoding movie footage. pixelFormat 11 is only auto-activated by default if an onscreen window is configured for HDR display output, as anything but format 11 is quite detrimental to HDR movie playback.
They are not related as of right now. Atm. that HookFunction ColorGamut is only used by pixelFormat 11 movie playback, to define the target color gamut for movie content decoded by our own shader. If omitted, a standard SDR window is assumed to be BT.709, an HDR window BT.2020. It would be better to integrate that HookFunction setup more cleanly for cases other than HDR display, but that’s not what the HDR enablement contract paid for; that it can be used for SDR cases as well, in a slightly clunky way, is just a free bonus.
Most displays need some gamma correction if you want calibrated or linearized output, either via Screen('LoadNormalizedGammaTable') or via the PsychColorCorrection functions, depending on use case and output device. If you omit both, you get whatever your display makes of the signal. LCDs typically emulate an approximate gamma somewhere between 1.8 and 2.2, or an sRGB EOTF, so content already encoded for a gamma in that range will look approximately ok on a non-gamma-corrected, run-of-the-mill display at default settings. That’s the theory, at least, as far as I understand it.
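To illustrate why that approximation works: the sRGB EOTF stays numerically close to a pure 2.2 power law over most of the input range, despite being defined piecewise. A quick Python check (my own sketch of the standard formula):

```python
def eotf_srgb(v):
    """The sRGB EOTF: linear toe below 0.04045, then a 2.4-exponent power segment."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

# Close to a pure 2.2 gamma at mid-gray:
print(eotf_srgb(0.5))  # ~0.214
print(0.5 ** 2.2)      # ~0.218
```

So for uncalibrated viewing the two curves are nearly interchangeable, which is why gamma-2.2-encoded content looks plausible on a default sRGB-ish display.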
I would assume it sends it at the moment the top-left pixel displays, as that would make more sense? But I’m not familiar with the Propixx, only with Datapixx and older ViewPixx models which I don’t think had that PixelMode.
That statement may be true for other software than Psychtoolbox, iow. pretty much all other software, with its mediocre or shoddy timing mechanisms. Or maybe if you used PTB on unsuitable, broken or misconfigured hardware and operating systems and graphics drivers - in which case PTB would likely warn you about it. But on properly selected and configured hardware and software, PTB’s high precision timestamping has been verified to be just as accurate and reliable as hardware solutions like VPixx devices. In fact, various hardware devices including VPixx are periodically used to verify PTB’s reliability and precision.
Also such hardware methods have various failure modes that can lead to wrong results. I have seen quite a few people fooling themselves in the past by putting too much trust into hardware or photo-diode methods, in combination with lack of knowledge about how to cover all bases or understanding the “fun” ways things can go wrong on modern operating systems and graphics hardware.
What is true though is that the failure modes between high quality PTB software timestamping and hardware methods are usually different, so both methods can complement each other and provide redundant sources of the same information, so combining both can be useful for maximum peace of mind and certainty.
Or it can be convenient for hardware trigger emission in sync with visual stimuli. So I’m certainly not trying to convince you to not use that EnablePixelMode function.
Just saying that general statement about hardware methods being “much safer” is wrong, reality is more complex than that.
The methods I referred to in a previous post are additional hardware timestamping and trigger emission mechanisms provided by VPixx hardware, to complement PTB’s timestamping. Some of our tests, like VBLSyncTest or FlipTimingWithRTBoxPhotoDiodeTest, do employ various hardware methods to compare and check against PTB’s mechanisms. This is what I use for periodic validation of Psychtoolbox itself.
I think the EnablePixelMode feature was only introduced a little while ago, I believe for the “ViewPixx EEG”; I didn’t know the ProPixx also has it now. It’s mostly a simple convenience for people who don’t want to deal with the more powerful, but slightly more complex, mechanisms that were already present in the first VPixx products and that are supported by and well integrated into PTB out of the box. Its recency is why PTB doesn’t have builtin setup and integration for it yet.
It is hidden by default, because it is not meant to be seen by subjects, but only “seen” by the hardware, and could be visually distracting, unless debugging a non-working setup.
Anyhow, I’ve spent way too much unpaid time explaining now, so paid support would be required to continue this conversation.
Hi Marc, you really should consider purchasing paid support, if you want the best answers. This is Mario’s full-time job, so his livelihood depends on getting support from users like you.
I sympathise, but I already have a working solution and our lab funds are limited. At this point I’m just trying to optimize and learn, which I can manage even if it takes me much longer without assistance. Still, I’ll look at the support rates, and keep this in mind and mention it to our users if/when they’re preparing new tasks.