On 02/13, Ville Syrjälä wrote:
> On Mon, Feb 13, 2023 at 11:01:31AM +0200, Pekka Paalanen wrote:
> > On Fri, 10 Feb 2023 14:47:50 -0500
> > Harry Wentland <harry.wentland@amd.com> wrote:
> > 
> > > On 2/10/23 04:28, Pekka Paalanen wrote:
> > > > On Thu, 9 Feb 2023 13:27:02 -0100
> > > > Melissa Wen <mwen@igalia.com> wrote:
> > > >   
> > > >> On 01/31, Pekka Paalanen wrote:  
> > > >>> On Mon, 9 Jan 2023 14:38:09 -0100
> > > >>> Melissa Wen <mwen@igalia.com> wrote:
> > > >>>     
> > > >>>> On 01/09, Melissa Wen wrote:    
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> After collecting comments in different places, here is a second version
> > > >>>>> of the work on adding DRM CRTC 3D LUT support to the current DRM color
> > > >>>>> mgmt interface. In comparison to previous proposals [1][2][3], here we
> > > >>>>> add 3D LUT before gamma 1D LUT, but also a shaper 1D LUT before 3D LUT,
> > > >>>>> that means the following DRM CRTC color correction pipeline:
> > > >>>>>
> > > >>>>> Blend -> Degamma 1D LUT -> CTM -> Shaper 1D LUT -> 3D LUT -> Gamma 1D LUT    
> > 
> > ...
> > 
> > > >>> +/*
> > > >>> + * struct drm_mode_lut3d_mode - 3D LUT mode information.
> > > >>> + * @lut_size: number of valid points on every dimension of 3D LUT.
> > > >>> + * @lut_stride: number of points on every dimension of 3D LUT.
> > > >>> + * @bit_depth: number of bits of RGB. If color_mode defines entries with higher
> > > >>> + *             bit_depth the least significant bits will be truncated.
> > > >>> + * @color_format: fourcc values, ex. DRM_FORMAT_XRGB16161616 or DRM_FORMAT_XBGR16161616.
> > > >>> + * @flags: flags for hardware-sepcific features
> > > >>> + */
> > > >>> +struct drm_mode_lut3d_mode {
> > > >>> +	__u16 lut_size;
> > > >>> +	__u16 lut_stride[3];
> > > >>> +	__u16 bit_depth;
> > > >>> +	__u32 color_format;
> > > >>> +	__u32 flags;
> > > >>> +};
> > 
> > ...
> > 
> > > >>> What is "number of bits of RGB"? Input precision? Output precision?
> > > >>> Integer or floating point?    
> > > >>
> > > >> It's the bit depth of the 3D LUT values, the same for every channels. In
> > > >> the AMD case, it's supports 10-bit and 12-bit, for example.  
> > > > 
> > > > Ok. So e.g. r5g6b5 is not a possible 3D LUT element type on any
> > > > hardware ever?
> > > >   
> > > 
> > > I haven't had a chance to go through all patches yet but if this is
> > > modeled after Alex Hung's work this should be covered by color_format.
> > > The idea is that color_format takes a FOURCC value and defines the
> > > format of the entries in the 3DLUT blob.
> > > 
> > > The bit_depth describes the actual bit depth that the HW supports.
> > > E.g., color_format could be DRM_FORMAT_XRGB16161616 but HW might only
> > > support 12-bit precision. In that case the least significant bits get
> > > truncated.
> > > 
> > > One could define the bit_depth per color, but I'm not sure that'll be
> > > necessary.
> > 
> > Exactly. I just have no idea how sure we should be about that.
> > 
> > > > What exactly is the truncation the comment refers to?
> > > > 
> > > > It sounds like if input has higher precision than the LUT elements,
> > > > then "truncation" occurs. I can kind of see that, but I also think it
> > > > is a false characterisation. The LUT input precision affects the
> > > > precision of LUT indexing and the precision of interpolation between
> > > > the LUT elements. I would not expect those two precisions to be
> > > > truncated to the LUT element precision (but they could be truncated to
> > > > something else hardware specific). Instead, I do expect the
> > > > interpolation result to be truncated to the LUT output precision, which
> > > > probably is the same as the LUT element precision, but not necessarily.
> > > > 
> > > > Maybe the comment about truncation should simply be removed? The result
> > > > is obvious if we know the LUT input, element, and output precision, and
> > > > what exactly happens with the indexing and interpolation is probably
> > > > good enough to be left hardware-specific if it is difficult to describe
> > > > in generic terms across different hardware.
> > > >   
> > > 
> > > Maybe it makes sense to just drop the bit_depth field.
> > 
> > Well, it's really interesting information for userspace, but maybe it
> > should have a more holistic design. Precision is a factor, when
> > userspace considers whether it can use KMS hardware for a conversion or
> > not. Unfortunately, none of the existing KMS color pipeline elements
> > have any information on precision IIRC, so there is more to be fixed.
> > 
> > The interesting thing is the minimum guaranteed precision of each
> > element and the connections between them. It might be different for
> > pass-through vs. not. Another interesting thing is the usable value
> > range.
> > 
> > This is probably a complex problem, so there should be no need to solve
> > it before a 3D LUT interface can land, given old elements already have
> > the issue.
> 
> Yeah, I think all the precision stuff is all better handled by
> eg. the proposed GAMMA_MODE property or something similar.
> It's going to be needed for 1D LUTs as well. 1D LUTs would
> also need it to expose diffrent LUT sizes with different
> precision tradeoffs.
> 
> As for the 3D LUT blob, I don't think the blob needs any 
> strides/etc. either. I had none of that for my i915 version:
> https://github.com/vsyrjala/linux/commits/3dlut
> Just the LUT entries + blob size is sufficient. At least
> for cube shaped LUTs. Dunno if anyone would have a need
> for something else?

I only use lut_size and bit_depth for programming a CRTC 3D LUT in this
proposal, so far GAMMA_MODE also would fit. But don't know for
pre-blending 3D LUT.

> 
> The two things the are absolutely needed:
> - Position of the LUT in the pipeline. We've already
>   seen at least two variants of this IIRC, so we'll
>   likely need to define a unique property for each tap
>   point.

IIRC, I'd say three, since in rcar-du the 3D LUT is before the gamma
LUT, but there isn't a shaper 1D LUT before 3D LUT.  I'd like to know
how the gamma LUT pre-3D LUT acts on intel pipeline. If it's, in the
end, somehow similar to a shaper LUT.

I mean, if we don't name the LUTs after CTM, we could fit something
similar in terms of dimensions, as:

-> CTM -> 1D LUT -> 3D LUT -> 1D LUT

> - The order the elements are stored in the blob. I didn't
>   check if all the already known hw (amdgpu, i915, rcar-du,
>   were there also others?) would agree on this or not.
>   If yes, maybe just follow the hw order for all those,
>   and if not, then I guess flip a few coins.
> 
> -- 
> Ville Syrjälä
> Intel