All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Maxime Ripard <maxime@cerno.tech>
Cc: "Samuel Holland" <samuel@sholland.org>,
	"Heiko Stübner" <heiko@sntech.de>,
	"Sandy Huang" <hjc@rock-chips.com>,
	dri-devel@lists.freedesktop.org,
	linux-rockchip@lists.infradead.org,
	"Alistair Francis" <alistair@alistair23.me>,
	"Ondřej Jirman" <x@xff.cz>,
	"Andreas Kemnade" <andreas@kemnade.info>,
	"David Airlie" <airlied@linux.ie>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>,
	"Liang Chen" <cl@rock-chips.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Michael Riesch" <michael.riesch@wolfvision.net>,
	"Nicolas Frattaroli" <frattaroli.nicolas@gmail.com>,
	"Peter Geis" <pgwipeout@gmail.com>,
	"Rob Herring" <robh+dt@kernel.org>,
	"Sam Ravnborg" <sam@ravnborg.org>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver
Date: Wed, 8 Jun 2022 17:34:28 +0200	[thread overview]
Message-ID: <YqDBhMZCu1gKNFfs@phenom.ffwll.local> (raw)
In-Reply-To: <20220608144847.3ibr4buxcbmfj3al@houat>

On Wed, Jun 08, 2022 at 04:48:47PM +0200, Maxime Ripard wrote:
> On Wed, Jun 01, 2022 at 02:35:35PM +0200, Daniel Vetter wrote:
> > On Tue, May 31, 2022 at 10:58:35AM +0200, Maxime Ripard wrote:
> > > Hi Daniel,
> > > 
> > > Thanks for your feedback
> > > 
> > > On Wed, May 25, 2022 at 07:18:07PM +0200, Daniel Vetter wrote:
> > > > > > VBLANK Events and Asynchronous Commits
> > > > > > ======================================
> > > > > > When should the VBLANK event complete? When the pixels have been blitted
> > > > > > to the kernel's shadow buffer? When the first frame of the waveform is
> > > > > > sent to the panel? When the last frame is sent to the panel?
> > > > > > 
> > > > > > Currently, the driver is taking the first option, letting
> > > > > > drm_atomic_helper_fake_vblank() send the VBLANK event without waiting og
> > > > > > the refresh thread. This is the only way I was able to get good
> > > > > > performance with existing userspace.
> > > > > 
> > > > > I've been having the same kind of discussions in private lately, so I'm
> > > > > interested by the answer as well :)
> > > > > 
> > > > > It would be worth looking into the SPI/I2C panels for this, since it's
> > > > > basically the same case.
> > > > 
> > > > So it's maybe a bit misnamed and maybe kerneldocs aren't super clear (pls
> > > > help improve them), but there's two modes:
> > > > 
> > > > - drivers which have vblank, which might be somewhat variable (VRR) or
> > > >   become simulated (self-refresh panels), but otherwise is a more-or-less
> > > >   regular clock. For this case the atomic commit event must match the
> > > >   vblank events exactly (frame count and timestamp)
> > > 
> > > Part of my interrogation there is do we have any kind of expectation
> > > on whether or not, when we commit, the next vblank is going to be the
> > > one matching that commit or we're allowed to defer it by an arbitrary
> > > number of frames (provided that the frame count and timestamps are
> > > correct) ?
> > 
> > In general yes, but there's no guarantee. The only guarante we give for
> > drivers with vblank counters is that if you receive a vblank event (flip
> > complete or vblank event) for frame #n, then an immediate flip/atomic
> > ioctl call will display earliest for frame #n+1.
> > 
> > Also usually you should be able to hit #n+1, but even today with fun stuff
> > like self refresh panels getting out of self refresh mode might take a bit
> > more than a few frames, and so you might end up being late. But otoh if
> > you just do a page flip loop then on average (after the crtc is fully
> > resumed) you should be able to update at vrefresh rate exactly.
> 
> I had more the next item in mind there: if we were to write something in
> the kernel that would transparently behave like a full-blown KMS driver,
> but would pipe the commits through a KMS writeback driver before sending
> them to our SPI panel, we would always be at best two vblanks late.
> 
> So this would mean that userspace would do a page flip, get a first
> vblank, but the actual vblank for that commit would be the next one (at
> best), consistently.
> 
> > > > - drivers which don't have vblank at all, mostly these are i2c/spi panels
> > > >   or virtual hw and stuff like that. In this case the event simply happens
> > > >   when the driver is done with refresh/upload, and the frame count should
> > > >   be zero (since it's meaningless).
> > > > 
> > > > Unfortuantely the helper to dtrt has fake_vblank in it's name, maybe
> > > > should be renamed to no_vblank or so (the various flags that control it
> > > > are a bit better named).
> > > > 
> > > > Again the docs should explain it all, but maybe we should clarify them or
> > > > perhaps rename that helper to be more meaningful.
> > > > 
> > > > > > Blitting/Blending in Software
> > > > > > =============================
> > > > > > There are multiple layers to this topic (pun slightly intended):
> > > > > >  1) Today's userspace does not expect a grayscale framebuffer.
> > > > > >     Currently, the driver advertises XRGB8888 and converts to Y4
> > > > > >     in software. This seems to match other drivers (e.g. repaper).
> > > > > >
> > > > > >  2) Ignoring what userspace "wants", the closest existing format is
> > > > > >     DRM_FORMAT_R8. Geert sent a series[4] adding DRM_FORMAT_R1 through
> > > > > >     DRM_FORMAT_R4 (patch 9), which I believe are the "correct" formats
> > > > > >     to use.
> > > > > > 
> > > > > >  3) The RK356x SoCs have an "RGA" hardware block that can do the
> > > > > >     RGB-to-grayscale conversion, and also RGB-to-dithered-monochrome
> > > > > >     which is needed for animation/video. Currently this is exposed with
> > > > > >     a V4L2 platform driver. Can this be inserted into the pipeline in a
> > > > > >     way that is transparent to userspace? Or must some userspace library
> > > > > >     be responsible for setting up the RGA => EBC pipeline?
> > > > > 
> > > > > I'm very interested in this answer as well :)
> > > > > 
> > > > > I think the current consensus is that it's up to userspace to set this
> > > > > up though.
> > > > 
> > > > Yeah I think v4l mem2mem device is the answer for these, and then
> > > > userspace gets to set it all up.
> > > 
> > > I think the question wasn't really about where that driver should be,
> > > but more about who gets to set it up, and if the kernel could have
> > > some component to expose the formats supported by the converter, but
> > > whenever a commit is being done pipe that to the v4l2 device before
> > > doing a page flip.
> > > 
> > > We have a similar use-case for the RaspberryPi where the hardware
> > > codec will produce a framebuffer format that isn't standard. That
> > > format is understood by the display pipeline, and it can do
> > > writeback.
> > > 
> > > However, some people are using a separate display (like a SPI display
> > > supported by tinydrm) and we would still like to be able to output the
> > > decoded frames there.
> > > 
> > > Is there some way we could plumb things to "route" that buffer through
> > > the writeback engine to perform a format conversion before sending it
> > > over to the SPI display automatically?
> > 
> > Currently not transparently. Or at least no one has done that, and I'm not
> > sure that's really a great idea. With big gpus all that stuff is done with
> > separate command submission to the render side of things, and you can
> > fully pipeline all that with in/out-fences.
> > 
> > Doing that in the kms driver side in the kernel feels very wrong to me :-/
> 
> So I guess what you're saying is that there's a close to 0% chance of it
> being accepted if we were to come up with such an architecture?

Yup.

I think the only exception is if you have a multi-region memory manager
using ttm (or hand-rolled, but please don't), where we first have to move
the buffer into the right region before it can be scanned out. And that's
generally done with a copy engine, for performance reasons.

But that copy engine is really just a very dumb (but fast!) memcpy, and
doesn't do any format conversion or stride/orientation changes like a
full-blown blitter engine (or mem2mem in v4l speak) can do.

So if it's really just memory management then I think it's fine, but
anything beyond that is a no imo.

Now for an overall full-featured stack we clearly need that, and it would
be great if there's some common userspace libraries for hosting such code.
But thus far all attempts have fallen short :-/ Which I guess is another
indicator that we really shouldn't try to solve this problem in a generic
fashion, and hence really shouldn't try to solve it with magic behind the
generic kms interface in the kernel.

For even more context I do think my old "why is 2d so hard" blogpost rant
still applies:

https://blog.ffwll.ch/2018/08/no-2d-in-drm.html

The "why no 2d api for the more limited problem of handling framebuffers"
is really just a small, but not any less complex, subset of that bigger
conundrum.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel@ffwll.ch>
To: Maxime Ripard <maxime@cerno.tech>
Cc: "David Airlie" <airlied@linux.ie>, "Ondřej Jirman" <x@xff.cz>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Michael Riesch" <michael.riesch@wolfvision.net>,
	"Sam Ravnborg" <sam@ravnborg.org>,
	"Samuel Holland" <samuel@sholland.org>,
	"Nicolas Frattaroli" <frattaroli.nicolas@gmail.com>,
	linux-rockchip@lists.infradead.org,
	"Andreas Kemnade" <andreas@kemnade.info>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>,
	"Liang Chen" <cl@rock-chips.com>,
	devicetree@vger.kernel.org, "Peter Geis" <pgwipeout@gmail.com>,
	"Alistair Francis" <alistair@alistair23.me>,
	"Rob Herring" <robh+dt@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	"Sandy Huang" <hjc@rock-chips.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>
Subject: Re: [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver
Date: Wed, 8 Jun 2022 17:34:28 +0200	[thread overview]
Message-ID: <YqDBhMZCu1gKNFfs@phenom.ffwll.local> (raw)
In-Reply-To: <20220608144847.3ibr4buxcbmfj3al@houat>

On Wed, Jun 08, 2022 at 04:48:47PM +0200, Maxime Ripard wrote:
> On Wed, Jun 01, 2022 at 02:35:35PM +0200, Daniel Vetter wrote:
> > On Tue, May 31, 2022 at 10:58:35AM +0200, Maxime Ripard wrote:
> > > Hi Daniel,
> > > 
> > > Thanks for your feedback
> > > 
> > > On Wed, May 25, 2022 at 07:18:07PM +0200, Daniel Vetter wrote:
> > > > > > VBLANK Events and Asynchronous Commits
> > > > > > ======================================
> > > > > > When should the VBLANK event complete? When the pixels have been blitted
> > > > > > to the kernel's shadow buffer? When the first frame of the waveform is
> > > > > > sent to the panel? When the last frame is sent to the panel?
> > > > > > 
> > > > > > Currently, the driver is taking the first option, letting
> > > > > > drm_atomic_helper_fake_vblank() send the VBLANK event without waiting og
> > > > > > the refresh thread. This is the only way I was able to get good
> > > > > > performance with existing userspace.
> > > > > 
> > > > > I've been having the same kind of discussions in private lately, so I'm
> > > > > interested by the answer as well :)
> > > > > 
> > > > > It would be worth looking into the SPI/I2C panels for this, since it's
> > > > > basically the same case.
> > > > 
> > > > So it's maybe a bit misnamed and maybe kerneldocs aren't super clear (pls
> > > > help improve them), but there's two modes:
> > > > 
> > > > - drivers which have vblank, which might be somewhat variable (VRR) or
> > > >   become simulated (self-refresh panels), but otherwise is a more-or-less
> > > >   regular clock. For this case the atomic commit event must match the
> > > >   vblank events exactly (frame count and timestamp)
> > > 
> > > Part of my interrogation there is do we have any kind of expectation
> > > on whether or not, when we commit, the next vblank is going to be the
> > > one matching that commit or we're allowed to defer it by an arbitrary
> > > number of frames (provided that the frame count and timestamps are
> > > correct) ?
> > 
> > In general yes, but there's no guarantee. The only guarante we give for
> > drivers with vblank counters is that if you receive a vblank event (flip
> > complete or vblank event) for frame #n, then an immediate flip/atomic
> > ioctl call will display earliest for frame #n+1.
> > 
> > Also usually you should be able to hit #n+1, but even today with fun stuff
> > like self refresh panels getting out of self refresh mode might take a bit
> > more than a few frames, and so you might end up being late. But otoh if
> > you just do a page flip loop then on average (after the crtc is fully
> > resumed) you should be able to update at vrefresh rate exactly.
> 
> I had more the next item in mind there: if we were to write something in
> the kernel that would transparently behave like a full-blown KMS driver,
> but would pipe the commits through a KMS writeback driver before sending
> them to our SPI panel, we would always be at best two vblanks late.
> 
> So this would mean that userspace would do a page flip, get a first
> vblank, but the actual vblank for that commit would be the next one (at
> best), consistently.
> 
> > > > - drivers which don't have vblank at all, mostly these are i2c/spi panels
> > > >   or virtual hw and stuff like that. In this case the event simply happens
> > > >   when the driver is done with refresh/upload, and the frame count should
> > > >   be zero (since it's meaningless).
> > > > 
> > > > Unfortuantely the helper to dtrt has fake_vblank in it's name, maybe
> > > > should be renamed to no_vblank or so (the various flags that control it
> > > > are a bit better named).
> > > > 
> > > > Again the docs should explain it all, but maybe we should clarify them or
> > > > perhaps rename that helper to be more meaningful.
> > > > 
> > > > > > Blitting/Blending in Software
> > > > > > =============================
> > > > > > There are multiple layers to this topic (pun slightly intended):
> > > > > >  1) Today's userspace does not expect a grayscale framebuffer.
> > > > > >     Currently, the driver advertises XRGB8888 and converts to Y4
> > > > > >     in software. This seems to match other drivers (e.g. repaper).
> > > > > >
> > > > > >  2) Ignoring what userspace "wants", the closest existing format is
> > > > > >     DRM_FORMAT_R8. Geert sent a series[4] adding DRM_FORMAT_R1 through
> > > > > >     DRM_FORMAT_R4 (patch 9), which I believe are the "correct" formats
> > > > > >     to use.
> > > > > > 
> > > > > >  3) The RK356x SoCs have an "RGA" hardware block that can do the
> > > > > >     RGB-to-grayscale conversion, and also RGB-to-dithered-monochrome
> > > > > >     which is needed for animation/video. Currently this is exposed with
> > > > > >     a V4L2 platform driver. Can this be inserted into the pipeline in a
> > > > > >     way that is transparent to userspace? Or must some userspace library
> > > > > >     be responsible for setting up the RGA => EBC pipeline?
> > > > > 
> > > > > I'm very interested in this answer as well :)
> > > > > 
> > > > > I think the current consensus is that it's up to userspace to set this
> > > > > up though.
> > > > 
> > > > Yeah I think v4l mem2mem device is the answer for these, and then
> > > > userspace gets to set it all up.
> > > 
> > > I think the question wasn't really about where that driver should be,
> > > but more about who gets to set it up, and if the kernel could have
> > > some component to expose the formats supported by the converter, but
> > > whenever a commit is being done pipe that to the v4l2 device before
> > > doing a page flip.
> > > 
> > > We have a similar use-case for the RaspberryPi where the hardware
> > > codec will produce a framebuffer format that isn't standard. That
> > > format is understood by the display pipeline, and it can do
> > > writeback.
> > > 
> > > However, some people are using a separate display (like a SPI display
> > > supported by tinydrm) and we would still like to be able to output the
> > > decoded frames there.
> > > 
> > > Is there some way we could plumb things to "route" that buffer through
> > > the writeback engine to perform a format conversion before sending it
> > > over to the SPI display automatically?
> > 
> > Currently not transparently. Or at least no one has done that, and I'm not
> > sure that's really a great idea. With big gpus all that stuff is done with
> > separate command submission to the render side of things, and you can
> > fully pipeline all that with in/out-fences.
> > 
> > Doing that in the kms driver side in the kernel feels very wrong to me :-/
> 
> So I guess what you're saying is that there's a close to 0% chance of it
> being accepted if we were to come up with such an architecture?

Yup.

I think the only exception is if you have a multi-region memory manager
using ttm (or hand-rolled, but please don't), where we first have to move
the buffer into the right region before it can be scanned out. And that's
generally done with a copy engine, for performance reasons.

But that copy engine is really just a very dumb (but fast!) memcpy, and
doesn't do any format conversion or stride/orientation changes like a
full-blown blitter engine (or mem2mem in v4l speak) can do.

So if it's really just memory management then I think it's fine, but
anything beyond that is a no imo.

Now for an overall full-featured stack we clearly need that, and it would
be great if there's some common userspace libraries for hosting such code.
But thus far all attempts have fallen short :-/ Which I guess is another
indicator that we really shouldn't try to solve this problem in a generic
fashion, and hence really shouldn't try to solve it with magic behind the
generic kms interface in the kernel.

For even more context I do think my old "why is 2d so hard" blogpost rant
still applies:

https://blog.ffwll.ch/2018/08/no-2d-in-drm.html

The "why no 2d api for the more limited problem of handling framebuffers"
is really just a small, but not any less complex, subset of that bigger
conundrum.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel@ffwll.ch>
To: Maxime Ripard <maxime@cerno.tech>
Cc: "Samuel Holland" <samuel@sholland.org>,
	"Heiko Stübner" <heiko@sntech.de>,
	"Sandy Huang" <hjc@rock-chips.com>,
	dri-devel@lists.freedesktop.org,
	linux-rockchip@lists.infradead.org,
	"Alistair Francis" <alistair@alistair23.me>,
	"Ondřej Jirman" <x@xff.cz>,
	"Andreas Kemnade" <andreas@kemnade.info>,
	"David Airlie" <airlied@linux.ie>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>,
	"Liang Chen" <cl@rock-chips.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Michael Riesch" <michael.riesch@wolfvision.net>,
	"Nicolas Frattaroli" <frattaroli.nicolas@gmail.com>,
	"Peter Geis" <pgwipeout@gmail.com>,
	"Rob Herring" <robh+dt@kernel.org>,
	"Sam Ravnborg" <sam@ravnborg.org>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver
Date: Wed, 8 Jun 2022 17:34:28 +0200	[thread overview]
Message-ID: <YqDBhMZCu1gKNFfs@phenom.ffwll.local> (raw)
In-Reply-To: <20220608144847.3ibr4buxcbmfj3al@houat>

On Wed, Jun 08, 2022 at 04:48:47PM +0200, Maxime Ripard wrote:
> On Wed, Jun 01, 2022 at 02:35:35PM +0200, Daniel Vetter wrote:
> > On Tue, May 31, 2022 at 10:58:35AM +0200, Maxime Ripard wrote:
> > > Hi Daniel,
> > > 
> > > Thanks for your feedback
> > > 
> > > On Wed, May 25, 2022 at 07:18:07PM +0200, Daniel Vetter wrote:
> > > > > > VBLANK Events and Asynchronous Commits
> > > > > > ======================================
> > > > > > When should the VBLANK event complete? When the pixels have been blitted
> > > > > > to the kernel's shadow buffer? When the first frame of the waveform is
> > > > > > sent to the panel? When the last frame is sent to the panel?
> > > > > > 
> > > > > > Currently, the driver is taking the first option, letting
> > > > > > drm_atomic_helper_fake_vblank() send the VBLANK event without waiting og
> > > > > > the refresh thread. This is the only way I was able to get good
> > > > > > performance with existing userspace.
> > > > > 
> > > > > I've been having the same kind of discussions in private lately, so I'm
> > > > > interested by the answer as well :)
> > > > > 
> > > > > It would be worth looking into the SPI/I2C panels for this, since it's
> > > > > basically the same case.
> > > > 
> > > > So it's maybe a bit misnamed and maybe kerneldocs aren't super clear (pls
> > > > help improve them), but there's two modes:
> > > > 
> > > > - drivers which have vblank, which might be somewhat variable (VRR) or
> > > >   become simulated (self-refresh panels), but otherwise is a more-or-less
> > > >   regular clock. For this case the atomic commit event must match the
> > > >   vblank events exactly (frame count and timestamp)
> > > 
> > > Part of my interrogation there is do we have any kind of expectation
> > > on whether or not, when we commit, the next vblank is going to be the
> > > one matching that commit or we're allowed to defer it by an arbitrary
> > > number of frames (provided that the frame count and timestamps are
> > > correct) ?
> > 
> > In general yes, but there's no guarantee. The only guarante we give for
> > drivers with vblank counters is that if you receive a vblank event (flip
> > complete or vblank event) for frame #n, then an immediate flip/atomic
> > ioctl call will display earliest for frame #n+1.
> > 
> > Also usually you should be able to hit #n+1, but even today with fun stuff
> > like self refresh panels getting out of self refresh mode might take a bit
> > more than a few frames, and so you might end up being late. But otoh if
> > you just do a page flip loop then on average (after the crtc is fully
> > resumed) you should be able to update at vrefresh rate exactly.
> 
> I had more the next item in mind there: if we were to write something in
> the kernel that would transparently behave like a full-blown KMS driver,
> but would pipe the commits through a KMS writeback driver before sending
> them to our SPI panel, we would always be at best two vblanks late.
> 
> So this would mean that userspace would do a page flip, get a first
> vblank, but the actual vblank for that commit would be the next one (at
> best), consistently.
> 
> > > > - drivers which don't have vblank at all, mostly these are i2c/spi panels
> > > >   or virtual hw and stuff like that. In this case the event simply happens
> > > >   when the driver is done with refresh/upload, and the frame count should
> > > >   be zero (since it's meaningless).
> > > > 
> > > > Unfortuantely the helper to dtrt has fake_vblank in it's name, maybe
> > > > should be renamed to no_vblank or so (the various flags that control it
> > > > are a bit better named).
> > > > 
> > > > Again the docs should explain it all, but maybe we should clarify them or
> > > > perhaps rename that helper to be more meaningful.
> > > > 
> > > > > > Blitting/Blending in Software
> > > > > > =============================
> > > > > > There are multiple layers to this topic (pun slightly intended):
> > > > > >  1) Today's userspace does not expect a grayscale framebuffer.
> > > > > >     Currently, the driver advertises XRGB8888 and converts to Y4
> > > > > >     in software. This seems to match other drivers (e.g. repaper).
> > > > > >
> > > > > >  2) Ignoring what userspace "wants", the closest existing format is
> > > > > >     DRM_FORMAT_R8. Geert sent a series[4] adding DRM_FORMAT_R1 through
> > > > > >     DRM_FORMAT_R4 (patch 9), which I believe are the "correct" formats
> > > > > >     to use.
> > > > > > 
> > > > > >  3) The RK356x SoCs have an "RGA" hardware block that can do the
> > > > > >     RGB-to-grayscale conversion, and also RGB-to-dithered-monochrome
> > > > > >     which is needed for animation/video. Currently this is exposed with
> > > > > >     a V4L2 platform driver. Can this be inserted into the pipeline in a
> > > > > >     way that is transparent to userspace? Or must some userspace library
> > > > > >     be responsible for setting up the RGA => EBC pipeline?
> > > > > 
> > > > > I'm very interested in this answer as well :)
> > > > > 
> > > > > I think the current consensus is that it's up to userspace to set this
> > > > > up though.
> > > > 
> > > > Yeah I think v4l mem2mem device is the answer for these, and then
> > > > userspace gets to set it all up.
> > > 
> > > I think the question wasn't really about where that driver should be,
> > > but more about who gets to set it up, and if the kernel could have
> > > some component to expose the formats supported by the converter, but
> > > whenever a commit is being done pipe that to the v4l2 device before
> > > doing a page flip.
> > > 
> > > We have a similar use-case for the RaspberryPi where the hardware
> > > codec will produce a framebuffer format that isn't standard. That
> > > format is understood by the display pipeline, and it can do
> > > writeback.
> > > 
> > > However, some people are using a separate display (like a SPI display
> > > supported by tinydrm) and we would still like to be able to output the
> > > decoded frames there.
> > > 
> > > Is there some way we could plumb things to "route" that buffer through
> > > the writeback engine to perform a format conversion before sending it
> > > over to the SPI display automatically?
> > 
> > Currently not transparently. Or at least no one has done that, and I'm not
> > sure that's really a great idea. With big gpus all that stuff is done with
> > separate command submission to the render side of things, and you can
> > fully pipeline all that with in/out-fences.
> > 
> > Doing that in the kms driver side in the kernel feels very wrong to me :-/
> 
> So I guess what you're saying is that there's a close to 0% chance of it
> being accepted if we were to come up with such an architecture?

Yup.

I think the only exception is if you have a multi-region memory manager
using ttm (or hand-rolled, but please don't), where we first have to move
the buffer into the right region before it can be scanned out. And that's
generally done with a copy engine, for performance reasons.

But that copy engine is really just a very dumb (but fast!) memcpy, and
doesn't do any format conversion or stride/orientation changes like a
full-blown blitter engine (or mem2mem in v4l speak) can do.

So if it's really just memory management then I think it's fine, but
anything beyond that is a no imo.

Now for an overall full-featured stack we clearly need that, and it would
be great if there's some common userspace libraries for hosting such code.
But thus far all attempts have fallen short :-/ Which I guess is another
indicator that we really shouldn't try to solve this problem in a generic
fashion, and hence really shouldn't try to solve it with magic behind the
generic kms interface in the kernel.

For even more context I do think my old "why is 2d so hard" blogpost rant
still applies:

https://blog.ffwll.ch/2018/08/no-2d-in-drm.html

The "why no 2d api for the more limited problem of handling framebuffers"
is really just a small, but not any less complex, subset of that bigger
conundrum.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel@ffwll.ch>
To: Maxime Ripard <maxime@cerno.tech>
Cc: "Samuel Holland" <samuel@sholland.org>,
	"Heiko Stübner" <heiko@sntech.de>,
	"Sandy Huang" <hjc@rock-chips.com>,
	dri-devel@lists.freedesktop.org,
	linux-rockchip@lists.infradead.org,
	"Alistair Francis" <alistair@alistair23.me>,
	"Ondřej Jirman" <x@xff.cz>,
	"Andreas Kemnade" <andreas@kemnade.info>,
	"David Airlie" <airlied@linux.ie>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>,
	"Liang Chen" <cl@rock-chips.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Michael Riesch" <michael.riesch@wolfvision.net>,
	"Nicolas Frattaroli" <frattaroli.nicolas@gmail.com>,
	"Peter Geis" <pgwipeout@gmail.com>,
	"Rob Herring" <robh+dt@kernel.org>,
	"Sam Ravnborg" <sam@ravnborg.org>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver
Date: Wed, 8 Jun 2022 17:34:28 +0200	[thread overview]
Message-ID: <YqDBhMZCu1gKNFfs@phenom.ffwll.local> (raw)
In-Reply-To: <20220608144847.3ibr4buxcbmfj3al@houat>

On Wed, Jun 08, 2022 at 04:48:47PM +0200, Maxime Ripard wrote:
> On Wed, Jun 01, 2022 at 02:35:35PM +0200, Daniel Vetter wrote:
> > On Tue, May 31, 2022 at 10:58:35AM +0200, Maxime Ripard wrote:
> > > Hi Daniel,
> > > 
> > > Thanks for your feedback
> > > 
> > > On Wed, May 25, 2022 at 07:18:07PM +0200, Daniel Vetter wrote:
> > > > > > VBLANK Events and Asynchronous Commits
> > > > > > ======================================
> > > > > > When should the VBLANK event complete? When the pixels have been blitted
> > > > > > to the kernel's shadow buffer? When the first frame of the waveform is
> > > > > > sent to the panel? When the last frame is sent to the panel?
> > > > > > 
> > > > > > Currently, the driver is taking the first option, letting
> > > > > > drm_atomic_helper_fake_vblank() send the VBLANK event without waiting og
> > > > > > the refresh thread. This is the only way I was able to get good
> > > > > > performance with existing userspace.
> > > > > 
> > > > > I've been having the same kind of discussions in private lately, so I'm
> > > > > interested by the answer as well :)
> > > > > 
> > > > > It would be worth looking into the SPI/I2C panels for this, since it's
> > > > > basically the same case.
> > > > 
> > > > So it's maybe a bit misnamed and maybe kerneldocs aren't super clear (pls
> > > > help improve them), but there's two modes:
> > > > 
> > > > - drivers which have vblank, which might be somewhat variable (VRR) or
> > > >   become simulated (self-refresh panels), but otherwise is a more-or-less
> > > >   regular clock. For this case the atomic commit event must match the
> > > >   vblank events exactly (frame count and timestamp)
> > > 
> > > Part of my interrogation there is do we have any kind of expectation
> > > on whether or not, when we commit, the next vblank is going to be the
> > > one matching that commit or we're allowed to defer it by an arbitrary
> > > number of frames (provided that the frame count and timestamps are
> > > correct) ?
> > 
> > In general yes, but there's no guarantee. The only guarante we give for
> > drivers with vblank counters is that if you receive a vblank event (flip
> > complete or vblank event) for frame #n, then an immediate flip/atomic
> > ioctl call will display earliest for frame #n+1.
> > 
> > Also usually you should be able to hit #n+1, but even today with fun stuff
> > like self refresh panels getting out of self refresh mode might take a bit
> > more than a few frames, and so you might end up being late. But otoh if
> > you just do a page flip loop then on average (after the crtc is fully
> > resumed) you should be able to update at vrefresh rate exactly.
> 
> I had more the next item in mind there: if we were to write something in
> the kernel that would transparently behave like a full-blown KMS driver,
> but would pipe the commits through a KMS writeback driver before sending
> them to our SPI panel, we would always be at best two vblanks late.
> 
> So this would mean that userspace would do a page flip, get a first
> vblank, but the actual vblank for that commit would be the next one (at
> best), consistently.
> 
> > > > - drivers which don't have vblank at all, mostly these are i2c/spi panels
> > > >   or virtual hw and stuff like that. In this case the event simply happens
> > > >   when the driver is done with refresh/upload, and the frame count should
> > > >   be zero (since it's meaningless).
> > > > 
> > > > Unfortuantely the helper to dtrt has fake_vblank in it's name, maybe
> > > > should be renamed to no_vblank or so (the various flags that control it
> > > > are a bit better named).
> > > > 
> > > > Again the docs should explain it all, but maybe we should clarify them or
> > > > perhaps rename that helper to be more meaningful.
> > > > 
> > > > > > Blitting/Blending in Software
> > > > > > =============================
> > > > > > There are multiple layers to this topic (pun slightly intended):
> > > > > >  1) Today's userspace does not expect a grayscale framebuffer.
> > > > > >     Currently, the driver advertises XRGB8888 and converts to Y4
> > > > > >     in software. This seems to match other drivers (e.g. repaper).
> > > > > >
> > > > > >  2) Ignoring what userspace "wants", the closest existing format is
> > > > > >     DRM_FORMAT_R8. Geert sent a series[4] adding DRM_FORMAT_R1 through
> > > > > >     DRM_FORMAT_R4 (patch 9), which I believe are the "correct" formats
> > > > > >     to use.
> > > > > > 
> > > > > >  3) The RK356x SoCs have an "RGA" hardware block that can do the
> > > > > >     RGB-to-grayscale conversion, and also RGB-to-dithered-monochrome
> > > > > >     which is needed for animation/video. Currently this is exposed with
> > > > > >     a V4L2 platform driver. Can this be inserted into the pipeline in a
> > > > > >     way that is transparent to userspace? Or must some userspace library
> > > > > >     be responsible for setting up the RGA => EBC pipeline?
> > > > > 
> > > > > I'm very interested in this answer as well :)
> > > > > 
> > > > > I think the current consensus is that it's up to userspace to set this
> > > > > up though.
> > > > 
> > > > Yeah I think v4l mem2mem device is the answer for these, and then
> > > > userspace gets to set it all up.
> > > 
> > > I think the question wasn't really about where that driver should be,
> > > but more about who gets to set it up, and if the kernel could have
> > > some component to expose the formats supported by the converter, but
> > > whenever a commit is being done pipe that to the v4l2 device before
> > > doing a page flip.
> > > 
> > > We have a similar use-case for the RaspberryPi where the hardware
> > > codec will produce a framebuffer format that isn't standard. That
> > > format is understood by the display pipeline, and it can do
> > > writeback.
> > > 
> > > However, some people are using a separate display (like a SPI display
> > > supported by tinydrm) and we would still like to be able to output the
> > > decoded frames there.
> > > 
> > > Is there some way we could plumb things to "route" that buffer through
> > > the writeback engine to perform a format conversion before sending it
> > > over to the SPI display automatically?
> > 
> > Currently not transparently. Or at least no one has done that, and I'm not
> > sure that's really a great idea. With big gpus all that stuff is done with
> > separate command submission to the render side of things, and you can
> > fully pipeline all that with in/out-fences.
> > 
> > Doing that in the kms driver side in the kernel feels very wrong to me :-/
> 
> So I guess what you're saying is that there's a close to 0% chance of it
> being accepted if we were to come up with such an architecture?

Yup.

I think the only exception is if you have a multi-region memory manager
using ttm (or hand-rolled, but please don't), where we first have to move
the buffer into the right region before it can be scanned out. And that's
generally done with a copy engine, for performance reasons.

But that copy engine is really just a very dumb (but fast!) memcpy, and
doesn't do any format conversion or stride/orientation changes like a
full-blown blitter engine (or mem2mem in v4l speak) can do.

So if it's really just memory management then I think it's fine, but
anything beyond that is a no imo.

Now for an overall full-featured stack we clearly need that, and it would
be great if there's some common userspace libraries for hosting such code.
But thus far all attempts have fallen short :-/ Which I guess is another
indicator that we really shouldn't try to solve this problem in a generic
fashion, and hence really shouldn't try to solve it with magic behind the
generic kms interface in the kernel.

For even more context I do think my old "why is 2d so hard" blogpost rant
still applies:

https://blog.ffwll.ch/2018/08/no-2d-in-drm.html

The "why no 2d api for the more limited problem of handling framebuffers"
is really just a small, but not any less complex, subset of that bigger
conundrum.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-06-08 15:34 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13 22:19 [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver Samuel Holland
2022-04-13 22:19 ` Samuel Holland
2022-04-13 22:19 ` Samuel Holland
2022-04-13 22:19 ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 01/16] drm: Add a helper library for EPD drivers Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 02/16] dt-bindings: display: rockchip: Add EBC binding Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-14  8:15   ` Andreas Kemnade
2022-04-14  8:15     ` Andreas Kemnade
2022-04-14  8:15     ` Andreas Kemnade
2022-04-14  8:15     ` Andreas Kemnade
2022-04-15  3:00     ` Samuel Holland
2022-04-15  3:00       ` Samuel Holland
2022-04-15  3:00       ` Samuel Holland
2022-04-15  3:00       ` Samuel Holland
2022-04-15 11:45       ` Andreas Kemnade
2022-04-15 11:45         ` Andreas Kemnade
2022-04-15 11:45         ` Andreas Kemnade
2022-04-15 11:45         ` Andreas Kemnade
2022-04-13 22:19 ` [RFC PATCH 03/16] drm/rockchip: Add EBC platform driver skeleton Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 04/16] drm/rockchip: ebc: Add DRM " Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 05/16] drm/rockchip: ebc: Add CRTC mode setting Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 06/16] drm/rockchip: ebc: Add CRTC refresh thread Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 07/16] drm/rockchip: ebc: Add CRTC buffer management Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 08/16] drm/rockchip: ebc: Add LUT loading Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 09/16] drm/rockchip: ebc: Implement global refreshes Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 10/16] drm/rockchip: ebc: Implement partial refreshes Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 11/16] drm/rockchip: ebc: Enable diff mode for " Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 12/16] drm/rockchip: ebc: Add support for direct mode Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 13/16] drm/rockchip: ebc: Add a panel reflection option Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 14/16] drm/panel-simple: Add eInk ED103TC2 Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 15/16] arm64: dts: rockchip: rk356x: Add EBC node Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19 ` [RFC PATCH 16/16] [DO NOT MERGE] arm64: dts: rockchip: pinenote: Enable EBC display pipeline Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-13 22:19   ` Samuel Holland
2022-04-14  8:50 ` [RFC PATCH 00/16] drm/rockchip: Rockchip EBC ("E-Book Controller") display driver Maxime Ripard
2022-04-14  8:50   ` Maxime Ripard
2022-04-14  8:50   ` Maxime Ripard
2022-04-14  8:50   ` Maxime Ripard
2022-05-25 17:18   ` Daniel Vetter
2022-05-25 17:18     ` Daniel Vetter
2022-05-25 17:18     ` Daniel Vetter
2022-05-25 17:18     ` Daniel Vetter
2022-05-31  8:58     ` Maxime Ripard
2022-05-31  8:58       ` Maxime Ripard
2022-05-31  8:58       ` Maxime Ripard
2022-06-01 12:35       ` Daniel Vetter
2022-06-01 12:35         ` Daniel Vetter
2022-06-01 12:35         ` Daniel Vetter
2022-06-01 12:35         ` Daniel Vetter
2022-06-08 14:48         ` Maxime Ripard
2022-06-08 14:48           ` Maxime Ripard
2022-06-08 14:48           ` Maxime Ripard
2022-06-08 15:34           ` Daniel Vetter [this message]
2022-06-08 15:34             ` Daniel Vetter
2022-06-08 15:34             ` Daniel Vetter
2022-06-08 15:34             ` Daniel Vetter
2022-04-21  6:43 ` Andreas Kemnade
2022-04-21  6:43   ` Andreas Kemnade
2022-04-21  6:43   ` Andreas Kemnade
2022-04-21  6:43   ` Andreas Kemnade
2022-04-21  7:10   ` Nicolas Frattaroli
2022-04-21  7:10     ` Nicolas Frattaroli
2022-04-21  7:10     ` Nicolas Frattaroli
2022-04-21  7:10     ` Nicolas Frattaroli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YqDBhMZCu1gKNFfs@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=airlied@linux.ie \
    --cc=alistair@alistair23.me \
    --cc=andreas@kemnade.info \
    --cc=cl@rock-chips.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=frattaroli.nicolas@gmail.com \
    --cc=geert@linux-m68k.org \
    --cc=heiko@sntech.de \
    --cc=hjc@rock-chips.com \
    --cc=krzk+dt@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rockchip@lists.infradead.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=maxime@cerno.tech \
    --cc=michael.riesch@wolfvision.net \
    --cc=pgwipeout@gmail.com \
    --cc=robh+dt@kernel.org \
    --cc=sam@ravnborg.org \
    --cc=samuel@sholland.org \
    --cc=thierry.reding@gmail.com \
    --cc=tzimmermann@suse.de \
    --cc=x@xff.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.