dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm: support up to 128 drm devices
@ 2023-06-30 11:56 James Zhu
  2023-07-14 10:31 ` Simon Ser
  0 siblings, 1 reply; 9+ messages in thread
From: James Zhu @ 2023-06-30 11:56 UTC (permalink / raw)
  To: dri-devel; +Cc: jamesz, christian.koenig, Christian König

From: Christian König <ckoenig.leichtzumerken@gmail.com>

This makes room for up to 128 DRM devices.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
---
 drivers/gpu/drm/drm_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 73b845a75d52..0d55c64444f5 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -137,7 +137,7 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
 		r = idr_alloc(&drm_minors_idr,
 			NULL,
 			64 * type,
-			64 * (type + 1),
+			64 * (type + 2),
 			GFP_NOWAIT);
 		spin_unlock_irqrestore(&drm_minor_lock, flags);
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-06-30 11:56 [PATCH] drm: support up to 128 drm devices James Zhu
@ 2023-07-14 10:31 ` Simon Ser
  2023-07-14 10:49   ` Simon Ser
  2023-07-17  7:30   ` Emil Velikov
  0 siblings, 2 replies; 9+ messages in thread
From: Simon Ser @ 2023-07-14 10:31 UTC (permalink / raw)
  To: James Zhu
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, christian.koenig

(cc Daniel Vetter and Pekka because this change has uAPI repercussions)

On Friday, June 30th, 2023 at 13:56, James Zhu <James.Zhu@amd.com> wrote:

> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> 
> This makes room for up to 128 DRM devices.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: James Zhu <James.Zhu@amd.com>
> ---
>  drivers/gpu/drm/drm_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 73b845a75d52..0d55c64444f5 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -137,7 +137,7 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
>  		r = idr_alloc(&drm_minors_idr,
>  			NULL,
>  			64 * type,
> -			64 * (type + 1),
> +			64 * (type + 2),

The type comes from this enum:

    enum drm_minor_type {
    	DRM_MINOR_PRIMARY,
    	DRM_MINOR_CONTROL,
    	DRM_MINOR_RENDER,
    	DRM_MINOR_ACCEL = 32,
    };

Before this patch, 0..63 are for primary, 64..127 are for control (never
exposed by the kernel), 128..191 are for render, 2048..2112 are for accel.
After this patch, 0..127 are for primary, 64..191 are for control (never
exposed by the kernel), 128..255 are for render, 2048..2176 are for accel.

I'm worried what might happen with old user-space, especially old libdrm.
When there are a lot of primary nodes, some would get detected as control
nodes, even if they aren't. Is this fine? This would at least be confusing
for information-gathering tools like drm_info. I'm not sure about other
consequences. Do we want to forever forbid the 64..127 range instead, so
that we have the guarantee that old libdrm never sees it?

I'm also worried about someone adding a new entry to the enum after
DRM_MINOR_RENDER (DRM_MINOR_ACCEL was specifically set to 32 so that new
node types could be added before). drm_minor_alloc() would blow up in this
case, because it'd use the 192..319 range, which overlaps with render.
I think a switch with explicit ranges (and WARN when an unknown type is
passed in) would be both easier to read and less risky.

>  			GFP_NOWAIT);
>  		spin_unlock_irqrestore(&drm_minor_lock, flags);
>  	}
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-14 10:31 ` Simon Ser
@ 2023-07-14 10:49   ` Simon Ser
  2023-07-17  7:30   ` Emil Velikov
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Ser @ 2023-07-14 10:49 UTC (permalink / raw)
  To: James Zhu
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, christian.koenig

On Friday, July 14th, 2023 at 12:31, Simon Ser <contact@emersion.fr> wrote:

> Before this patch, 0..63 are for primary, 64..127 are for control (never
> exposed by the kernel), 128..191 are for render, 2048..2112 are for accel.
> After this patch, 0..127 are for primary, 64..191 are for control (never
> exposed by the kernel), 128..255 are for render, 2048..2176 are for accel.

Correction: reading the code, accel is handled separately.

Additional find: the kernel creates sysfs control nodes because old
user-space reads them to figure out whether a device is DRIVER_MODESET.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-14 10:31 ` Simon Ser
  2023-07-14 10:49   ` Simon Ser
@ 2023-07-17  7:30   ` Emil Velikov
  2023-07-17  9:45     ` Simon Ser
  1 sibling, 1 reply; 9+ messages in thread
From: Emil Velikov @ 2023-07-17  7:30 UTC (permalink / raw)
  To: Simon Ser
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, James Zhu, christian.koenig

On Fri, 14 Jul 2023 at 11:32, Simon Ser <contact@emersion.fr> wrote:
>
> (cc Daniel Vetter and Pekka because this change has uAPI repercussions)
>
> On Friday, June 30th, 2023 at 13:56, James Zhu <James.Zhu@amd.com> wrote:
>
> > From: Christian König <ckoenig.leichtzumerken@gmail.com>
> >
> > This makes room for up to 128 DRM devices.
> >
> > Signed-off-by: Christian König <christian.koenig@amd.com>
> > Signed-off-by: James Zhu <James.Zhu@amd.com>
> > ---
> >  drivers/gpu/drm/drm_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 73b845a75d52..0d55c64444f5 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -137,7 +137,7 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> >               r = idr_alloc(&drm_minors_idr,
> >                       NULL,
> >                       64 * type,
> > -                     64 * (type + 1),
> > +                     64 * (type + 2),
>
> The type comes from this enum:
>
>     enum drm_minor_type {
>         DRM_MINOR_PRIMARY,
>         DRM_MINOR_CONTROL,
>         DRM_MINOR_RENDER,
>         DRM_MINOR_ACCEL = 32,
>     };
>
> Before this patch, 0..63 are for primary, 64..127 are for control (never
> exposed by the kernel), 128..191 are for render, 2048..2112 are for accel.
> After this patch, 0..127 are for primary, 64..191 are for control (never
> exposed by the kernel), 128..255 are for render, 2048..2176 are for accel.
>
> I'm worried what might happen with old user-space, especially old libdrm.

I also share the same concern. Although the bigger issue is not libdrm
- since we can update it and prod distributions to update it.
The biggest concern is software that cannot be rebuild/updated -
closed source and various open-source that has been abandoned.

As mentioned in the gitlab ticket - the current style of embedding the
uABI/uAPI in each and every application is not great IMHO. That is why
I've introduced the `drmGetDevices2` API, to consolidate most of the
chaos in a single place.

For going forward, here is one way we can shave this yak:
 - update libdrm to max 64 nodes
 - roll libdrm release, nag distributions to update to it // could be
folded with the next release below
 - update libdrm to use name driven type detection
 - roll libdrm release, nag distributions to update to it
 - once ^^ release hits most distributions, merge the above proposed
kernel patch
   - the commit message should explain the caveats and fixed libdrm version
   - we should be prepared to revert the commit, if it causes user
space regression - fix (see below) and re-introduce the kernel patch
1-2 releases later

In parallel to all the above, as a community we should collectively
audit and update open-source applications to the `drmDevices2` API.

As with other legacy (DRI1), this one will take some time but we can get there.

HTH
Emil

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-17  7:30   ` Emil Velikov
@ 2023-07-17  9:45     ` Simon Ser
  2023-07-17 13:24       ` Emil Velikov
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Ser @ 2023-07-17  9:45 UTC (permalink / raw)
  To: Emil Velikov
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, James Zhu, christian.koenig

On Monday, July 17th, 2023 at 09:30, Emil Velikov <emil.l.velikov@gmail.com> wrote:

> > I'm worried what might happen with old user-space, especially old libdrm.
> 
> I also share the same concern. Although the bigger issue is not libdrm
> - since we can update it and prod distributions to update it.
> The biggest concern is software that cannot be rebuild/updated -
> closed source and various open-source that has been abandoned.

Well. Now that we have Flatpak and AppImage and friends, we're really likely
to have old libdrm copies vendored all over the place, and these will stick
around essentially forever.

> For going forward, here is one way we can shave this yak:
>  - update libdrm to max 64 nodes
>  - roll libdrm release, nag distributions to update to it // could be
> folded with the next release below
>  - update libdrm to use name driven type detection
>  - roll libdrm release, nag distributions to update to it
>  - once ^^ release hits most distributions, merge the above proposed
> kernel patch
>    - the commit message should explain the caveats and fixed libdrm version
>    - we should be prepared to revert the commit, if it causes user
> space regression - fix (see below) and re-introduce the kernel patch
> 1-2 releases later

That sounds really scary to me. I'd really prefer to try not to break the
kernel uAPI here.

The kernel rule is "do not break user-space".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-17  9:45     ` Simon Ser
@ 2023-07-17 13:24       ` Emil Velikov
  2023-07-17 13:54         ` Simon Ser
  0 siblings, 1 reply; 9+ messages in thread
From: Emil Velikov @ 2023-07-17 13:24 UTC (permalink / raw)
  To: Simon Ser
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, James Zhu, christian.koenig

On Mon, 17 Jul 2023 at 10:45, Simon Ser <contact@emersion.fr> wrote:
>
> On Monday, July 17th, 2023 at 09:30, Emil Velikov <emil.l.velikov@gmail.com> wrote:
>
> > > I'm worried what might happen with old user-space, especially old libdrm.
> >
> > I also share the same concern. Although the bigger issue is not libdrm
> > - since we can update it and prod distributions to update it.
> > The biggest concern is software that cannot be rebuild/updated -
> > closed source and various open-source that has been abandoned.
>
> Well. Now that we have Flatpak and AppImage and friends, we're really likely
> to have old libdrm copies vendored all over the place, and these will stick
> around essentially forever.
>

The flatpak devs have been very helpful. So I'm pretty sure we can get
those updated - even for older flatpaks.
For AppImage, I have no experience.

> > For going forward, here is one way we can shave this yak:
> >  - update libdrm to max 64 nodes
> >  - roll libdrm release, nag distributions to update to it // could be
> > folded with the next release below
> >  - update libdrm to use name driven type detection
> >  - roll libdrm release, nag distributions to update to it
> >  - once ^^ release hits most distributions, merge the above proposed
> > kernel patch
> >    - the commit message should explain the caveats and fixed libdrm version
> >    - we should be prepared to revert the commit, if it causes user
> > space regression - fix (see below) and re-introduce the kernel patch
> > 1-2 releases later
>
> That sounds really scary to me. I'd really prefer to try not to break the
> kernel uAPI here.
>

With part in particular? Mind you I'm not particularly happy either,
since in essence it's like a controlled explosion.

> The kernel rule is "do not break user-space".

Yes, in a perfect world. In practice, there have been multiple kernel
changes breaking user-space. Some got reverted, some remained.
AFAICT the above will get us out of the sticky situation we're in with
the least amount of explosion.

If there is a concrete proposal, please go ahead and sorry if I've
missed it. I'm supposed to be off, having fun with family when I saw
this whole thing explode.

Small note: literally all the users I've seen will stop on a missing
node (card or render) aka if the kernel creates card0...63 and then
card200... then (hard wavy estimate) 80% of the apps will be broken.

HTH
Emil

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-17 13:24       ` Emil Velikov
@ 2023-07-17 13:54         ` Simon Ser
  2023-07-25  9:21           ` Emil Velikov
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Ser @ 2023-07-17 13:54 UTC (permalink / raw)
  To: Emil Velikov
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, James Zhu, christian.koenig

On Monday, July 17th, 2023 at 15:24, Emil Velikov <emil.l.velikov@gmail.com> wrote:

> > > For going forward, here is one way we can shave this yak:
> > >  - update libdrm to max 64 nodes
> > >  - roll libdrm release, nag distributions to update to it // could be
> > > folded with the next release below
> > >  - update libdrm to use name driven type detection
> > >  - roll libdrm release, nag distributions to update to it
> > >  - once ^^ release hits most distributions, merge the above proposed
> > > kernel patch
> > >    - the commit message should explain the caveats and fixed libdrm version
> > >    - we should be prepared to revert the commit, if it causes user
> > > space regression - fix (see below) and re-introduce the kernel patch
> > > 1-2 releases later
> >
> > That sounds really scary to me. I'd really prefer to try not to break the
> > kernel uAPI here.
> >
> 
> With part in particular? Mind you I'm not particularly happy either,
> since in essence it's like a controlled explosion.

I believe there are ways to extend the uAPI to support more devices without
breaking the uAPI. Michał Winiarski's patch for instance tried something to
this effect.

> > The kernel rule is "do not break user-space".
> 
> Yes, in a perfect world. In practice, there have been multiple kernel
> changes breaking user-space. Some got reverted, some remained.
> AFAICT the above will get us out of the sticky situation we're in with
> the least amount of explosion.
> 
> If there is a concrete proposal, please go ahead and sorry if I've
> missed it. I'm supposed to be off, having fun with family when I saw
> this whole thing explode.
> 
> Small note: literally all the users I've seen will stop on a missing
> node (card or render) aka if the kernel creates card0...63 and then
> card200... then (hard wavy estimate) 80% of the apps will be broken.

That's fine, because that's not a kernel regression. Supporting more than 64
devices is a new kernel feature, and if some user-space ignores the new nodes,
that's not a kernel regression. A regression only happens when a use-case which
works with an older kernel is broken by a newer kernel.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm: support up to 128 drm devices
  2023-07-17 13:54         ` Simon Ser
@ 2023-07-25  9:21           ` Emil Velikov
  0 siblings, 0 replies; 9+ messages in thread
From: Emil Velikov @ 2023-07-25  9:21 UTC (permalink / raw)
  To: Simon Ser
  Cc: jamesz, Pekka Paalanen, Christian König, dri-devel,
	wayland-devel, Daniel Vetter, James Zhu, christian.koenig

On Mon, 17 Jul 2023 at 14:54, Simon Ser <contact@emersion.fr> wrote:
>
> On Monday, July 17th, 2023 at 15:24, Emil Velikov <emil.l.velikov@gmail.com> wrote:
>
> > > > For going forward, here is one way we can shave this yak:
> > > >  - update libdrm to max 64 nodes
> > > >  - roll libdrm release, nag distributions to update to it // could be
> > > > folded with the next release below
> > > >  - update libdrm to use name driven type detection
> > > >  - roll libdrm release, nag distributions to update to it
> > > >  - once ^^ release hits most distributions, merge the above proposed
> > > > kernel patch
> > > >    - the commit message should explain the caveats and fixed libdrm version
> > > >    - we should be prepared to revert the commit, if it causes user
> > > > space regression - fix (see below) and re-introduce the kernel patch
> > > > 1-2 releases later
> > >
> > > That sounds really scary to me. I'd really prefer to try not to break the
> > > kernel uAPI here.
> > >
> >
> > With part in particular? Mind you I'm not particularly happy either,
> > since in essence it's like a controlled explosion.
>
> I believe there are ways to extend the uAPI to support more devices without
> breaking the uAPI. Michał Winiarski's patch for instance tried something to
> this effect.
>
> > > The kernel rule is "do not break user-space".
> >
> > Yes, in a perfect world. In practice, there have been multiple kernel
> > changes breaking user-space. Some got reverted, some remained.
> > AFAICT the above will get us out of the sticky situation we're in with
> > the least amount of explosion.
> >
> > If there is a concrete proposal, please go ahead and sorry if I've
> > missed it. I'm supposed to be off, having fun with family when I saw
> > this whole thing explode.
> >
> > Small note: literally all the users I've seen will stop on a missing
> > node (card or render) aka if the kernel creates card0...63 and then
> > card200... then (hard wavy estimate) 80% of the apps will be broken.
>
> That's fine, because that's not a kernel regression. Supporting more than 64
> devices is a new kernel feature, and if some user-space ignores the new nodes,
> that's not a kernel regression. A regression only happens when a use-case which
> works with an older kernel is broken by a newer kernel.

Won't this approach effectively hard-code/leak even more kernel uABI
into dozens of not hundreds of userspace projects? This does not sound
like a scalable solution IMHO.

I am 100% behind the "don't break userspace rule", alas very few
things in life are as black/white as your comments seem to suggest.
Thus I would suggest doing a bit of both or a compromise if you will.
Namely:
 - try the initial route outlined above
 - if there are (m)any fires, revert the kernel patch and opt for the
work by Michał

This has the benefit of fixing a bunch of the uABI abuses out there,
and leaking more uABI only on as-needed basis.

Side note: KDE folks have their own flatpak runtime and have been
quite open to backport libdrm/other fixes.

HTH
Emil

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] drm: support up to 128 drm devices
@ 2023-06-30 11:48 James Zhu
  0 siblings, 0 replies; 9+ messages in thread
From: James Zhu @ 2023-06-30 11:48 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König, jamesz, christian.koenig

From: Christian König <ckoenig.leichtzumerken@gmail.com>

This makes room for up to 128 DRM devices.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
---
 drivers/gpu/drm/drm_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 73b845a75d52..0d55c64444f5 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -137,7 +137,7 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
 		r = idr_alloc(&drm_minors_idr,
 			NULL,
 			64 * type,
-			64 * (type + 1),
+			64 * (type + 2),
 			GFP_NOWAIT);
 		spin_unlock_irqrestore(&drm_minor_lock, flags);
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-07-25  9:21 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-30 11:56 [PATCH] drm: support up to 128 drm devices James Zhu
2023-07-14 10:31 ` Simon Ser
2023-07-14 10:49   ` Simon Ser
2023-07-17  7:30   ` Emil Velikov
2023-07-17  9:45     ` Simon Ser
2023-07-17 13:24       ` Emil Velikov
2023-07-17 13:54         ` Simon Ser
2023-07-25  9:21           ` Emil Velikov
  -- strict thread matches above, loose matches on Subject: below --
2023-06-30 11:48 James Zhu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).