linux-next.vger.kernel.org archive mirror
* Re: Issue with DRM and "reimplement IDR and IDA using the radix tree"
       [not found] <fd3db206-8b3e-150f-a060-e88cc2f49606@nvidia.com>
@ 2016-12-14 14:08 ` Alexandre Courbot
  2016-12-16 16:16   ` Thierry Reding
  0 siblings, 1 reply; 3+ messages in thread
From: Alexandre Courbot @ 2016-12-14 14:08 UTC
  To: Matthew Wilcox, Stephen Rothwell
  Cc: Andrew Morton, linux-next, linux-kernel, dri-devel

Forgot to add the most relevant list for this issue (linux-next).

Stephen, maybe you will want to temporarily revert this patch until this
is cleared? This probably affects other users than DRM.

On 12/13/2016 04:14 PM, Alexandre Courbot wrote:
> Hi Matthew,
> 
> Trying the latest -next on the Jetson TK1 board (with two different DRM
> devices for display and render), I noticed that the GPU device probe
> always failed with error -ENOSPC. After investigating I figured out that
> this was due to the minor device allocation failing when a second DRM
> device is added.
> 
> More precisely, when drm_minor_alloc() is called with DRM_MINOR_PRIMARY
> (0) as argument for a second time, the call to idr_alloc() (which has a
> requested range of 0..64) fails instead of returning 1 as expected. Note
> that the first call is successful.
> 
> Reverting "reimplement IDR and IDA using the radix tree" on 20161213's
> next fixes the issue for me, suggesting a bug may have slipped in there.
> 
> Not sure how this could be fixed, so reporting the issue for now in case
> it is not known yet.
> 
> Cheers,
> Alex.
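
The allocation behaviour expected in the report above can be modelled in
a few lines. This is only a userspace sketch: ToyIdr and its methods are
hypothetical names, not the kernel's struct idr or idr_alloc(), and it
assumes that an allocation returns the lowest free ID in [start, end)
and -ENOSPC once the range is exhausted. Under that assumption, two
requests for the range 0..64 should yield 0 and then 1.

```python
import errno


class ToyIdr:
    """Hypothetical stand-in for an ID allocator; not the kernel's struct idr."""

    def __init__(self):
        self._used = set()

    def alloc(self, start, end):
        """Return the lowest free ID in [start, end), or -ENOSPC if none is free."""
        for ident in range(start, end):
            if ident not in self._used:
                self._used.add(ident)
                return ident
        return -errno.ENOSPC

    def remove(self, ident):
        """Release an ID so that a later alloc() can hand it out again."""
        self._used.discard(ident)


if __name__ == "__main__":
    minors = ToyIdr()
    first = minors.alloc(0, 64)   # first device in the 0..64 range
    second = minors.alloc(0, 64)  # second device in the same range
    print("first:", first, "second:", second)
```

In this model the second call only fails once all 64 IDs in the range
are taken, which is why a failure on the second allocation points at
the allocator rather than at the caller.
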
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: Issue with DRM and "reimplement IDR and IDA using the radix tree"
  2016-12-14 14:08 ` Issue with DRM and "reimplement IDR and IDA using the radix tree" Alexandre Courbot
@ 2016-12-16 16:16   ` Thierry Reding
  2016-12-17  6:47     ` Alexandre Courbot
  0 siblings, 1 reply; 3+ messages in thread
From: Thierry Reding @ 2016-12-16 16:16 UTC
  To: Matthew Wilcox, Stephen Rothwell, Andrew Morton
  Cc: Alexandre Courbot, linux-next, linux-kernel, dri-devel

On Wed, Dec 14, 2016 at 11:08:20PM +0900, Alexandre Courbot wrote:
> Forgot to add the most relevant list for this issue (linux-next).
> 
> Stephen, maybe you will want to temporarily revert this patch until this
> is cleared? This probably affects other users than DRM.
> 
> On 12/13/2016 04:14 PM, Alexandre Courbot wrote:
> > Hi Matthew,
> > 
> > Trying the latest -next on the Jetson TK1 board (with two different DRM
> > devices for display and render), I noticed that the GPU device probe
> > always failed with error -ENOSPC. After investigating I figured out that
> > this was due to the minor device allocation failing when a second DRM
> > device is added.
> > 
> > More precisely, when drm_minor_alloc() is called with DRM_MINOR_PRIMARY
> > (0) as argument for a second time, the call to idr_alloc() (which has a
> > requested range of 0..64) fails instead of returning 1 as expected. Note
> > that the first call is successful.
> > 
> > Reverting "reimplement IDR and IDA using the radix tree" on 20161213's
> > next fixes the issue for me, suggesting a bug may have slipped in there.
> > 
> > Not sure how this could be fixed, so reporting the issue for now in case
> > it is not known yet.

I can confirm Alex' findings, though the symptoms seem to be slightly
different, which may be related to me testing on next-20161216 rather
than next-20161213.

What I'm seeing is that all drivers get probed correctly, but when an
application tries to open the DRM device files (/dev/dri/card0 in this
case), then all devices of a given minor type disappear. So in my case
upon boot I get this:

	# ls -l /dev/dri/
	total 0
	crw-rw---- 1 root video 226,   0 Dec 16 15:59 card0
	crw-rw---- 1 root video 226,   1 Dec 16 15:59 card1
	crw-rw---- 1 root video 226, 128 Dec 16 15:59 renderD128
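
The minor numbers in this listing line up with the 0..64 range quoted
for DRM_MINOR_PRIMARY if each minor type owns its own block of 64
numbers. That layout is an assumption extrapolated from the thread (0
and 1 for card*, 128 for renderD128), and the helper names below are
hypothetical. A small sketch that reads the numbers back from a device
node and computes which 64-wide block a minor falls into:

```python
import os
import stat

# Assumed width of each per-type block, taken from the 0..64 range in the report.
MINORS_PER_TYPE = 64


def minor_slot(minor):
    """Return the index of the 64-wide block that a minor number falls into."""
    return minor // MINORS_PER_TYPE


def node_numbers(path):
    """Return (major, minor) for a character device node."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    dri = "/dev/dri"
    if os.path.isdir(dri):
        for name in sorted(os.listdir(dri)):
            path = os.path.join(dri, name)
            try:
                major, minor = node_numbers(path)
            except (OSError, ValueError):
                continue  # skip subdirectories and anything that is not a device node
            print(f"{name}: major {major}, minor {minor}, block {minor_slot(minor)}")
```

With the listing above, card0 and card1 would land in block 0 and
renderD128 in block 2.
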

The modetest program from libdrm is then unable to open any devices:

	# modetest
	trying to open device 'i915'...failed
	trying to open device 'amdgpu'...failed
	trying to open device 'radeon'...failed
	trying to open device 'nouveau'...failed
	trying to open device 'vmwgfx'...failed
	trying to open device 'omapdrm'...failed
	trying to open device 'exynos'...failed
	trying to open device 'tilcdc'...failed
	trying to open device 'msm'...failed
	trying to open device 'sti'...failed
	trying to open device 'tegra'...failed
	trying to open device 'imx-drm'...failed
	trying to open device 'rockchip'...failed
	trying to open device 'atmel-hlcdc'...failed
	trying to open device 'fsl-dcu-drm'...failed
	trying to open device 'vc4'...failed
	trying to open device 'virtio_gpu'...failed
	trying to open device 'mediatek'...failed
	no device found

And after that all of the primary minors are gone:

	# ls -l /dev/dri/
	total 0
	crw-rw---- 1 root video 226, 128 Dec 16 15:59 renderD128

Strangely, this can't be reproduced for renderD128. Explicitly passing
the renderD128 node to modetest gives:

	# modetest -D /dev/dri/renderD128 
	trying to open device 'i915'...failed
	trying to open device 'amdgpu'...failed
	trying to open device 'radeon'...failed
	trying to open device 'nouveau'...failed
	trying to open device 'vmwgfx'...failed
	trying to open device 'omapdrm'...failed
	trying to open device 'exynos'...failed
	trying to open device 'tilcdc'...failed
	trying to open device 'msm'...failed
	trying to open device 'sti'...failed
	trying to open device 'tegra'...failed
	trying to open device 'imx-drm'...failed
	trying to open device 'rockchip'...failed
	trying to open device 'atmel-hlcdc'...failed
	trying to open device 'fsl-dcu-drm'...failed
	trying to open device 'vc4'...failed
	trying to open device 'virtio_gpu'...failed
	trying to open device 'mediatek'...failed
	no device found

It isn't finding anything either, but the renderD128 node at least
doesn't disappear like the card* nodes:

	# ls -l /dev/dri/
	total 0
	crw-rw---- 1 root video 226, 128 Dec 16 15:59 renderD128

Oddly enough, though:

	# cat /dev/dri/renderD128
	cat: /dev/dri/renderD128: No such device

Something's really weird.
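
For anyone trying to reproduce this without modetest, the open failure
can be probed directly. The sketch below is not part of libdrm, and
probe_open is a hypothetical helper. It opens each node and reports the
errno; the "No such device" message printed by cat corresponds to
ENODEV, whereas a node that has vanished from /dev/dri would give ENOENT
instead.

```python
import errno
import os


def probe_open(path):
    """Try to open path read-write; return 0 on success or the errno on failure."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


if __name__ == "__main__":
    for node in ("/dev/dri/card0", "/dev/dri/card1", "/dev/dri/renderD128"):
        err = probe_open(node)
        # errno.errorcode maps numeric codes to names such as 'ENODEV'.
        print(node, "ok" if err == 0 else errno.errorcode.get(err, err))
```

Note that, per the report above, running this against the card* nodes
on an affected kernel may itself make them disappear.
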

Reverting b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree"),
which is the version in today's linux-next of the patch that Alex had
pinpointed, restores everything to normal.

Andrew, Stephen, can we drop this patch for now?

Matthew, since I have an easy and reliable way of reproducing this, feel
free to enlist me for testing any new revisions of your patch.

Thanks,
Thierry

* Re: Issue with DRM and "reimplement IDR and IDA using the radix tree"
  2016-12-16 16:16   ` Thierry Reding
@ 2016-12-17  6:47     ` Alexandre Courbot
  0 siblings, 0 replies; 3+ messages in thread
From: Alexandre Courbot @ 2016-12-17  6:47 UTC
  To: Thierry Reding, Matthew Wilcox, Stephen Rothwell, Andrew Morton
  Cc: linux-next, linux-kernel, dri-devel

On 12/17/2016 01:16 AM, Thierry Reding wrote:
> On Wed, Dec 14, 2016 at 11:08:20PM +0900, Alexandre Courbot wrote:
>> Forgot to add the most relevant list for this issue (linux-next).
>>
>> Stephen, maybe you will want to temporarily revert this patch until this
>> is cleared? This probably affects other users than DRM.
>>
>> On 12/13/2016 04:14 PM, Alexandre Courbot wrote:
>>> Hi Matthew,
>>>
>>> Trying the latest -next on the Jetson TK1 board (with two different DRM
>>> devices for display and render), I noticed that the GPU device probe
>>> always failed with error -ENOSPC. After investigating I figured out that
>>> this was due to the minor device allocation failing when a second DRM
>>> device is added.
>>>
>>> More precisely, when drm_minor_alloc() is called with DRM_MINOR_PRIMARY
>>> (0) as argument for a second time, the call to idr_alloc() (which has a
>>> requested range of 0..64) fails instead of returning 1 as expected. Note
>>> that the first call is successful.
>>>
>>> Reverting "reimplement IDR and IDA using the radix tree" on 20161213's
>>> next fixes the issue for me, suggesting a bug may have slipped in there.
>>>
>>> Not sure how this could be fixed, so reporting the issue for now in case
>>> it is not known yet.
> 
> I can confirm Alex' findings, though the symptoms seem to be slightly
> different, which may be related to me testing on next-20161216 rather
> than next-20161213.
> 
> What I'm seeing is that all drivers get probed correctly, but when an
> application tries to open the DRM device files (/dev/dri/card0 in this
> case), then all devices of a given minor type disappear. So in my case
> upon boot I get this:
> 
> 	# ls -l /dev/dri/
> 	total 0
> 	crw-rw---- 1 root video 226,   0 Dec 16 15:59 card0
> 	crw-rw---- 1 root video 226,   1 Dec 16 15:59 card1
> 	crw-rw---- 1 root video 226, 128 Dec 16 15:59 renderD128
> 
> The modetest program from libdrm is then unable to open any devices:
> 
> 	# modetest
> 	trying to open device 'i915'...failed
> 	trying to open device 'amdgpu'...failed
> 	trying to open device 'radeon'...failed
> 	trying to open device 'nouveau'...failed
> 	trying to open device 'vmwgfx'...failed
> 	trying to open device 'omapdrm'...failed
> 	trying to open device 'exynos'...failed
> 	trying to open device 'tilcdc'...failed
> 	trying to open device 'msm'...failed
> 	trying to open device 'sti'...failed
> 	trying to open device 'tegra'...failed
> 	trying to open device 'imx-drm'...failed
> 	trying to open device 'rockchip'...failed
> 	trying to open device 'atmel-hlcdc'...failed
> 	trying to open device 'fsl-dcu-drm'...failed
> 	trying to open device 'vc4'...failed
> 	trying to open device 'virtio_gpu'...failed
> 	trying to open device 'mediatek'...failed
> 	no device found
> 
> And after that all of the primary minors are gone:
> 
> 	# ls -l /dev/dri/
> 	total 0
> 	crw-rw---- 1 root video 226, 128 Dec 16 15:59 renderD128

That's exactly what I am also getting with 20161216. As it turns out the
patch has changed slightly (my revert did not apply after a rebase), and
the symptoms changed against 20161215, but the fix is the same:
reverting gives me back a working system.

This patch really should be reverted for now. Like Thierry I am
available to test further iterations.
