* Device loses its IRQ number on driver unload?
@ 2015-03-09 10:04 Thomas Hellstrom
  2015-03-09 15:22 ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2015-03-09 10:04 UTC (permalink / raw)
  To: dri-devel

Hi,

I'm not sure whether this started with 4.0, but when I rmmod the device
driver like so

rmmod vmwgfx

the device loses its IRQ line, as shown in lspci:
   Flags: bus master, medium devsel, latency 64 <irq missing here>

and a subsequent modprobe fails since pdev->irq is 0.

Is anyone else seeing this with other drivers?

Thanks,
Thomas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: Device loses its IRQ number on driver unload?
  2015-03-09 10:04 Device loses its IRQ number on driver unload? Thomas Hellstrom
@ 2015-03-09 15:22 ` Daniel Vetter
  2015-03-09 16:02   ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2015-03-09 15:22 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
> Hi,
> 
> I'm not sure this started with 4.0 but when I rmmod the device driver
> like so
> rmmod vmwgfx
> 
> The device loses its IRQ line as shown in lscpi:
>    Flags: bus master, medium devsel, latency 64 <irq missing here>
> 
> and a subsequent modprobe will fail since pdev->irq is 0.
> 
> Is anyone else seeing this with other drivers?

I've occasionally seen (over the past couple of kernels) random zeros in
pdev, but dismissed them as broken machines or bugs in i915 (we have
them ...). Usually the box died chasing a NULL pointer from pdev.
Otherwise no.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: Device loses its IRQ number on driver unload?
  2015-03-09 15:22 ` Daniel Vetter
@ 2015-03-09 16:02   ` Thomas Hellstrom
  2015-03-09 20:25     ` Dave Airlie
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2015-03-09 16:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

On 03/09/2015 04:22 PM, Daniel Vetter wrote:
> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>> Hi,
>>
>> I'm not sure this started with 4.0 but when I rmmod the device driver
>> like so
>> rmmod vmwgfx
>>
>> The device loses its IRQ line as shown in lscpi:
>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>
>> and a subsequent modprobe will fail since pdev->irq is 0.
>>
>> Is anyone else seeing this with other drivers?
> I seen occasionally (over the past couple of kernels) random zeros in pdev
> but dismissed it as broken machines or bugs in i915 (we have them ...).
> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
> -Daniel
OK. Thanks for the info. Since in my case this is 100% reproducible I
guess I have an excellent opportunity to bisect the problem :-/

/Thomas


* Re: Device loses its IRQ number on driver unload?
  2015-03-09 16:02   ` Thomas Hellstrom
@ 2015-03-09 20:25     ` Dave Airlie
  2015-03-10 12:55       ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2015-03-09 20:25 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>> Hi,
>>>
>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>> like so
>>> rmmod vmwgfx
>>>
>>> The device loses its IRQ line as shown in lscpi:
>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>
>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>
>>> Is anyone else seeing this with other drivers?
>> I seen occasionally (over the past couple of kernels) random zeros in pdev
>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>> -Daniel
> OK. Thanks for the info. Since in my case this is 100% reproducible I
> guess I have an excellent opportunity to bisect the problem :-/
>

Does lspci -H1, or some similar option for direct hardware access, show
it?

That would tell us whether it's the kernel copy or the hw register
getting messed up.

Dave.
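
The kernel-copy-versus-hardware-register question above can also be checked
without lspci. The following is a minimal sketch, not something posted in
this thread. It assumes a Linux sysfs layout in which
/sys/bus/pci/devices/<address>/irq reports the kernel's IRQ number for the
device and the config file exposes raw config space. The Interrupt Line
register sits at offset 0x3C of the standard PCI header. The helper names
(interrupt_line, describe, read_device) are hypothetical, and the device
address must be supplied by the user.

```python
import sys
from pathlib import Path

# Offset of the Interrupt Line register in the standard PCI config header.
PCI_INTERRUPT_LINE = 0x3C


def interrupt_line(config: bytes) -> int:
    """Return the Interrupt Line byte from a raw config-space dump."""
    if len(config) <= PCI_INTERRUPT_LINE:
        raise ValueError("config space dump too short")
    return config[PCI_INTERRUPT_LINE]


def describe(kernel_irq: int, hw_line: int) -> str:
    """Summarize how the kernel's IRQ number relates to the register value."""
    # 0 and 0xFF are treated here as "no interrupt line programmed".
    if kernel_irq == 0 and hw_line not in (0, 0xFF):
        return "kernel copy cleared, register intact"
    if kernel_irq == hw_line:
        return "kernel copy matches register"
    # A mismatch is not necessarily an error: the kernel may remap IRQs.
    return "kernel copy and register differ"


def read_device(address: str, root: Path = Path("/sys/bus/pci/devices")):
    """Read (kernel IRQ, Interrupt Line register) for one PCI device."""
    dev = root / address
    kernel_irq = int((dev / "irq").read_text().strip())
    hw_line = interrupt_line((dev / "config").read_bytes())
    return kernel_irq, hw_line


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: irqcheck.py <domain:bus:device.function>")
    irq, line = read_device(sys.argv[1])
    print(f"kernel irq={irq} interrupt line={line}: {describe(irq, line)}")
```

Running it before and after rmmod would show whether only the kernel's
value changes. A differing pair is expected on systems where the kernel
routes the device to an IRQ other than the one programmed in the register.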

* Re: Device loses its IRQ number on driver unload?
  2015-03-09 20:25     ` Dave Airlie
@ 2015-03-10 12:55       ` Thomas Hellstrom
  2015-03-10 14:01         ` Alex Deucher
  2015-03-10 21:05         ` Dave Airlie
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Hellstrom @ 2015-03-10 12:55 UTC (permalink / raw)
  To: Dave Airlie; +Cc: linux-graphics-maintainer, dri-devel

On 03/09/2015 09:25 PM, Dave Airlie wrote:
> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>> Hi,
>>>>
>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>> like so
>>>> rmmod vmwgfx
>>>>
>>>> The device loses its IRQ line as shown in lscpi:
>>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>
>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>
>>>> Is anyone else seeing this with other drivers?
>>> I seen occasionally (over the past couple of kernels) random zeros in pdev
>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>> -Daniel
>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>> guess I have an excellent opportunity to bisect the problem :-/
>>
> does lspci -H1, or some option like to direct access hw show it?
>
> just whether this is the kernel copy or the hw register getting messed up.
>
> Dave.
Hi, Dave,

lspci -H1 indeed shows the IRQ number. It turns out that the commit
introduced in 4.0 breaking this is

b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
commit b4b55cda587442477a3a9f0669e26bba4b7800c0
Author: Jiang Liu <jiang.liu@linux.intel.com>
Date:   Thu Feb 5 13:44:47 2015 +0800

    x86/PCI: Refine the way to release PCI IRQ resources


It's obvious from the commit message that unloading the driver *should*
drop the irq resource, but it's not obvious what reallocates that
resource on driver load...

Anyway, it turns out that adding a pci_disable_device(pdev) call in the
pci driver's remove() method (vmw_remove() in my case) appears to fix
the problem: the device irq is removed on driver unload and enabled
again on driver load. There appears to be no pci_disable_device() on
driver exit in core drm.

However, it still beats me why other drm drivers aren't seeing this, and
IMHO that commit should probably add a warning message if the pci device
isn't disabled on pci driver unload...

/Thomas


* Re: Device loses its IRQ number on driver unload?
  2015-03-10 12:55       ` Thomas Hellstrom
@ 2015-03-10 14:01         ` Alex Deucher
  2015-03-10 21:05         ` Dave Airlie
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Deucher @ 2015-03-10 14:01 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: linux-graphics-maintainer, dri-devel

On Tue, Mar 10, 2015 at 8:55 AM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>> Hi,
>>>>>
>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>> like so
>>>>> rmmod vmwgfx
>>>>>
>>>>> The device loses its IRQ line as shown in lscpi:
>>>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>
>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>
>>>>> Is anyone else seeing this with other drivers?
>>>> I seen occasionally (over the past couple of kernels) random zeros in pdev
>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>> -Daniel
>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>> guess I have an excellent opportunity to bisect the problem :-/
>>>
>> does lspci -H1, or some option like to direct access hw show it?
>>
>> just whether this is the kernel copy or the hw register getting messed up.
>>
>> Dave.
> Hi, Dave,
>
> lspci -H1 indeed shows the IRQ number. It turns out that the commit
> introduced in 4.0 breaking this is
>
> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
> Author: Jiang Liu <jiang.liu@linux.intel.com>
> Date:   Thu Feb 5 13:44:47 2015 +0800
>
>     x86/PCI: Refine the way to release PCI IRQ resources
>
>
> It's obvious from the commit message that unloading the driver *should*
> drop the irq resource but its not
> obvious what's reallocating that resource on driver load...
>
> Anyway, it turns out that adding a
> pci_disable_device(pdev) in the pci driver's remove() method
> (vmw_remove() in my case) appears to fix the problem:
> The device irq is removed on driver unload and enabled again on driver
> load There appears to be no pci_disable_device() on driver exit in core drm.
>
> However it still beats me why other drm drivers aren't seeing this, and
> IMHO that commit should probably add a warning message if the pci device
> isn't disabled on pci driver unload......

They are probably broken as well.  I don't think module unload and
reload is commonly done with most drivers.  FWIW, the drm core also
does not register a pci shutdown callback so when you use kexec,
nothing in the driver gets torn down properly.

Alex

* Re: Device loses its IRQ number on driver unload?
  2015-03-10 12:55       ` Thomas Hellstrom
  2015-03-10 14:01         ` Alex Deucher
@ 2015-03-10 21:05         ` Dave Airlie
  2015-03-11  6:40           ` Thomas Hellstrom
  1 sibling, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2015-03-10 21:05 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: linux-graphics-maintainer, dri-devel

On 10 March 2015 at 22:55, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>> Hi,
>>>>>
>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>> like so
>>>>> rmmod vmwgfx
>>>>>
>>>>> The device loses its IRQ line as shown in lscpi:
>>>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>
>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>
>>>>> Is anyone else seeing this with other drivers?
>>>> I seen occasionally (over the past couple of kernels) random zeros in pdev
>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>> -Daniel
>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>> guess I have an excellent opportunity to bisect the problem :-/
>>>
>> does lspci -H1, or some option like to direct access hw show it?
>>
>> just whether this is the kernel copy or the hw register getting messed up.
>>
>> Dave.
> Hi, Dave,
>
> lspci -H1 indeed shows the IRQ number. It turns out that the commit
> introduced in 4.0 breaking this is
>
> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
> Author: Jiang Liu <jiang.liu@linux.intel.com>
> Date:   Thu Feb 5 13:44:47 2015 +0800
>
>     x86/PCI: Refine the way to release PCI IRQ resources
>
>
> It's obvious from the commit message that unloading the driver *should*
> drop the irq resource but its not
> obvious what's reallocating that resource on driver load...
>
> Anyway, it turns out that adding a
> pci_disable_device(pdev) in the pci driver's remove() method
> (vmw_remove() in my case) appears to fix the problem:
> The device irq is removed on driver unload and enabled again on driver
> load There appears to be no pci_disable_device() on driver exit in core drm.

Yes, that is because at one time, pre-KMS, if you pci-disabled the VGA
device, bad things would happen.

I think with modesetting drivers it shouldn't be a problem anymore.

Dave.

* Re: Device loses its IRQ number on driver unload?
  2015-03-10 21:05         ` Dave Airlie
@ 2015-03-11  6:40           ` Thomas Hellstrom
  2015-03-11  7:22             ` Dave Airlie
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2015-03-11  6:40 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Thomas Hellstrom, linux-graphics-maintainer, dri-devel

On 03/10/2015 10:05 PM, Dave Airlie wrote:
> On 10 March 2015 at 22:55, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>>> like so
>>>>>> rmmod vmwgfx
>>>>>>
>>>>>> The device loses its IRQ line as shown in lscpi:
>>>>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>>
>>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>>
>>>>>> Is anyone else seeing this with other drivers?
>>>>> I seen occasionally (over the past couple of kernels) random zeros in pdev
>>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>>> -Daniel
>>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>>> guess I have an excellent opportunity to bisect the problem :-/
>>>>
>>> does lspci -H1, or some option like to direct access hw show it?
>>>
>>> just whether this is the kernel copy or the hw register getting messed up.
>>>
>>> Dave.
>> Hi, Dave,
>>
>> lspci -H1 indeed shows the IRQ number. It turns out that the commit
>> introduced in 4.0 breaking this is
>>
>> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
>> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
>> Author: Jiang Liu <jiang.liu@linux.intel.com>
>> Date:   Thu Feb 5 13:44:47 2015 +0800
>>
>>     x86/PCI: Refine the way to release PCI IRQ resources
>>
>>
>> It's obvious from the commit message that unloading the driver *should*
>> drop the irq resource but its not
>> obvious what's reallocating that resource on driver load...
>>
>> Anyway, it turns out that adding a
>> pci_disable_device(pdev) in the pci driver's remove() method
>> (vmw_remove() in my case) appears to fix the problem:
>> The device irq is removed on driver unload and enabled again on driver
>> load There appears to be no pci_disable_device() on driver exit in core drm.
> Yes that is because at one time pre kms if you pci disabled the VGA device,
> bad things would happen.
>
> I think with modesetting driver it shouldn't be a problem anymore.
>
> Dave.

So what's the preferred remedy here? Should I file a bug against the
above commit, or should we go ahead and modify the DRM drivers?

Thanks,
Thomas




* Re: Device loses its IRQ number on driver unload?
  2015-03-11  6:40           ` Thomas Hellstrom
@ 2015-03-11  7:22             ` Dave Airlie
  2015-03-11  9:28               ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2015-03-11  7:22 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: Thomas Hellstrom, linux-graphics-maintainer, dri-devel

>>
>> I think with modesetting driver it shouldn't be a problem anymore.
>>
>> Dave.
>
> So what's the preferred remedy here? should I file a bug against the
> above commit or should we go ahead modifying
> the DRM drivers?

I'd file against that first, and maybe see why it clears the value.

Dave.

* Re: Device loses its IRQ number on driver unload?
  2015-03-11  7:22             ` Dave Airlie
@ 2015-03-11  9:28               ` Thomas Hellstrom
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Hellstrom @ 2015-03-11  9:28 UTC (permalink / raw)
  To: Dave Airlie; +Cc: linux-graphics-maintainer, dri-devel

On 03/11/2015 08:22 AM, Dave Airlie wrote:
>>> I think with modesetting driver it shouldn't be a problem anymore.
>>>
>>> Dave.
>> So what's the preferred remedy here? should I file a bug against the
>> above commit or should we go ahead modifying
>> the DRM drivers?
> I'd file against that first, and maybe see why it clears the value.
>
> Dave.

https://bugzilla.kernel.org/show_bug.cgi?id=94721

/Thomas


end of thread, other threads:[~2015-03-11  9:28 UTC | newest]
