* [PATCH] pci: do a msi rearm on init
@ 2017-11-24  2:56 Karol Herbst
From: Karol Herbst @ 2017-11-24  2:56 UTC
  To: nouveau

On my GP107, when I load nouveau after unloading it, for some reason
either the GPU stops sending interrupts or the CPU stops receiving them
if MSI is enabled.

Doing a rearm once before getting any interrupts fixes this.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
---
 drm/nouveau/nvkm/subdev/pci/base.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drm/nouveau/nvkm/subdev/pci/base.c b/drm/nouveau/nvkm/subdev/pci/base.c
index b1b1f362..7ee1fbb4 100644
--- a/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drm/nouveau/nvkm/subdev/pci/base.c
@@ -136,6 +136,10 @@ nvkm_pci_init(struct nvkm_subdev *subdev)
 		return ret;
 
 	pci->irq = pdev->irq;
+	/* workaround: do a rearm once */
+	if (pci->msi)
+		pci->func->msi_rearm(pci);
+
 	return ret;
 }
 
-- 
2.14.3

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


* Re: [PATCH] pci: do a msi rearm on init
@ 2017-11-24 14:02   ` Thierry Reding
From: Thierry Reding @ 2017-11-24 14:02 UTC
  To: Karol Herbst; +Cc: nouveau


On Fri, Nov 24, 2017 at 03:56:26AM +0100, Karol Herbst wrote:
> On my GP107 when I load nouveau after unloading it, for some reason the
> GPU stopped sending or the CPU stopped receiving interrupts if MSI was
> enabled.

I suppose this could happen if the GPU raises an interrupt after the
driver's already called free_irq() on it, and hence the driver can't
rearm itself in the interrupt handler.

This possibly points to a bug somewhere (the GPU should be completely
idle by the time free_irq() is called), but this seems like a valid
thing to do at initialization in any case to avoid relying on the prior
owner of the device to always behave properly.
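
The scenario above can be illustrated with a toy model. This is only a
sketch under a stated assumption: that the device sends one MSI and then
stays silent until software rearms it. The struct and the toy_* functions
are hypothetical and do not correspond to any Nouveau or kernel API.

```c
#include <stdbool.h>

/* Toy model (assumption): sending an MSI disarms the device, and only
 * software can arm it again. */
struct toy_gpu {
	bool msi_armed;      /* device is allowed to send an MSI */
	bool handler_bound;  /* a driver IRQ handler is registered */
	int  delivered;      /* interrupts the driver actually handled */
};

/* The device raises an interrupt. */
static void toy_raise_irq(struct toy_gpu *gpu)
{
	if (!gpu->msi_armed)
		return;                  /* nothing is sent while disarmed */
	gpu->msi_armed = false;          /* one-shot: sending disarms */
	if (gpu->handler_bound) {
		gpu->delivered++;
		gpu->msi_armed = true;   /* the handler rearms */
	}
	/* With no handler bound, nobody rearms: the device stays silent. */
}

/* Driver load, with or without the rearm-on-init workaround. */
static void toy_driver_load(struct toy_gpu *gpu, bool rearm_on_init)
{
	gpu->handler_bound = true;
	if (rearm_on_init)
		gpu->msi_armed = true;
}

/* Driver unload: stands in for free_irq() in this model. */
static void toy_driver_unload(struct toy_gpu *gpu)
{
	gpu->handler_bound = false;
}
```

In this model, an interrupt raised between toy_driver_unload() and the
next toy_driver_load() leaves msi_armed false, so a reload without the
rearm never sees another interrupt, while a reload with rearm_on_init
set recovers.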

> Doing a rearm once before getting any interrupts fixes this.
> 
> Signed-off-by: Karol Herbst <kherbst@redhat.com>
> ---
>  drm/nouveau/nvkm/subdev/pci/base.c | 4 ++++
>  1 file changed, 4 insertions(+)

Reviewed-by: Thierry Reding <treding@nvidia.com>


* Re: [PATCH] pci: do a msi rearm on init
  2017-11-24 14:02   ` Thierry Reding
@ 2017-11-24 14:08     ` Karol Herbst
From: Karol Herbst @ 2017-11-24 14:08 UTC
  To: Thierry Reding; +Cc: nouveau

On Fri, Nov 24, 2017 at 3:02 PM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Fri, Nov 24, 2017 at 03:56:26AM +0100, Karol Herbst wrote:
>> On my GP107 when I load nouveau after unloading it, for some reason the
>> GPU stopped sending or the CPU stopped receiving interrupts if MSI was
>> enabled.
>
> I suppose this could happen if the GPU raises an interrupt after the
> driver's already called free_irq() on it, and hence the driver can't
> rearm itself in the interrupt handler.
>
> This possibly points to a bug somewhere (the GPU should be completely
> idle by the time free_irq() is called), but this seems like a valid
> thing to do at initialization in any case to avoid relying on the prior
> owner of the device to always behave properly.
>

Yeah, this makes sense. What I am wondering is why this isn't a bigger
problem. Maybe it is caused by the changes to the Pascal interrupt
handler, which would make it a Pascal-only problem?
Anyway, the Nvidia driver seems to do a rearm once at load time as well,
so I was quite sure we could simply do the same and be able to use the
GPU from any state.

>> Doing a rearm once before getting any interrupts fixes this.
>>
>> Signed-off-by: Karol Herbst <kherbst@redhat.com>
>> ---
>>  drm/nouveau/nvkm/subdev/pci/base.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>
> Reviewed-by: Thierry Reding <treding@nvidia.com>

* Re: [PATCH] pci: do a msi rearm on init
@ 2017-11-24 14:23         ` Thierry Reding
From: Thierry Reding @ 2017-11-24 14:23 UTC
  To: Karol Herbst; +Cc: nouveau


On Fri, Nov 24, 2017 at 03:08:25PM +0100, Karol Herbst wrote:
> On Fri, Nov 24, 2017 at 3:02 PM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
> > On Fri, Nov 24, 2017 at 03:56:26AM +0100, Karol Herbst wrote:
> >> On my GP107 when I load nouveau after unloading it, for some reason the
> >> GPU stopped sending or the CPU stopped receiving interrupts if MSI was
> >> enabled.
> >
> > I suppose this could happen if the GPU raises an interrupt after the
> > driver's already called free_irq() on it, and hence the driver can't
> > rearm itself in the interrupt handler.
> >
> > This possibly points to a bug somewhere (the GPU should be completely
> > idle by the time free_irq() is called), but this seems like a valid
> > thing to do at initialization in any case to avoid relying on the prior
> > owner of the device to always behave properly.
> >
> 
> Yeah, this makes sense. But what I am wondering about is, why this
> isn't a bigger problem or maybe this is just due to those changes in
> the Pascal interrupt handler and this is a Pascal only problem?

Yeah, this could be some kind of race that's only triggering on Pascal.

Comparing with the nvgpu driver it seems like the MSI interrupt should
be rearmed only after all interrupts have been processed, while Nouveau
currently rearms before processing interrupts (though after masking the
interrupts). I'm not very familiar with all of this, but perhaps Pascal
has some interrupts that Nouveau doesn't mask and therefore might race.

Perhaps something like this would help:

--- >8 ---
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index b1b1f3626b96..0b3b802c26df 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -72,10 +72,10 @@ nvkm_pci_intr(int irq, void *arg)
        struct nvkm_device *device = pci->subdev.device;
        bool handled = false;
        nvkm_mc_intr_unarm(device);
-       if (pci->msi)
-               pci->func->msi_rearm(pci);
        nvkm_mc_intr(device, &handled);
        nvkm_mc_intr_rearm(device);
+       if (pci->msi)
+               pci->func->msi_rearm(pci);
        return handled ? IRQ_HANDLED : IRQ_NONE;
 }

--- >8 ---
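
To make the difference between the two orderings explicit, here is a
minimal sketch that only records the order of the steps. Nothing here
touches hardware: the step names are stand-ins for nvkm_mc_intr_unarm(),
msi_rearm(), nvkm_mc_intr() and nvkm_mc_intr_rearm(), and struct trace,
step() and the intr_rearm_* functions are hypothetical helpers made up
for illustration.

```c
#include <stdbool.h>
#include <string.h>

enum { LOG_MAX = 64 };

/* Records the sequence of handler steps as "a;b;c;".
 * Must be zero-initialised before use. */
struct trace {
	char log[LOG_MAX];
};

/* Append "name;" to the trace if it still fits. */
static void step(struct trace *t, const char *name)
{
	size_t used = strlen(t->log);
	size_t need = strlen(name) + 1;      /* name plus ';' */

	if (used + need < LOG_MAX) {
		memcpy(t->log + used, name, need - 1);
		t->log[used + need - 1] = ';';
		t->log[used + need] = '\0';
	}
}

/* Current order: the MSI is rearmed before interrupts are processed. */
static void intr_rearm_early(struct trace *t, bool msi)
{
	step(t, "unarm");
	if (msi)
		step(t, "msi_rearm");
	step(t, "process");
	step(t, "rearm");
}

/* Suggested order: the MSI is rearmed after everything else is done. */
static void intr_rearm_late(struct trace *t, bool msi)
{
	step(t, "unarm");
	step(t, "process");
	step(t, "rearm");
	if (msi)
		step(t, "msi_rearm");
}
```

With msi set, the first variant records unarm, msi_rearm, process, rearm
and the second records unarm, process, rearm, msi_rearm. Whether the
later rearm actually closes a race on Pascal is not something this
sketch can show.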

> Anyway, the Nvidia driver seems to do it once on loading time as well,
> so I was quite sure we could simply do it this way and be sure that we
> are able to use the GPU from any state.

I think it's totally fine to apply as-is and leave it to further
investigation what Nouveau needs to do to properly uninitialize the
device. Like you said it can always happen that somebody else leaves
the GPU in some undefined state, in which case it's good to always
do this at initialization.

Thierry

