* Reported regressions for 4.7 as of Sunday, 2016-06-19
@ 2016-06-19 14:52 Thorsten Leemhuis
2016-06-20 10:21 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Thorsten Leemhuis @ 2016-06-19 14:52 UTC (permalink / raw)
To: Linus Torvalds, Linux Kernel Mailing List
Hi! Here is my second regression report for 4.7. It has 19 entries; 8 of
them are new; 8 regressions were fixed since the last report (those are
not included in this report) and I dropped 2 which turned out to not be
regressions after all (at least that's what I think right now).
FWIW, it's still a lot of work to generate this report (as expected). I'm still
thinking about how to make the whole tracking process easier and more
attractive for everyone, but it will take a few weeks before I come up with a
concrete plan.
HTH, CU, Thorsten
(¹) last week's report was http://article.gmane.org/gmane.linux.kernel/2241805
P.S.: Please let me know if a regression is missing in the list; or if there is
something on the list which shouldn't be there.
----
Description: ath10k no longer authenticates and freezes system
Report: https://bugzilla.kernel.org/show_bug.cgi?id=119151
Latest status: http://thread.gmane.org/gmane.linux.kernel.wireless.general/152513/focus=152535
Date rep/stat: 2016-05-27 / 2016-06-02
Notes: forgotten? poked bug report on Friday
Description: Bad flicker on skylake HQD due to code in the 4.7 merge window
Report: http://thread.gmane.org/gmane.linux.kernel/2230377
Latest status: http://thread.gmane.org/gmane.linux.kernel/2230377/focus=92602
Date rep/stat: 2016-05-30 / 2016-06-18
Notes: investigation ongoing
Description: we noticed reaim.jobs_per_min -49.1% regression
Report: http://thread.gmane.org/gmane.linux.kernel/2231025/
Latest status: http://thread.gmane.org/gmane.linux.kernel/2231025/focus=2233571
Date rep/stat: 2016-05-31 / 2016-06-13
Notes: wip? http://article.gmane.org/gmane.linux.kernel/2241911
Description: NULL pointer dereference with BCM4350 wireless device
Report: https://bugzilla.kernel.org/show_bug.cgi?id=119451
Latest status: 7.6.
Date rep/stat: 2016-06-01 / 2016-06-07
Notes: poked bugzilla, likely fixed in mainline by https://git.kernel.org/torvalds/c/31143e2933
Description: 795ae7a0de: pixz.throughput -9.1% regression
Report: http://thread.gmane.org/gmane.linux.kernel/2233056/
Latest status: http://thread.gmane.org/gmane.linux.kernel/2233056/focus=2238208
Date rep/stat: 2016-06-02 / 2016-06-08
Notes: @regression tracker: poke someone
Description: RadeonSI gets a huge performance dip when used with the nine state tracker
Report: https://bugzilla.kernel.org/show_bug.cgi?id=119631
Latest status: https://bugzilla.kernel.org/show_bug.cgi?id=119631#c12
Date rep/stat: 2016-06-04 / 2016-06-15
Notes: investigation ongoing, waiting for reporter
Description: 5c0a85fad9: unixbench.score -6.3% regression
Report: http://thread.gmane.org/gmane.linux.kernel/2235794
Latest status: http://thread.gmane.org/gmane.linux.kernel.mm/153151/focus=153409
Date rep/stat: 2016-06-06 / 2016-06-17
Notes: wip, revert discussed
Description: System hang possibly due to brcmfmac regression
Report: https://bugzilla.kernel.org/show_bug.cgi?id=119761
Latest status: https://bugzilla.kernel.org/show_bug.cgi?id=119761#c1
Date rep/stat: 2016-06-07 / 2016-06-12
Notes: might be fixed, waiting for clarification from reporter
Description: Regression in kbuild: fix if_change and friends to consider argument
Report: http://thread.gmane.org/gmane.linux.kbuild.devel/14981/
Latest status: http://thread.gmane.org/gmane.linux.kbuild.devel/14981/focus=15000
Date rep/stat: 2016-06-07 / 2016-06-09
Notes: patch in linux-next
Description: BUG: using smp_processor_id() in preemptible [00000000] code when using a USB Mass Storage device
Report: http://thread.gmane.org/gmane.linux.usb.general/143504
Latest status: http://thread.gmane.org/gmane.linux.usb.general/143504/focus=153154 https://lkml.org/lkml/2016/6/15/397
Date rep/stat: 2016-06-09 / 2016-06-15
Notes: investigation ongoing
Description: Notebook Clevo N350DW i5-6500T freezes on shutdown (but reboots fine)
Report: https://bugzilla.kernel.org/show_bug.cgi?id=119871
Latest status: https://bugzilla.kernel.org/show_bug.cgi?id=119871#c8
Date rep/stat: 2016-06-09 / 2016-06-16
Notes: reporter needs help to provide more details to debug the problem
Description: BUG() in dmesg after loading nouveau module
Report: https://bugzilla.kernel.org/show_bug.cgi?id=120591
Latest status: https://bugzilla.kernel.org/show_bug.cgi?id=120591#c3
Date rep/stat: 2016-06-18 / 2016-06-19
Notes: wip
Description: BUG: unable to handle kernel NULL pointer dereference […] qla24xx_process_response_queue+0x49/0x4b0 [qla2xxx]
Report: https://bugzilla.kernel.org/show_bug.cgi?id=120201
Latest status: n/a
Date rep/stat: 2016-06-14 / n/a
Notes: poked bugzilla, a bit unsure how to proceed
Description: Performance drop 30-40% for SPECjbb2005 and SPECjvm2008 benchmarks
Report: https://bugzilla.kernel.org/show_bug.cgi?id=120481
Latest status: n/a
Date rep/stat: 2016-06-16 / n/a
Notes: real reason unknown
Description: performance drop on SFC interface around 30 %
Report: https://bugzilla.kernel.org/show_bug.cgi?id=120461
Latest status: https://bugzilla.kernel.org/show_bug.cgi?id=120461#c9
Date rep/stat: 2016-06-17 / 2016-06-17
Notes: wip
Description: System hang when plug/un-plug USB 3.1 key via thunderbolt port on Dell XPS 13
Report: https://bugzilla.kernel.org/show_bug.cgi?id=120241
Latest status: n/a
Date rep/stat: 2016-06-14 / n/a
Notes: waiting for reporter
Description: performance regression on Jetson TK1 since 4.7-rc1: moving windows under X would become insufferably slow, and graphical performance under X in general is seriously degraded
Report: http://thread.gmane.org/gmane.linux.ports.tegra/26983/focus=2245415
Latest status: n/a
Date rep/stat: 2016-06-16 / n/a
Notes: wip, fix available
Description: lk 4.7 regression: EDAC, amd64_edac: Drop pci_register_driver() use
Report: http://thread.gmane.org/gmane.linux.kernel/2245115/
Latest status: http://thread.gmane.org/gmane.linux.kernel/2246008/focus=2246009
Date rep/stat: 2016-06-15 / 2016-06-16
Notes: wip, fix available
Description: regression in 8250 uart driver
Report: http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Latest status: http://thread.gmane.org/gmane.linux.kernel/2243130/focus=2243653
Date rep/stat: 2016-06-14 / 2016-06-14
Notes: wip, fix available
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-19 14:52 Reported regressions for 4.7 as of Sunday, 2016-06-19 Thorsten Leemhuis
@ 2016-06-20 10:21 ` Christoph Hellwig
2016-06-21 11:11 ` Josh Boyer
2016-06-22 6:36 ` Kalle Valo
2 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-06-20 10:21 UTC (permalink / raw)
To: Thorsten Leemhuis
Cc: Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel,
linux-ext4, xfs
Another important one is the rename regression in XFS and ext4 that
I suspect is due to the VFS changes in 4.7:
http://oss.sgi.com/pipermail/xfs/2016-June/049138.html
http://oss.sgi.com/pipermail/xfs/2016-June/049309.html
possibly related:
http://marc.info/?l=linux-kernel&m=146605889024559&w=2
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-19 14:52 Reported regressions for 4.7 as of Sunday, 2016-06-19 Thorsten Leemhuis
2016-06-20 10:21 ` Christoph Hellwig
@ 2016-06-21 11:11 ` Josh Boyer
2016-06-21 20:40 ` Linus Torvalds
2016-06-22 6:36 ` Kalle Valo
2 siblings, 1 reply; 20+ messages in thread
From: Josh Boyer @ 2016-06-21 11:11 UTC (permalink / raw)
To: Thorsten Leemhuis; +Cc: Linus Torvalds, Linux Kernel Mailing List
On Sun, Jun 19, 2016 at 10:52 AM, Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
> Description: BUG: unable to handle kernel NULL pointer dereference […] qla24xx_process_response_queue+0x49/0x4b0 [qla2xxx]
> Report: https://bugzilla.kernel.org/show_bug.cgi?id=120201
> Latest status: n/a
> Date rep/stat: 2016-06-14 / n/a
> Notes: poked bugzilla, a bit unsure how to proceed
We have two bug reports against 4.5.5 - 4.5.7 of this as well. So
whatever commit caused this in 4.7 seems to have been pulled into the
4.5.y stable tree. I suspect it is in the 4.6.y stable tree as well,
but we don't have that pushed out yet.
https://bugzilla.redhat.com/show_bug.cgi?id=1348342
https://bugzilla.redhat.com/show_bug.cgi?id=1346753
josh
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-21 11:11 ` Josh Boyer
@ 2016-06-21 20:40 ` Linus Torvalds
2016-06-22 0:55 ` Josh Boyer
2016-06-22 1:25 ` Martin K. Petersen
0 siblings, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2016-06-21 20:40 UTC (permalink / raw)
To: Josh Boyer, Martin K. Petersen, Johannes Thumshirn
Cc: Thorsten Leemhuis, Linux Kernel Mailing List
On Tue, Jun 21, 2016 at 4:11 AM, Josh Boyer <jwboyer@fedoraproject.org> wrote:
> On Sun, Jun 19, 2016 at 10:52 AM, Thorsten Leemhuis
> <regressions@leemhuis.info> wrote:
>> Description: BUG: unable to handle kernel NULL pointer dereference […] qla24xx_process_response_queue+0x49/0x4b0 [qla2xxx]
>> Report: https://bugzilla.kernel.org/show_bug.cgi?id=120201
>> Latest status: n/a
>> Date rep/stat: 2016-06-14 / n/a
>> Notes: poked bugzilla, a bit unsure how to proceed
>
> We have two bug reports against 4.5.5 - 4.5.7 of this as well. So
> whatever commit caused this in 4.7 seems to have been pulled into the
> 4.5.y stable tree. I suspect it is in the 4.6.y stable tree as well,
> but we don't have that pushed out yet.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
> https://bugzilla.redhat.com/show_bug.cgi?id=1346753
That seems pretty unambiguous - 4.5.5 is fine, and 4.5.6 is bad. So
unless it's specific to whatever patches RH is carrying around, we
should be able to just look at the scsi-related stable tree patches in
that region. That seems simple enough.
But there are really only two (trivial) patches in there:
- scsi: Add intermediate STARGET_REMOVE state to scsi_target_state
(f05795d3d771f30a7bdc3a138bf714b06d42aa95 upstream)
- Revert "scsi: fix soft lockup in scsi_remove_target() on module removal"
(305c2e71b3d733ec065cb716c76af7d554bd5571 upstream)
as far as I can tell. And neither of them looks very likely, but what
do I know. Adding Martin Petersen and Johannes Thumshirn to the
participants just in case they go "Ahh.."
Linus
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-21 20:40 ` Linus Torvalds
@ 2016-06-22 0:55 ` Josh Boyer
2016-06-22 1:25 ` Martin K. Petersen
1 sibling, 0 replies; 20+ messages in thread
From: Josh Boyer @ 2016-06-22 0:55 UTC (permalink / raw)
To: Linus Torvalds
Cc: Martin K. Petersen, Johannes Thumshirn, Thorsten Leemhuis,
Linux Kernel Mailing List
On Tue, Jun 21, 2016 at 4:40 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Jun 21, 2016 at 4:11 AM, Josh Boyer <jwboyer@fedoraproject.org> wrote:
>> On Sun, Jun 19, 2016 at 10:52 AM, Thorsten Leemhuis
>> <regressions@leemhuis.info> wrote:
>>> Description: BUG: unable to handle kernel NULL pointer dereference […] qla24xx_process_response_queue+0x49/0x4b0 [qla2xxx]
>>> Report: https://bugzilla.kernel.org/show_bug.cgi?id=120201
>>> Latest status: n/a
>>> Date rep/stat: 2016-06-14 / n/a
>>> Notes: poked bugzilla, a bit unsure how to proceed
>>
>> We have two bug reports against 4.5.5 - 4.5.7 of this as well. So
>> whatever commit caused this in 4.7 seems to have been pulled into the
>> 4.5.y stable tree. I suspect it is in the 4.6.y stable tree as well,
>> but we don't have that pushed out yet.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
>> https://bugzilla.redhat.com/show_bug.cgi?id=1346753
>
> That seems pretty unambiguous - 4.5.5 is fine, and 4.5.6 is bad. So
> unless it's specific to whatever patches RH is carrying around, we
> should be able to just look at the scsi-related stable tree patches in
> that region. That seems simple enough.
I thought the same. We're only carrying one very, very old scsi patch
to revalidate a pointer. That shouldn't even be involved in this
path, and upstream 4.7-rcX is hitting the same issue anyway. Thus far
we've only seen reports for qla2xxx devices as far as I'm aware.
> But theres' really only two (trivial) patches in there:
>
> - scsi: Add intermediate STARGET_REMOVE state to scsi_target_state
> (f05795d3d771f30a7bdc3a138bf714b06d42aa95 upstream)
>
> - Revert "scsi: fix soft lockup in scsi_remove_target() on module removal"
> (305c2e71b3d733ec065cb716c76af7d554bd5571 upstream)
>
> as far as I can tell. And neither of them looks very likely, but what
> do I know. Adding Martin Petersen and Johannes Thumshirn to the
> participants just in case they go "Ahh.."
Right, I had the same head scratching.
josh
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-21 20:40 ` Linus Torvalds
2016-06-22 0:55 ` Josh Boyer
@ 2016-06-22 1:25 ` Martin K. Petersen
2016-06-22 1:29 ` Quinn Tran
2016-06-22 11:51 ` Johannes Thumshirn
1 sibling, 2 replies; 20+ messages in thread
From: Martin K. Petersen @ 2016-06-22 1:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Boyer, Martin K. Petersen, Johannes Thumshirn,
Thorsten Leemhuis, Linux Kernel Mailing List, Quinn Tran
>>>>> "Linus" == Linus Torvalds <torvalds@linux-foundation.org> writes:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
This first one appears to be a crash in a USB sound doodad and not
qla2xxx. Also, this appears to be where the 4.5.5 -> 4.5.6 notion comes
from. So we can probably ignore 4.5.5 as the last good revision.
Linus> as far as I can tell. And neither of them looks very likely, but
Linus> what do I know. Adding Martin Petersen and Johannes Thumshirn to
Linus> the participants just in case they go "Ahh.."
Doubt it's Johannes' tweak. The qla2xxx crash from the two other
bugzilla entries is in:
(gdb) list *qla24xx_process_response_queue+0x49
0x27e09 is in qla24xx_process_response_queue (drivers/scsi/qla2xxx/qla_isr.c:2560).
2555		if (rsp->msix->cpuid != smp_processor_id()) {
2556			/* if kernel does not notify qla of IRQ's CPU change,
2557			 * then set it here.
2558			 */
2559			rsp->msix->cpuid = smp_processor_id();
2560			ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
2561		}
2562
2563		while (rsp->ring_ptr->signature != RESPONSE_PROCESSED) {
2564			pkt = (struct sts_entry_24xx *)rsp->ring_ptr;
That particular code went into 4.5 and comes from:
commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
Author: Quinn Tran <quinn.tran@qlogic.com>
Date: Thu Dec 17 14:57:05 2015 -0500
qla2xxx: Add irq affinity notification
Register to receive notification of when irq setting change
occured.
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Quinn?
--
Martin K. Petersen Oracle Linux Engineering
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-22 1:25 ` Martin K. Petersen
@ 2016-06-22 1:29 ` Quinn Tran
2016-06-22 11:51 ` Johannes Thumshirn
1 sibling, 0 replies; 20+ messages in thread
From: Quinn Tran @ 2016-06-22 1:29 UTC (permalink / raw)
To: Martin K. Petersen, Linus Torvalds
Cc: Josh Boyer, Johannes Thumshirn, Thorsten Leemhuis, linux-kernel
Investigating.
Regards,
Quinn Tran
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-19 14:52 Reported regressions for 4.7 as of Sunday, 2016-06-19 Thorsten Leemhuis
2016-06-20 10:21 ` Christoph Hellwig
2016-06-21 11:11 ` Josh Boyer
@ 2016-06-22 6:36 ` Kalle Valo
2 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2016-06-22 6:36 UTC (permalink / raw)
To: Thorsten Leemhuis; +Cc: Linus Torvalds, Linux Kernel Mailing List
Thorsten Leemhuis <regressions@leemhuis.info> writes:
> Description: ath10k no longer authenticates and freezes system
> Report: https://bugzilla.kernel.org/show_bug.cgi?id=119151
> Latest status: http://thread.gmane.org/gmane.linux.kernel.wireless.general/152513/focus=152535
> Date rep/stat: 2016-05-27 / 2016-06-02
> Notes: forgotten? poked bug report on Friday
Here's the fix:
ath10k: fix deadlock while processing rx_in_ord_ind
https://git.kernel.org/cgit/linux/kernel/git/kvalo/wireless-drivers.git/commit/?id=e50525bef593c3dd0564df676c567d77f7c20322
--
Kalle Valo
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-22 1:25 ` Martin K. Petersen
2016-06-22 1:29 ` Quinn Tran
@ 2016-06-22 11:51 ` Johannes Thumshirn
2016-06-22 15:57 ` Quinn Tran
1 sibling, 1 reply; 20+ messages in thread
From: Johannes Thumshirn @ 2016-06-22 11:51 UTC (permalink / raw)
To: Martin K. Petersen
Cc: Linus Torvalds, Josh Boyer, Thorsten Leemhuis,
Linux Kernel Mailing List, Quinn Tran
On Tue, Jun 21, 2016 at 09:25:18PM -0400, Martin K. Petersen wrote:
> >>>>> "Linus" == Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
>
> This first one appears to be a crash in a USB sound doodad and not
> qla2xxx. Also, this appears to be where the 4.5.5 -> 4.5.6 notion comes
> from. So we can probably ignore 4.5.5 as the last good revision.
>
> Linus> as far as I can tell. And neither of them looks very likely, but
> Linus> what do I know. Adding Martin Petersen and Johannes Thumshirn to
> Linus> the participants just in case they go "Ahh.."
>
> Doubt it's Johannes' tweak. The qla2xxx crash from the two other
> bugzilla entries is in:
>
> (gdb) list *qla24xx_process_response_queue+0x49
> 0x27e09 is in qla24xx_process_response_queue (drivers/scsi/qla2xxx/qla_isr.c:2560).
> 2555 if (rsp->msix->cpuid != smp_processor_id()) {
> 2556 /* if kernel does not notify qla of IRQ's CPU change,
> 2557 * then set it here.
> 2558 */
> 2559 rsp->msix->cpuid = smp_processor_id();
> 2560 ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
> 2561 }
> 2562
> 2563 while (rsp->ring_ptr->signature != RESPONSE_PROCESSED) {
> 2564 pkt = (struct sts_entry_24xx *)rsp->ring_ptr;
>
> That particular code went into 4.5 and comes from:
>
> commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
> Author: Quinn Tran <quinn.tran@qlogic.com>
> Date: Thu Dec 17 14:57:05 2015 -0500
>
> qla2xxx: Add irq affinity notification
>
> Register to receive notification of when irq setting change
> occured.
>
> Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
>
> Quinn?
>
> --
> Martin K. Petersen Oracle Linux Engineering
Having had a quick look at it, I _think_ this could be the problem.
We request the IRQ _before_ actually assigning the rsp->msix entry. If an
IRQ triggers before the assignment, we touch rsp->msix->cpuid while rsp->msix
is still NULL, which is probably what happens here, at least from what I
gather from Martin's mail.
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 5649c20..20743a3 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3086,6 +3086,8 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 	/* Enable MSI-X vectors for the base queue */
 	for (i = 0; i < 2; i++) {
 		qentry = &ha->msix_entries[i];
+		qentry->rsp = rsp;
+		rsp->msix = qentry;
 		if (IS_P3P_TYPE(ha))
 			ret = request_irq(qentry->vector,
 				qla82xx_msix_entries[i].handler,
@@ -3097,8 +3099,6 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 		if (ret)
 			goto msix_register_fail;
 		qentry->have_irq = 1;
-		qentry->rsp = rsp;
-		rsp->msix = qentry;
 
 		/* Register for CPU affinity notification. */
 		irq_set_affinity_notifier(qentry->vector, &qentry->irq_notify);
@@ -3119,12 +3119,12 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 	 */
 	if (QLA_TGT_MODE_ENABLED() && IS_ATIO_MSIX_CAPABLE(ha)) {
 		qentry = &ha->msix_entries[ATIO_VECTOR];
+		qentry->rsp = rsp;
+		rsp->msix = qentry;
 		ret = request_irq(qentry->vector,
 			qla83xx_msix_entries[ATIO_VECTOR].handler,
 			0, qla83xx_msix_entries[ATIO_VECTOR].name, rsp);
 		qentry->have_irq = 1;
-		qentry->rsp = rsp;
-		rsp->msix = qentry;
 	}
 
 msix_register_fail:
I'm not sure if we need to move the qentry->have_irq assignment as well; I'm
not deep enough into the qla2xxx driver yet. Maybe Quinn can clarify.
Beware: the above change is untested.
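
To make the suspected ordering problem easier to see outside the driver, here
is a minimal userspace sketch. Everything in it is hypothetical: the demo_*
names are stand-ins and not the real qla2xxx types, and demo_request_irq() is
a mock that models the worst case by running the handler the moment it is
registered. The mock handler only reports whether rsp->msix was already set;
the real handler would dereference it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the qla2xxx types). */
struct demo_msix { int cpuid; };
struct demo_rsp  { struct demo_msix *msix; };

typedef int (*demo_handler)(struct demo_rsp *rsp);

/* Mock handler: returns 1 if rsp->msix was already published, 0 if it is
 * still NULL. The real handler would dereference the pointer instead. */
int demo_irq_handler(struct demo_rsp *rsp)
{
	return rsp->msix != NULL;
}

/* Mock of request_irq() modelling the worst case: the interrupt fires as
 * soon as the handler is registered. */
int demo_request_irq(demo_handler handler, struct demo_rsp *rsp)
{
	return handler(rsp);
}

/* Ordering before the patch: register first, publish rsp->msix afterwards. */
int demo_setup_register_first(struct demo_rsp *rsp, struct demo_msix *qentry)
{
	int handler_saw_msix;

	assert(rsp != NULL && qentry != NULL);
	handler_saw_msix = demo_request_irq(demo_irq_handler, rsp);
	rsp->msix = qentry;
	return handler_saw_msix;
}

/* Ordering proposed in the patch: publish rsp->msix, then register. */
int demo_setup_publish_first(struct demo_rsp *rsp, struct demo_msix *qentry)
{
	assert(rsp != NULL && qentry != NULL);
	rsp->msix = qentry;
	return demo_request_irq(demo_irq_handler, rsp);
}
```

With a fresh demo_rsp, demo_setup_register_first() returns 0 (the handler ran
with a NULL msix pointer) while demo_setup_publish_first() returns 1. Whether
the hardware can really raise an interrupt that early is a separate question.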
Byte,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-22 11:51 ` Johannes Thumshirn
@ 2016-06-22 15:57 ` Quinn Tran
2016-06-23 7:22 ` Johannes Thumshirn
2016-07-05 16:30 ` Josh Boyer
0 siblings, 2 replies; 20+ messages in thread
From: Quinn Tran @ 2016-06-22 15:57 UTC (permalink / raw)
To: Johannes Thumshirn, Martin K. Petersen
Cc: Linus Torvalds, Josh Boyer, Thorsten Leemhuis, linux-kernel
Johannes, Martin,
Based on the screenshot/call trace, it looks like this adapter is not using MSI-X; it fell back to MSI or INTx interrupts. The code assumes that MSI-X is available. Without MSI-X there is no point in going through that code segment.
Can you try this workaround? It’s untested. Thanks.
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 5649c20..e033ecb 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2548,7 +2548,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
 	if (!vha->flags.online)
 		return;
 
-	if (rsp->msix->cpuid != smp_processor_id()) {
+	if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
 		/* if kernel does not notify qla of IRQ's CPU change,
 		 * then set it here.
 		 */
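
For readers who want to see the effect of the added guard in isolation, here
is a small userspace sketch. The guard_* names are hypothetical stand-ins, not
the driver's types, and current_cpu is passed in as a parameter because
smp_processor_id() does not exist outside the kernel. It only illustrates the
NULL check; it says nothing about whether the guard is sufficient.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; msix stays NULL when the adapter uses MSI or INTx. */
struct guard_msix { int cpuid; };
struct guard_rsp  { struct guard_msix *msix; };

/* Mirrors the shape of the guarded check in the workaround.
 * Returns 1 if the cached CPU id was updated, 0 if nothing was done. */
int guard_sync_cpuid(struct guard_rsp *rsp, int current_cpu)
{
	assert(rsp != NULL);
	if (rsp->msix && (rsp->msix->cpuid != current_cpu)) {
		rsp->msix->cpuid = current_cpu;
		return 1;
	}
	return 0;
}
```

A queue without an MSI-X entry is skipped instead of being dereferenced, and a
queue with one gets its cached CPU id refreshed only when it differs.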
Regards,
Quinn Tran
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-22 15:57 ` Quinn Tran
@ 2016-06-23 7:22 ` Johannes Thumshirn
2016-06-23 16:13 ` Quinn Tran
2016-07-05 16:30 ` Josh Boyer
1 sibling, 1 reply; 20+ messages in thread
From: Johannes Thumshirn @ 2016-06-23 7:22 UTC (permalink / raw)
To: Quinn Tran
Cc: Martin K. Petersen, Linus Torvalds, Josh Boyer,
Thorsten Leemhuis, linux-kernel, linux-scsi
[+ Cc linux-scsi@vger.kernel.org ]
On Wed, Jun 22, 2016 at 03:57:35PM +0000, Quinn Tran wrote:
> Johannes, Martin,
>
> Based on the screen shot/call trace, it looks like this adapter is not using MSIX. It defaulted back to MSI or INTx interrupt. The code made an assumption of MSIX is available. There is no point in go through that code segment.
>
> Can you try this work around? It’s untested. Thanks.
>
>
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> index 5649c20..e033ecb 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -2548,7 +2548,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
> if (!vha->flags.online)
> return;
>
> - if (rsp->msix->cpuid != smp_processor_id()) {
> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
> /* if kernel does not notify qla of IRQ's CPU change,
> * then set it here.
> */
>
But this still does not fix the race which would be possible if the HBA is
using MSI-X but triggering IRQs early enough.
Have a look at this (admittedly theoretical) path:
qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
{
[...]
        /* Enable MSI-X vectors for the base queue */
        for (i = 0; i < 2; i++) {
                qentry = &ha->msix_entries[i];
                if (IS_P3P_TYPE(ha))
                        ret = request_irq(qentry->vector,
                                qla82xx_msix_entries[i].handler,
                                0, qla82xx_msix_entries[i].name, rsp);
                else
                        ret = request_irq(qentry->vector,
                                msix_entries[i].handler,
                                0, msix_entries[i].name, rsp);
                if (ret)
                        goto msix_register_fail;
                                        <--- IRQ arrives here
                qentry->have_irq = 1;
                qentry->rsp = rsp;
                rsp->msix = qentry;
[...]

void qla24xx_process_response_queue(struct scsi_qla_host *vha,
        struct rsp_que *rsp)
{
[...]
        if (rsp->msix->cpuid != smp_processor_id()) {
                 ^
                 \--- rsp->msix == NULL
                /* if kernel does not notify qla of IRQ's CPU change,
                 * then set it here.
                 */
                rsp->msix->cpuid = smp_processor_id();
                ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-23 7:22 ` Johannes Thumshirn
@ 2016-06-23 16:13 ` Quinn Tran
2016-06-23 16:35 ` Linus Torvalds
0 siblings, 1 reply; 20+ messages in thread
From: Quinn Tran @ 2016-06-23 16:13 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: Martin K. Petersen, Linus Torvalds, Josh Boyer,
Thorsten Leemhuis, linux-kernel, linux-scsi
-----Original Message-----
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Thursday, June 23, 2016 at 12:22 AM
To: Quinn Tran <quinn.tran@qlogic.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>, Linus Torvalds <torvalds@linux-foundation.org>, Josh Boyer <jwboyer@fedoraproject.org>, Thorsten Leemhuis <regressions@leemhuis.info>, linux-kernel <linux-kernel@vger.kernel.org>, linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
>[+ Cc linux-scsi@vger.kernel.org ]
>
>On Wed, Jun 22, 2016 at 03:57:35PM +0000, Quinn Tran wrote:
>> Johannes, Martin,
>>
>> Based on the screenshot/call trace, it looks like this adapter is not using MSI-X. It fell back to MSI or INTx interrupts. The code assumed that MSI-X is available. There is no point in going through that code segment.
>>
>> Can you try this workaround? It’s untested. Thanks.
>>
>>
>> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
>> index 5649c20..e033ecb 100644
>> --- a/drivers/scsi/qla2xxx/qla_isr.c
>> +++ b/drivers/scsi/qla2xxx/qla_isr.c
>> @@ -2548,7 +2548,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
>> if (!vha->flags.online)
>> return;
>>
>> - if (rsp->msix->cpuid != smp_processor_id()) {
>> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
>> /* if kernel does not notify qla of IRQ's CPU change,
>> * then set it here.
>> */
>>
>
>But this still does not fix the race which would be possible if the HBA is
>using MSI-X but triggering IRQs early enough.
>
>Have a look at this (I admit theoretical) path:
>qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
>{
> [...]
> /* Enable MSI-X vectors for the base queue */
> for (i = 0; i < 2; i++) {
> qentry = &ha->msix_entries[i];
> if (IS_P3P_TYPE(ha))
> ret = request_irq(qentry->vector,
> qla82xx_msix_entries[i].handler,
> 0, qla82xx_msix_entries[i].name, rsp);
> else
> ret = request_irq(qentry->vector,
> msix_entries[i].handler,
> 0, msix_entries[i].name, rsp);
> if (ret)
> goto msix_register_fail;
> <--- IRQ arrives here
QT: Setting up the interrupt vector does not mean the interrupt starts firing immediately. The interrupt starts firing only once the driver is ready to accept it and enables it (ha->isp_ops->enable_intrs(ha)) later on. In addition, that particular code path (qla24xx_process_response_queue) is not executed until the driver feeds commands to the hardware work queue.
If there is a leftover interrupt that happens to trigger the call immediately, there is another check that prevents the code from reaching the "theoretical" race.
> qentry->have_irq = 1;
> qentry->rsp = rsp;
> rsp->msix = qentry;
>
> [...]
>
>
>void qla24xx_process_response_queue(struct scsi_qla_host *vha,
> struct rsp_que *rsp)
>{
--->8------
if (!vha->flags.online)
return;
---8<------
> if (rsp->msix->cpuid != smp_processor_id()) {
> ^
> \--- rsp->msix == NULL
>
> /* if kernel does not notify qla of IRQ's CPU change,
> * then set it here.
> */
> rsp->msix->cpuid = smp_processor_id();
> ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
>
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-23 16:13 ` Quinn Tran
@ 2016-06-23 16:35 ` Linus Torvalds
2016-06-23 20:56 ` Eric W. Biederman
0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2016-06-23 16:35 UTC (permalink / raw)
To: Quinn Tran
Cc: Johannes Thumshirn, Martin K. Petersen, Josh Boyer,
Thorsten Leemhuis, linux-kernel, linux-scsi
On Thu, Jun 23, 2016 at 9:13 AM, Quinn Tran <quinn.tran@qlogic.com> wrote:
>
>
> QT: setting up the interrupt vector does not mean the interrupt starts firing immediately.
Actually, it very much can mean that. If the interrupt can possibly be
shared, there is a very real possibility of it firing immediately.
Now, with MSI(-X) I guess that isn't a worry, so I suspect your patch
that handles just the legacy INTx case anyway is sufficient, but in
general I would like people to always act as if interrupts can happen
immediately after request_irq().
We have had *tons* of situations where the firmware left a device
active, for example. Or where some random interrupt controller ended
up having stale interrupts pending, even.
So in general, it's just good practice to say "spurious interrupts can
and do happen" - the shared irq case is the most obvious case, but
there have been other sources of unexpected spurious interrupts
firing.
Linus
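One common way to follow the advice above is to write the handler so that an early or spurious call is harmless. The sketch below is a single-threaded user-space model under stated assumptions: struct dev_model, its ready flag and model_irq_handler() are hypothetical names, and the two result values merely mirror the kernel's IRQ_NONE and IRQ_HANDLED. Real driver code would additionally need appropriate locking or memory ordering, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the meaning of the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum irq_result { IRQ_RESULT_NONE = 0, IRQ_RESULT_HANDLED = 1 };

/* Hypothetical device state; not taken from any real driver. */
struct dev_model {
    bool ready;        /* set only after all setup has completed */
    int pending_work;  /* stands in for work the device has signalled */
    int handled;       /* total work items processed so far */
};

/* Handler that tolerates being called at any time after registration. */
static enum irq_result model_irq_handler(struct dev_model *dev)
{
    assert(dev != NULL);

    /* Early or spurious interrupt: setup is not finished, touch nothing. */
    if (!dev->ready)
        return IRQ_RESULT_NONE;

    /* Shared line or stale interrupt: this device has nothing to do. */
    if (dev->pending_work == 0)
        return IRQ_RESULT_NONE;

    dev->handled += dev->pending_work;
    dev->pending_work = 0;
    return IRQ_RESULT_HANDLED;
}
```

The point of the pattern is that the handler never dereferences state that setup has not published yet, regardless of when the first interrupt shows up.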
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-23 16:35 ` Linus Torvalds
@ 2016-06-23 20:56 ` Eric W. Biederman
0 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2016-06-23 20:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Quinn Tran, Johannes Thumshirn, Martin K. Petersen, Josh Boyer,
Thorsten Leemhuis, linux-kernel, linux-scsi
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, Jun 23, 2016 at 9:13 AM, Quinn Tran <quinn.tran@qlogic.com> wrote:
>>
>>
>> QT: setting up the interrupt vector does not mean the interrupt starts firing immediately.
>
> Actually, it very much can mean that. If the interrupt can possibly be
> shared, there is a very real possibility of it firing immediately.
>
> Now, with MSI(-X) I guess that isn't a worry, so I suspect your patch
> that handles just the legacy INTx case anyway is sufficient, but in
> general I would like people to always act as if interrupts can happen
> immediately after request_irq().
>
> We have had *tons* of situations where the firmware left a device
> active, for example. Or where some random interrupt controller ended
> up having stale interrupts pending, even.
>
> So in general, it's just good practice to say "spurious interrupts can
> and do happen" - the shared irq case is the most obvious case, but
> there have been other sources of unexpected spurious interrupts
> firing.
One case that occasionally bites even for MSI-X is kexec on panic, where
the hardware was not shut down before the new kernel starts, and the
start of the kernel masks the irq. Then, when the driver initializes and
calls request_irq, it is possible for an irq to be pending as soon as
the MSI-X irq is actually enabled in the hardware.
And there is always CONFIG_IRQ_DEBUG, which always acts as if an
interrupt happens right after request_irq finishes.
Eric
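The kexec scenario above boils down to an interrupt that is latched while masked and delivered the moment it is unmasked. The following is a deliberately simplified user-space model of that behaviour, assuming a single pending bit per line; struct irq_line_model, device_raises_irq() and driver_enables_irq() are hypothetical names, and real interrupt controllers and MSI-X masking are more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line with a mask and a pending bit. */
struct irq_line_model {
    bool masked;    /* true while the kernel has the line masked */
    bool pending;   /* latched request that could not be delivered yet */
    int delivered;  /* number of times the handler would have run */
};

/* Stands in for invoking the driver's handler. */
static void deliver(struct irq_line_model *line)
{
    line->delivered++;
}

/* Device left running by the previous kernel signals an interrupt. */
static void device_raises_irq(struct irq_line_model *line)
{
    assert(line != NULL);
    if (line->masked)
        line->pending = true;   /* latched, not lost */
    else
        deliver(line);
}

/* Driver finishes request_irq() and the line gets unmasked. */
static void driver_enables_irq(struct irq_line_model *line)
{
    assert(line != NULL);
    line->masked = false;
    if (line->pending) {
        line->pending = false;
        deliver(line);          /* fires immediately on enable */
    }
}
```

In this model the handler runs as part of driver_enables_irq() itself, which is why the driver's per-queue state has to be valid before that point.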
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-22 15:57 ` Quinn Tran
2016-06-23 7:22 ` Johannes Thumshirn
@ 2016-07-05 16:30 ` Josh Boyer
2016-07-05 17:32 ` Linus Torvalds
1 sibling, 1 reply; 20+ messages in thread
From: Josh Boyer @ 2016-07-05 16:30 UTC (permalink / raw)
To: Quinn Tran
Cc: Johannes Thumshirn, Martin K. Petersen, Linus Torvalds,
Thorsten Leemhuis, linux-kernel
On Wed, Jun 22, 2016 at 11:57 AM, Quinn Tran <quinn.tran@qlogic.com> wrote:
> Johannes, Martin,
>
> Based on the screenshot/call trace, it looks like this adapter is not using MSI-X. It fell back to MSI or INTx interrupts. The code assumed that MSI-X is available. There is no point in going through that code segment.
>
> Can you try this workaround? It’s untested. Thanks.
>
>
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> index 5649c20..e033ecb 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -2548,7 +2548,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
> if (!vha->flags.online)
> return;
>
> - if (rsp->msix->cpuid != smp_processor_id()) {
> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
> /* if kernel does not notify qla of IRQ's CPU change,
> * then set it here.
> */
Did this wind up going into an official commit somewhere?
josh
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-07-05 16:30 ` Josh Boyer
@ 2016-07-05 17:32 ` Linus Torvalds
2016-07-05 18:43 ` Thorsten Leemhuis
2016-07-05 19:40 ` Martin K. Petersen
0 siblings, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2016-07-05 17:32 UTC (permalink / raw)
To: Josh Boyer
Cc: Quinn Tran, Johannes Thumshirn, Martin K. Petersen,
Thorsten Leemhuis, linux-kernel, Linux SCSI List
On Tue, Jul 5, 2016 at 9:30 AM, Josh Boyer <jwboyer@fedoraproject.org> wrote:
> On Wed, Jun 22, 2016 at 11:57 AM, Quinn Tran <quinn.tran@qlogic.com> wrote:
>>
>> - if (rsp->msix->cpuid != smp_processor_id()) {
>> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
>
> Did this wind up going into an official commit somewhere?
It's not in my tree, at least.
And I don't think I've seen a "yes, that fixes it". Although Johannes
was right that in addition to that, the ordering of the irq setup
should probably _also_ be fixed, but that's a separate patch.
Linus
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-07-05 17:32 ` Linus Torvalds
@ 2016-07-05 18:43 ` Thorsten Leemhuis
2016-07-05 19:40 ` Martin K. Petersen
1 sibling, 0 replies; 20+ messages in thread
From: Thorsten Leemhuis @ 2016-07-05 18:43 UTC (permalink / raw)
To: Linus Torvalds, Josh Boyer
Cc: Quinn Tran, Johannes Thumshirn, Martin K. Petersen, linux-kernel,
Linux SCSI List
On 05.07.2016 19:32, Linus Torvalds wrote:
> On Tue, Jul 5, 2016 at 9:30 AM, Josh Boyer <jwboyer@fedoraproject.org> wrote:
>> On Wed, Jun 22, 2016 at 11:57 AM, Quinn Tran <quinn.tran@qlogic.com> wrote:
>>>
>>> - if (rsp->msix->cpuid != smp_processor_id()) {
>>> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
>>
>> Did this wind up going into an official commit somewhere?
> It's not in my tree, at least.
> And I don't think I've seen a "yes, that fixes it".
Quinn Tran ACKed a nearly identical patch from Bruno Prémont in a
different thread:
http://thread.gmane.org/gmane.linux.kernel/2257008/focus=2257139
From what I can see in the initial mail in that thread, it seems Bruno
successfully tested the patch he submitted. But I have no idea if the
patch is in someone's queue to mainline right now. That's why I had it on
my "if nothing happens soon, poke someone" list...
HTH, CU, Thorsten
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-07-05 17:32 ` Linus Torvalds
2016-07-05 18:43 ` Thorsten Leemhuis
@ 2016-07-05 19:40 ` Martin K. Petersen
1 sibling, 0 replies; 20+ messages in thread
From: Martin K. Petersen @ 2016-07-05 19:40 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Boyer, Quinn Tran, Johannes Thumshirn, Martin K. Petersen,
Thorsten Leemhuis, linux-kernel, Linux SCSI List
>>>>> "Linus" == Linus Torvalds <torvalds@linux-foundation.org> writes:
Linus> It's not in my tree, at least.
Not in scsi-fixes either. I have been waiting for a "real" patch
submission with one or more Tested-by: tags. I generally don't queue
something that comes with a "try this untested workaround" patch
description.
Quinn, please submit a real patch.
Linus> And I don't think I've seen a "yes, that fixes it". Although
Linus> Johannes was right that in addition to that, the ordering of the
Linus> irq setup should probably _also_ be fixed, but that's a separate
Linus> patch.
--
Martin K. Petersen Oracle Linux Engineering
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
2016-06-26 12:52 ` Thorsten Leemhuis
@ 2016-06-26 15:36 ` Lucas Stach
0 siblings, 0 replies; 20+ messages in thread
From: Lucas Stach @ 2016-06-26 15:36 UTC (permalink / raw)
To: Thorsten Leemhuis, George Spelvin
Cc: airlied, Linux Kernel Mailing List, dri-devel, nouveau
On Sunday, 26.06.2016 at 14:52 +0200, Thorsten Leemhuis wrote:
> On 24.06.2016 16:19, George Spelvin wrote:
> >
> > Here's a regression you might add.
> Thx, added.
>
Probably the same bug as
https://bugzilla.kernel.org/show_bug.cgi?id=119861 and already fixed in
the last -rc.
Regards,
Lucas
> >
> > I only reported it to dri-devel,
> > since it's DRI-specific, but since there's been thunderous silence
> > for a few weeks, I'm trying to be a squeakier wheel.
> Added the nouveau developers to CC, maybe it's a bug in the drm driver
> that triggers this problem; and airlied is "Internet challenged" right
> now and Daniel on holidays, so it might be good to get more people into
> the loop anyway.
>
> >
> > Given that I bisected it to a single, small, revertable commit, I'd
> > hoped it would be easy to deal with.
> >
> > [BISECTED: 0955c1250e] 4.7-rc1 oops at drm_connector_cleanup+0x5c/0x1d0
> >
> > E-mail report at
> > https://marc.info/?l=dri-devel&m=146577898611849
> >
> > Bugzilla report at
> > https://bugs.freedesktop.org/show_bug.cgi?id=96532
> FWIW the important detail: Reverting
> https://git.kernel.org/linus/0955c1250e (drm/crtc: take references to
> connectors used in a modeset. (v2)) fixes this.
>
> Sincerely, your regression tracker for Linux 4.7 (http://bit.ly/28JRmJo)
> Thorsten
* Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
[not found] <20160624141918.4646.qmail@ns.sciencehorizons.net>
@ 2016-06-26 12:52 ` Thorsten Leemhuis
2016-06-26 15:36 ` Lucas Stach
0 siblings, 1 reply; 20+ messages in thread
From: Thorsten Leemhuis @ 2016-06-26 12:52 UTC (permalink / raw)
To: George Spelvin; +Cc: airlied, dri-devel, Linux Kernel Mailing List, nouveau
On 24.06.2016 16:19, George Spelvin wrote:
> Here's a regression you might add.
Thx, added.
> I only reported it to dri-devel,
> since it's DRI-specific, but since there's been thunderous silence
> for a few weeks, I'm trying to be a squeakier wheel.
Added the nouveau developers to CC, maybe it's a bug in the drm driver
that triggers this problem; and airlied is "Internet challenged" right
now and Daniel on holidays, so it might be good to get more people into
the loop anyway.
> Given that I bisected it to a single, small, revertable commit, I'd
> hoped it would be easy to deal with.
>
> [BISECTED: 0955c1250e] 4.7-rc1 oops at drm_connector_cleanup+0x5c/0x1d0
>
> E-mail report at
> https://marc.info/?l=dri-devel&m=146577898611849
>
> Bugzilla report at
> https://bugs.freedesktop.org/show_bug.cgi?id=96532
FWIW the important detail: Reverting
https://git.kernel.org/linus/0955c1250e (drm/crtc: take references to
connectors used in a modeset. (v2)) fixes this.
Sincerely, your regression tracker for Linux 4.7 (http://bit.ly/28JRmJo)
Thorsten
Thread overview: 20+ messages
2016-06-19 14:52 Reported regressions for 4.7 as of Sunday, 2016-06-19 Thorsten Leemhuis
2016-06-20 10:21 ` Christoph Hellwig
2016-06-21 11:11 ` Josh Boyer
2016-06-21 20:40 ` Linus Torvalds
2016-06-22 0:55 ` Josh Boyer
2016-06-22 1:25 ` Martin K. Petersen
2016-06-22 1:29 ` Quinn Tran
2016-06-22 11:51 ` Johannes Thumshirn
2016-06-22 15:57 ` Quinn Tran
2016-06-23 7:22 ` Johannes Thumshirn
2016-06-23 16:13 ` Quinn Tran
2016-06-23 16:35 ` Linus Torvalds
2016-06-23 20:56 ` Eric W. Biederman
2016-07-05 16:30 ` Josh Boyer
2016-07-05 17:32 ` Linus Torvalds
2016-07-05 18:43 ` Thorsten Leemhuis
2016-07-05 19:40 ` Martin K. Petersen
2016-06-22 6:36 ` Kalle Valo
[not found] <20160624141918.4646.qmail@ns.sciencehorizons.net>
2016-06-26 12:52 ` Thorsten Leemhuis
2016-06-26 15:36 ` Lucas Stach