linux-scsi.vger.kernel.org archive mirror
* [Bug 204815] New: qla2xxx: firmware is not responding to mailbox commands
@ 2019-09-11 15:54 bugzilla-daemon
  2019-09-11 15:55 ` [Bug 204815] " bugzilla-daemon
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: bugzilla-daemon @ 2019-09-11 15:54 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=204815

            Bug ID: 204815
           Summary: qla2xxx: firmware is not responding to mailbox
                    commands
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 5.2-rc1 up to 5.3-rc8
          Hardware: PPC-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: QLOGIC QLA2XXX
          Assignee: scsi_drivers-qla2xxx@kernel-bugs.osdl.org
          Reporter: r.bolshakov@yadro.com
        Regression: No

Created attachment 284925
  --> https://bugzilla.kernel.org/attachment.cgi?id=284925&action=edit
firmware times out on 5.3-rc8

I'm using QLogic HBAs (QLE2560 and QLE2742) inside pseries guests on a
ppc64le/POWER8 hypervisor, and they have been unusable since commit
f8f97b0c5b7f7 ("scsi: qla2xxx: Cleanups for NVRAM/Flash read/write path").

The firmware stops responding to mailbox commands shortly after the system
finishes booting. That also triggers an EEH on the pseries machine, and the
driver doesn't handle the EEH properly because the firmware is effectively
unavailable. I disabled EEH inside the guest because it caused a deadlock in
the host kernel.

The issue is fixed in linux-next by commit edbd56472a63 ("scsi: qla2xxx:
qla2x00_alloc_fw_dump: set ha->eft"). I think it should be included in 5.3 if
possible. It cherry-picks cleanly onto master.
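
For anyone who wants to try the fix locally, here is a minimal sketch of the
backport step. It assumes a kernel checkout in which edbd56472a63 is already
reachable (for example after fetching linux-next) and that `git` is on PATH.
The helper names are hypothetical and only wrap two standard git commands.

```python
import subprocess

FIX = "edbd56472a63"  # abbreviated hash as quoted in this report


def is_ancestor_cmd(commit, branch):
    # Exits 0 if `commit` is already contained in `branch`, 1 if it is not.
    return ["git", "merge-base", "--is-ancestor", commit, branch]


def cherry_pick_cmd(commit):
    # -x records the original commit id in the new commit message.
    return ["git", "cherry-pick", "-x", commit]


def backport(commit, branch="HEAD", run=subprocess.run):
    """Cherry-pick `commit` unless `branch` already contains it.

    Returns True if a cherry-pick was performed, False if it was not needed.
    """
    rc = run(is_ancestor_cmd(commit, branch)).returncode
    if rc == 0:
        return False
    if rc != 1:
        # Any other exit code means git could not answer (e.g. unknown commit).
        raise RuntimeError(f"git merge-base failed with exit code {rc}")
    run(cherry_pick_cmd(commit), check=True)
    return True
```

Calling `backport(FIX)` from inside the checkout applies the fix to the
current branch. The `run` parameter exists only so the helper can be exercised
without a real repository.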

The logs of 5.3-rc8 (bad.log) and 5.3-rc8 with edbd56472a63 (good.log) are
attached.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 204815] qla2xxx: firmware is not responding to mailbox commands
From: bugzilla-daemon @ 2019-09-11 15:55 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=204815

--- Comment #1 from Roman Bolshakov (r.bolshakov@yadro.com) ---
Created attachment 284927
  --> https://bugzilla.kernel.org/attachment.cgi?id=284927&action=edit
firmware behaves properly


* [Bug 204815] qla2xxx: firmware is not responding to mailbox commands
From: bugzilla-daemon @ 2019-09-11 15:57 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=204815

Roman Bolshakov (r.bolshakov@yadro.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Regression|No                          |Yes


* [Bug 204815] qla2xxx: firmware is not responding to mailbox commands
From: bugzilla-daemon @ 2019-09-13 15:04 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=204815

Martin Wilck (mwilck@suse.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mwilck@suse.com

--- Comment #2 from Martin Wilck (mwilck@suse.com) ---
Nice to hear that edbd56472a63 fixes your problem, but it was meant to fix
a28d9e4ef997 ("scsi: qla2xxx: Add support for multiple fwdump
templates/segments"), which was applied (directly) after f8f97b0c5b7f7.

Maybe your problem has been caused by a28d9e4ef997?


* [Bug 204815] qla2xxx: firmware is not responding to mailbox commands
From: bugzilla-daemon @ 2019-09-13 16:03 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=204815

--- Comment #3 from Roman Bolshakov (r.bolshakov@yadro.com) ---
Hi Martin,

I can't tell for sure, because f8f97b0c5b7f7 introduces a regression fixed in
1710ac17547ac8b ("scsi: qla2xxx: Fix read offset in
qla24xx_load_risc_flash()"). 

Here's a possible timeline:
1. f8f97b0c5b7f7 ("scsi: qla2xxx: Cleanups for NVRAM/Flash read/write path")
introduces a regression that prevents successful ISP firmware checksum
validation, and the kernel panics shortly after.
2. a28d9e4ef997 ("scsi: qla2xxx: Add support for multiple fwdump
templates/segments") introduces a regression that causes an EEH and system
lockup on POWER8, or makes the firmware unavailable (this bug).
3. 1710ac17547ac8 ("scsi: qla2xxx: Fix read offset in
qla24xx_load_risc_flash()") fixes f8f97b0c5b7f7, but the fix depends on both #1
and #2.
4. edbd56472a63 ("scsi: qla2xxx: qla2x00_alloc_fw_dump: set ha->eft") fixes
a28d9e4ef997.

It's not possible to bisect between #1 and #3 because of the panic introduced
in #1, and the firmware works reliably only after #4.

I think it's important to include your fix in 5.3; otherwise qla2xxx will be
broken in the release.

Thanks,
Roman

