linux-scsi.vger.kernel.org archive mirror
* [Bug 199887] Fibre login failure on older adapters
From: bugzilla-daemon @ 2021-12-29  3:43 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=199887

Michael Graham (jmetal88@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmetal88@gmail.com

--- Comment #3 from Michael Graham (jmetal88@gmail.com) ---
I think I ran into this today while setting up an ISP2100-based controller on
the server in my basement.  There were no error messages as far as I could
tell, but on the 5.x kernel I was using the adapter simply would not report any
information about the attached drives.  An old openSUSE release with kernel
4.4.104 worked with no special configuration on my part (I tried a current
openSUSE release first, and it was broken there too).  I'm fine using an old
version of Linux for this server for the time being, but it would be nice if
there were a fix for newer kernels.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 199887] Fibre login failure on older adapters
From: bugzilla-daemon @ 2022-08-28 19:54 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=199887

Pavel Kankovsky (peak@argo.troja.mff.cuni.cz) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peak@argo.troja.mff.cuni.cz

--- Comment #4 from Pavel Kankovsky (peak@argo.troja.mff.cuni.cz) ---
Created attachment 301697
  --> https://bugzilla.kernel.org/attachment.cgi?id=301697&action=edit
kinda fix

I did some experiments with my old QLA2340 (ISP2312, fw 3.03.28) and the most
recent stable kernel, i.e. 5.19.4.

"Async-gnlist" failures seem to be survivable and I decided to ignore them for
the time being. In fact, the old driver in 4.9.325 was able to work without
MBC_PORT_NODE_NAME_LIST. There was a function issuing that command, namely
qla2x00_get_node_name_list(), but AFAICT it was never called.

"Async-gpdb" failures are a real problem because they trigger session deletion
(qla24xx_handle_gpdb_event() gets an invalid zero login state).

As far as I can tell, the new asynchronous implementation provides correct
parameters to MBC_GET_PORT_DATABASE (compare qla24xx_async_gpdb() with
qla2x00_get_port_database(), HAS_EXTENDED_IDS is true for ISP2312) but
1. the adapter cannot handle the request when it receives it via the IOCB
interface, and
2. the driver would not be able to handle the returned data anyway because its
format is completely different on old non-IS_FWI2_CAPABLE adapters (compare
qla24xx_handle_gpdb_event() with the final part of
qla2x00_get_port_database()).

I tried replacing the new code with a small wrapper around a call to the old
qla2x00_get_port_database() sending the request synchronously via the mbox
interface... and it worked! The driver was able to finish logins and access
available FC targets. See the attached patch.
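
For illustration only, here is a minimal sketch of one way such a fallback
could be confined to old adapters. It is not the attached patch. Every name in
it (mock_hba, choose_gpdb_path, the enum values) is hypothetical, and the
fwi2_capable flag merely models what IS_FWI2_CAPABLE reports in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real qla2xxx definitions. */
enum gpdb_path {
    GPDB_PATH_ASYNC_IOCB,   /* new asynchronous request via the IOCB interface */
    GPDB_PATH_SYNC_MBOX     /* old synchronous request via the mbox interface */
};

struct mock_hba {
    bool fwi2_capable;      /* assumption: models IS_FWI2_CAPABLE(ha) */
};

/*
 * Pick the port-database path for an adapter. Old adapters, which cannot
 * take the request via IOCB and return data in a different format, get
 * the synchronous mailbox path.
 */
enum gpdb_path choose_gpdb_path(const struct mock_hba *ha)
{
    assert(ha != NULL);
    return ha->fwi2_capable ? GPDB_PATH_ASYNC_IOCB : GPDB_PATH_SYNC_MBOX;
}
```

With this sketch, an adapter such as the ISP2312 (modelled as fwi2_capable =
false) would be routed to GPDB_PATH_SYNC_MBOX, while newer adapters keep the
asynchronous path.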

That said, it is a horrible hack done by someone almost totally ignorant of the
inner workings of the driver. There is absolutely no guarantee. It might crash
your kernel. It might fail to handle some (newly connected?) remote ports. It
might brick your adapter. It might wipe all your disk arrays. It might summon
the Elder Gods. You have been warned.


* [Bug 199887] Fibre login failure on older adapters
From: bugzilla-daemon @ 2022-09-17 20:50 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=199887

--- Comment #5 from Pavel Kankovsky (peak@argo.troja.mff.cuni.cz) ---
Some additional findings:

1. It turns out qla2x00_get_node_name_list() was introduced in 3.5 and was
called from qla_target.c until 3.11, when the call was removed. The function
then remained unused until its own removal in 4.11.

I have not tested whether it would work on an old HBA, but that is far from
certain: its result was an array of "struct qla_port_24xx_data", corresponding
to "struct get_name_list" in recent versions. Even if it did work, it would not
help much. There seem to be two variants of MBC_PORT_NODE_NAME_LIST: the old
function invoked the variant providing less data, while the current code needs
the variant providing more data ("struct get_name_list_extended").

2. The driver is sometimes unable to relogin when an old HBA reconnects to the
fabric because "Async-login" keeps failing with 4007, i.e. MBS_PORT_ID_USED. It
turns out qla24xx_handle_plogi_done_event expects the offending loopid in
ea->iop[1] but qla2x00_mbx_iocb_entry stores the value in ea->data[1].

(A similar problem occurs during the handling of 4008, i.e. MBS_LOOP_ID_USED,
when qla24xx_handle_plogi_done_event expects the offending portid in
ea->iop[1] but it is not stored anywhere. But the driver seems to be able to
recover in this case.)
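
The field mismatch in item 2 can be modelled with a small standalone sketch.
The struct and function names below are hypothetical, and the field widths are
an assumption; only the fact that one side writes data[1] while the other
reads iop[1] comes from the findings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an event argument with two separate result arrays. */
struct mock_event_arg {
    uint16_t data[2];
    uint32_t iop[2];
};

/* Producer side: stores the conflicting loop ID in data[1], as described. */
void mock_store_conflict(struct mock_event_arg *ea, uint16_t loop_id)
{
    assert(ea != NULL);
    memset(ea, 0, sizeof(*ea));
    ea->data[1] = loop_id;
}

/* Consumer side as described: looks in iop[1], which was never written. */
uint16_t mock_conflict_from_iop(const struct mock_event_arg *ea)
{
    assert(ea != NULL);
    return (uint16_t)ea->iop[1];
}

/* Consumer reading the field the producer actually filled. */
uint16_t mock_conflict_from_data(const struct mock_event_arg *ea)
{
    assert(ea != NULL);
    return ea->data[1];
}
```

In this model the iop[1] reader always sees zero regardless of the stored loop
ID, which would be consistent with the login retry never picking up the
conflicting ID. Whether the right fix is on the producer or the consumer side
is left open here.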

3. Newer HBAs seem to use the same command (MBC_LOGIN_FABRIC_PORT) for both
fabric and private loop port login but old HBAs need a different command
(MBC_LOGIN_LOOP_PORT) in the latter case. See qla2x00_local_device_login.
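
A rough sketch of the command selection described in item 3, under stated
assumptions: the types and names are hypothetical, the enum values are
placeholders rather than real mailbox opcodes, and how the driver decides that
an HBA is "old" or that a port sits on a private loop is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder command identifiers, not the real mailbox opcode values. */
enum mock_login_cmd {
    MOCK_LOGIN_FABRIC_PORT, /* stands in for MBC_LOGIN_FABRIC_PORT */
    MOCK_LOGIN_LOOP_PORT    /* stands in for MBC_LOGIN_LOOP_PORT */
};

struct mock_login_ctx {
    bool old_hba;           /* assumption: adapter needs the old behaviour */
    bool private_loop_port; /* assumption: remote port is on a private loop */
};

/*
 * Old HBAs need the loop login command for private loop ports; every
 * other combination uses the fabric login command.
 */
enum mock_login_cmd choose_login_cmd(const struct mock_login_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->old_hba && ctx->private_loop_port)
        return MOCK_LOGIN_LOOP_PORT;
    return MOCK_LOGIN_FABRIC_PORT;
}
```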

