From: Hannes Reinecke <hare@suse.de>
To: Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>,
	erwin@erwinvanlonden.net, martin.petersen@oracle.com,
	martin.wilck@suse.com
Cc: dgilbert@interlog.com, jejb@linux.vnet.ibm.com,
	"systemd-devel@lists.freedesktop.org" 
	<systemd-devel@lists.freedesktop.org>,
	hch@lst.de, dm-devel@redhat.com, hare@suse.com,
	linux-scsi@vger.kernel.org
Subject: Re: Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
Date: Tue, 4 May 2021 09:32:25 +0200
Message-ID: <b7d593cc-2adb-1350-be16-39c7c309fdaf@suse.de>
In-Reply-To: <6087ECF1020000A100040C7F@gwsmtp.uni-regensburg.de>

On 4/27/21 12:52 PM, Ulrich Windl wrote:
>>>> Hannes Reinecke <hare@suse.de> wrote on 27.04.2021 at 10:21 in message
> <2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
>> On 4/27/21 10:10 AM, Martin Wilck wrote:
>>> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>>>
>>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>>> afaics.
>>>>>
>>>> In my view the WWID should never change.
>>>
>>> In an ideal world, perhaps not. But in the dm-multipath realm, we know
>>> that WWID changes can happen with certain storage arrays. See
>>> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html
>>> and follow-ups, for example.
>>>
>> And it's actually something which might happen quite easily.
>> The storage array can unmap a LUN, delete it, create a new one, and map
>> that one to the same LUN number as the old one.
>> If we didn't do I/O during that interval, upon the next I/O we will be
>> getting the dreaded 'Power-On/Reset' sense code.
>> _And nothing else_, due to the arcane rules for sense code generation in
>> SAM.
>> But we end up with a completely different device.
>>
>> The only way out of it is to do a rescan for every POR sense code, and
>> disable the device e.g. via DID_NO_CONNECT whenever we find that the
>> identification has changed. We already have a copy of the original VPD
>> page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage systems
> typically signal such events, maybe either via some unit attention or some FC
> event. Older kernels logged that there was a change, but a manual SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for some
> products. The HP EVA 6000 series worked that way, a 3PAR StoreServ 8000 series
> also seems to work that way, but not Pure Storage X70 R3. For the latter you
> need something like a FC LIP to make the kernel detect the new devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...
> 
My point was that while there _is_ a unit attention with the sense code 
'INQUIRY DATA CHANGED' (and that indeed will generate a kernel message), 
it might be obscured by a subsequent unit attention with the sense code 
'Power-On/Reset', as per the SCSI spec the latter might cause the previous 
ones to _not_ be sent.
So from that reasoning we will need to rescan the device upon 
'Power-On/Reset'.
But 'Power-On/Reset' is a sense code which we also get during initial 
device scan, so the problem is that we will be triggering a rescan while 
_doing_ a rescan, and as such it would need some really careful testing.
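
A minimal sketch of that check, written as a user-space Python model and 
not as kernel code. The `Device` class, the `scanning` flag and the 
`read_vpd83` callback are hypothetical names introduced only for 
illustration; the sense key and ASC/ASCQ values are the standard ones for 
UNIT ATTENTION, 'Power-On/Reset' and 'INQUIRY DATA CHANGED'.

```python
from dataclasses import dataclass

UNIT_ATTENTION = 0x6              # SCSI sense key
ASC_POWER_ON_RESET = 0x29         # power on, reset, or bus device reset occurred
ASC_CHANGED = 0x3F                # with ASCQ 0x03: inquiry data has changed
ASCQ_INQUIRY_DATA_CHANGED = 0x03


@dataclass
class Device:                     # hypothetical stand-in for a SCSI device
    cached_vpd83: bytes           # copy of VPD page 0x83 taken at scan time
    scanning: bool = False        # True while the initial scan is still running
    online: bool = True


def needs_identity_check(sense_key, asc, ascq):
    """True if the sense data may hide a change of device identity."""
    if sense_key != UNIT_ATTENTION:
        return False
    if asc == ASC_POWER_ON_RESET:
        return True               # may have displaced earlier unit attentions
    return asc == ASC_CHANGED and ascq == ASCQ_INQUIRY_DATA_CHANGED


def handle_sense(dev, sense_key, asc, ascq, read_vpd83):
    """Re-read page 0x83 and take the device offline if it differs.

    read_vpd83 is a caller-supplied callable returning the current page.
    Returns "ignored", "unchanged" or "offlined".
    """
    if not needs_identity_check(sense_key, asc, ascq):
        return "ignored"
    if dev.scanning:
        return "ignored"          # POR during the initial scan is expected
    if read_vpd83() == dev.cached_vpd83:
        return "unchanged"
    dev.online = False            # models failing further I/O to the device
    return "offlined"
```

The `scanning` guard is one possible way to avoid triggering a rescan from 
within a rescan; whether a single flag is sufficient in the real scan path 
is exactly the part that would need the careful testing.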

As for the PureStorage behaviour: The problem with changing the LUN 
mapping on the array is that we might not _have_ a device to send 
unit attentions to.
If the array already exports LUNs to some other hosts, it doesn't need 
to re-initialize the FC port when starting to export LUNs to _this_ 
host. And as _this_ host doesn't have a LUN on which unit attentions can 
be sent, _and_ the FC port is already registered, there are no events 
whatsoever which would cause the host to initiate a rescan.
To resolve that, the array would need to induce e.g. an RSCN, but that 
will only be triggered if an FC port is (re-)registered.
Which is what HPE arrays do: they initiate a link bounce on the attached 
ports, which causes the attached hosts to initiate a rescan.
Of course, _all_ hosts will then need to rescan (causing an interruption 
even on unrelated hosts), which is why this is not done by all vendors.
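
When no event arrives at all, the host has to be told by hand. The 
following is a small sketch of the two manual triggers mentioned in this 
thread (a SCSI host scan and an FC LIP), using the sysfs attributes 
`scan` under `/sys/class/scsi_host/hostN/` and `issue_lip` under 
`/sys/class/fc_host/hostN/`. The function names and the `sysfs_root` 
parameter are assumptions made for illustration, and writing to the real 
files requires root.

```python
from pathlib import Path


def scsi_host_rescan(host, sysfs_root="/sys"):
    """Ask the SCSI midlayer to scan all channels, targets and LUNs of a host."""
    path = Path(sysfs_root) / "class" / "scsi_host" / f"host{host}" / "scan"
    path.write_text("- - -\n")    # wildcards for channel, target id and LUN
    return path


def fc_issue_lip(host, sysfs_root="/sys"):
    """Request a LIP (link re-initialisation) on an FC host."""
    path = Path(sysfs_root) / "class" / "fc_host" / f"host{host}" / "issue_lip"
    path.write_text("1\n")
    return path
```

Note that a LIP disturbs the link for everything behind that port, so the 
plain scan is the less intrusive thing to try first.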

>>
>> I had a rather lengthy discussion with Fred Knight @ NetApp about
>> Power-On/Reset handling, what with him complaining that we don't handle
>> it correctly. So this really is something we should be looking into,
>> even independently of multipathing.
>>
>> But actually I like the idea from Martin Petersen to expose the parsed
>> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
>> from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11) there was a
> kernel change regarding trailing blanks in VPD data. That change blew up
> several configurations, which could not re-recognize the devices. In one case
> the software even had bound a license to a specific device with serial number,
> and that software found "new" devices while missing the "old" ones...
> 
That's probably just for VPD page 0x80. Page 0x83 has pretty strict 
rules on how the entries are formatted, so chopping off trailing blanks 
is not easily done there.
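
To illustrate the difference, here is a rough sketch of how the two pages 
can be parsed. Page 0x80 is a free-form ASCII serial number, so padding 
blanks are a matter of interpretation, whereas every designator in page 
0x83 carries its own code set, type and length. The function names are 
made up for this example, the page length is read as a 16-bit big-endian 
value, and truncated pages are not handled beyond stopping early.

```python
def parse_vpd80(page):
    """Return the unit serial number from VPD page 0x80 as raw bytes."""
    if len(page) < 4 or page[1] != 0x80:
        raise ValueError("not a VPD page 0x80")
    length = int.from_bytes(page[2:4], "big")
    return page[4:4 + length]     # padding blanks are still included here


def parse_vpd83(page):
    """Return (code_set, association, designator_type, designator) tuples."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a VPD page 0x83")
    end = 4 + int.from_bytes(page[2:4], "big")
    out, off = [], 4
    while off + 4 <= min(end, len(page)):
        code_set = page[off] & 0x0F               # 1 = binary, 2 = ASCII
        association = (page[off + 1] >> 4) & 0x03
        desig_type = page[off + 1] & 0x0F         # 3 = NAA
        length = page[off + 3]                    # explicit per-designator length
        out.append((code_set, association, desig_type,
                    page[off + 4:off + 4 + length]))
        off += 4 + length
    return out
```

With a serial number such as `b"ABC123  "`, stripping or keeping the two 
blanks yields different identifiers from page 0x80, while a binary NAA 
designator that happens to end in byte 0x20 is returned unchanged by 
`parse_vpd83` because its length comes from the descriptor header.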

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
