All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC: one more time: SCSI device identification
@ 2021-03-29  9:58 ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-03-29  9:58 UTC (permalink / raw)
  To: hch, jejb, bmarzins, Hannes Reinecke, martin.petersen
  Cc: linux-scsi, dgilbert, systemd-devel, dm-devel

Hello,

[sorry for cross-posting, I think this is relevant to multiple
communities.]

I'm referring to the recent discussion about SCSI device identification
for multipath-tools 
(https://listman.redhat.com/archives/dm-devel/2021-March/msg00332.html)

As you all know, there are different designators to identify SCSI LUNs,
and the specs don't mandate priorities for devices that support
multiple designator types. There are various implementations for device
identification, which use different priorities (summarized below).

It's highly desirable to clean up this confusion and settle on a single
instance and a unique priority order. I believe this instance should be
the kernel.

OTOH, changing device WWIDs is highly dangerous for productive systems.
The WWID is prominently used in multipath-tools, but also in lots of
other important places such as fstab, grub.cfg, dracut, etc. No doubt
that we'll be stuck with the different algorithms for years, especially
for LTS distributions. But perhaps we can figure out a long-term exit
strategy?

The kernel's preference for type 8 designators (see below) is in
contrast with the established user space algorithms, which determine
SCSI WWIDs on productive systems in practice. User space can try to
adapt to the kernel logic, but it will necessarily be a slow and
painful path if we want to avoid breaking user setups.

In principle, I believe the kernel is "right" to prefer type 8. But
because the "wwid" attribute isn't actually used for device
identification today, changing the kernel logic would be less prone to
regressions than changing user space, even if it violates the principle
that the kernel's user space API must remain stable.

Would it be an option to modify the kernel logic?

If we can't, I think we should start with making the "wwid" attribute
part of the udev rule logic, and letting distros configure whether the
kernel logic or the traditional udev logic would be used.

Please tell me your thoughts on this matter.

Regards,
Martin

PS: Incomplete list of algorithms for SCSI designator priorities:

The kernel ("wwid" sysfs attribute) prefers "SCSI name string" (type 8)
designators over other types
(https://elixir.bootlin.com/linux/latest/A/ident/designator_prio).

The current set of udev rules in sg3_utils
(https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)
don't use the kernel's wwid attribute; they parse VPD 83 and 80
instead and prioritize types 36, 35, 32, and 2 over type 8.

udev's "scsi_id" tool, historically the first attempt to implement a
priority for this, doesn't look at the SCSI name attribute at all:
https://github.com/systemd/systemd/blob/main/src/udev/scsi_id/scsi_serial.c

There's a "fallback" logic in multipath-tools in case udev doesn't
provide a WWID:
https://github.com/opensvc/multipath-tools/blob/a41a61e8482def33e3ca8c9e3639ad2c37611551/libmultipath/discovery.c#L1040

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dm-devel] RFC: one more time: SCSI device identification
@ 2021-03-29  9:58 ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-03-29  9:58 UTC (permalink / raw)
  To: hch, jejb, bmarzins, Hannes Reinecke, martin.petersen
  Cc: dm-devel, systemd-devel, linux-scsi, dgilbert

Hello,

[sorry for cross-posting, I think this is relevant to multiple
communities.]

I'm referring to the recent discussion about SCSI device identification
for multipath-tools 
(https://listman.redhat.com/archives/dm-devel/2021-March/msg00332.html)

As you all know, there are different designators to identify SCSI LUNs,
and the specs don't mandate priorities for devices that support
multiple designator types. There are various implementations for device
identification, which use different priorities (summarized below).

It's highly desirable to clean up this confusion and settle on a single
instance and a unique priority order. I believe this instance should be
the kernel.

OTOH, changing device WWIDs is highly dangerous for productive systems.
The WWID is prominently used in multipath-tools, but also in lots of
other important places such as fstab, grub.cfg, dracut, etc. No doubt
that we'll be stuck with the different algorithms for years, especially
for LTS distributions. But perhaps we can figure out a long-term exit
strategy?

The kernel's preference for type 8 designators (see below) is in
contrast with the established user space algorithms, which determine
SCSI WWIDs on productive systems in practice. User space can try to
adapt to the kernel logic, but it will necessarily be a slow and
painful path if we want to avoid breaking user setups.

In principle, I believe the kernel is "right" to prefer type 8. But
because the "wwid" attribute isn't actually used for device
identification today, changing the kernel logic would be less prone to
regressions than changing user space, even if it violates the principle
that the kernel's user space API must remain stable.

Would it be an option to modify the kernel logic?

If we can't, I think we should start with making the "wwid" attribute
part of the udev rule logic, and letting distros configure whether the
kernel logic or the traditional udev logic would be used.

Please tell me your thoughts on this matter.

Regards,
Martin

PS: Incomplete list of algorithms for SCSI designator priorities:

The kernel ("wwid" sysfs attribute) prefers "SCSI name string" (type 8)
designators over other types
(https://elixir.bootlin.com/linux/latest/A/ident/designator_prio).

The current set of udev rules in sg3_utils
(https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)
don't use the kernel's wwid attribute; they parse VPD 83 and 80
instead and prioritize types 36, 35, 32, and 2 over type 8.

udev's "scsi_id" tool, historically the first attempt to implement a
priority for this, doesn't look at the SCSI name attribute at all:
https://github.com/systemd/systemd/blob/main/src/udev/scsi_id/scsi_serial.c

There's a "fallback" logic in multipath-tools in case udev doesn't
provide a WWID:
https://github.com/opensvc/multipath-tools/blob/a41a61e8482def33e3ca8c9e3639ad2c37611551/libmultipath/discovery.c#L1040

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-03-29  9:58 ` [dm-devel] " Martin Wilck
@ 2021-04-06  4:47   ` Martin K. Petersen
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-06  4:47 UTC (permalink / raw)
  To: Martin Wilck
  Cc: hch, jejb, bmarzins, Hannes Reinecke, martin.petersen,
	linux-scsi, dgilbert, systemd-devel, dm-devel


Martin,

> The kernel's preference for type 8 designators (see below) is in
> contrast with the established user space algorithms, which determine
> SCSI WWIDs on productive systems in practice. User space can try to
> adapt to the kernel logic, but it will necessarily be a slow and
> painful path if we want to avoid breaking user setups.

I was concerned when you changed the kernel prioritization a while back
and I still don't think that we should tweak that code any further.

If the kernel picks one ID over another, that should be for the kernel's
use. Letting the kernel decide which ID is best for userland is not a
good approach.

So while I originally liked the idea of exposing a transport and
protocol agnostic wwid for each block device, I think that all the
descriptors and ID formats available in both SCSI and NVMe have shown
that that approach is fraught with peril.

Descriptors that provide "good uniqueness" on one device may be a
completely sub-optimal choice for another (zero-padded values, full of
spaces, vendors getting things wrong in general).

So I think my inclination would be to leave the current wwid as-is to
avoid the risk of breaking things. And then export all ID descriptors
reported in sysfs. Even though vpd83 is already exported in its
entirety, I don't have any particular concerns about the individual
values being exported separately. That makes many userland things so
much easier. And I think the kernel is in a good position to disseminate
information reported by the hardware.

This puts the prioritization entirely in the distro/udev/scripting
domain. Taking the kernel out of the picture will make migration
easier. And it allows a user to pick their descriptor of choice should a
device report something completely unusable in type 8.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-06  4:47   ` Martin K. Petersen
  0 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-06  4:47 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Hannes Reinecke, jejb, martin.petersen, linux-scsi, dm-devel,
	dgilbert, systemd-devel, hch


Martin,

> The kernel's preference for type 8 designators (see below) is in
> contrast with the established user space algorithms, which determine
> SCSI WWIDs on productive systems in practice. User space can try to
> adapt to the kernel logic, but it will necessarily be a slow and
> painful path if we want to avoid breaking user setups.

I was concerned when you changed the kernel prioritization a while back
and I still don't think that we should tweak that code any further.

If the kernel picks one ID over another, that should be for the kernel's
use. Letting the kernel decide which ID is best for userland is not a
good approach.

So while I originally liked the idea of exposing a transport and
protocol agnostic wwid for each block device, I think that all the
descriptors and ID formats available in both SCSI and NVMe have shown
that that approach is fraught with peril.

Descriptors that provide "good uniqueness" on one device may be a
completely sub-optimal choice for another (zero-padded values, full of
spaces, vendors getting things wrong in general).

So I think my inclination would be to leave the current wwid as-is to
avoid the risk of breaking things. And then export all ID descriptors
reported in sysfs. Even though vpd83 is already exported in its
entirety, I don't have any particular concerns about the individual
values being exported separately. That makes many userland things so
much easier. And I think the kernel is in a good position to disseminate
information reported by the hardware.

This puts the prioritization entirely in the distro/udev/scripting
domain. Taking the kernel out of the picture will make migration
easier. And it allows a user to pick their descriptor of choice should a
device report something completely unusable in type 8.

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-06  4:47   ` [dm-devel] " Martin K. Petersen
@ 2021-04-16 23:28     ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-16 23:28 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

Hello Martin,

Sorry for the late response, still recovering from a week out of
office.

On Tue, 2021-04-06 at 00:47 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > The kernel's preference for type 8 designators (see below) is in
> > contrast with the established user space algorithms, which
> > determine
> > SCSI WWIDs on productive systems in practice. User space can try to
> > adapt to the kernel logic, but it will necessarily be a slow and
> > painful path if we want to avoid breaking user setups.
> 
> I was concerned when you changed the kernel prioritization a while
> back
> and I still don't think that we should tweak that code any further.

Ok.

> If the kernel picks one ID over another, that should be for the
> kernel's
> use. Letting the kernel decide which ID is best for userland is not a
> good approach.

Well, the kernel itself doesn't make any use of this property currently
(and user space doesn't much either, afaik).


> So I think my inclination would be to leave the current wwid as-is to
> avoid the risk of breaking things. And then export all ID descriptors
> reported in sysfs. Even though vpd83 is already exported in its
> entirety, I don't have any particular concerns about the individual
> values being exported separately. That makes many userland things so
> much easier. And I think the kernel is in a good position to
> disseminate
> information reported by the hardware.
> 
> This puts the prioritization entirely in the distro/udev/scripting
> domain. Taking the kernel out of the picture will make migration
> easier. And it allows a user to pick their descriptor of choice
> should a
> device report something completely unusable in type 8.

Hm, it sounds intriguing, but it has issues in its own right. For years
to come, user space will have to probe whether these attribute exist,
and fall back to the current ones ("wwid", "vpd_pg83") otherwise. So
user space can't be simplified any time soon. Speaking for an important
user space consumer of WWIDs (multipathd), I doubt that this would
improve matters for us. We'd be happy if the kernel could just pick the
"best" designator for us. But I understand that the kernel can't
guarantee a good choice (user space can't either).

What is your idea how these new sysfs attributes should be named? Just
enumerate, or name them by type somehow?

Thanks,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-16 23:28     ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-16 23:28 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

Hello Martin,

Sorry for the late response, still recovering from a week out of
office.

On Tue, 2021-04-06 at 00:47 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > The kernel's preference for type 8 designators (see below) is in
> > contrast with the established user space algorithms, which
> > determine
> > SCSI WWIDs on productive systems in practice. User space can try to
> > adapt to the kernel logic, but it will necessarily be a slow and
> > painful path if we want to avoid breaking user setups.
> 
> I was concerned when you changed the kernel prioritization a while
> back
> and I still don't think that we should tweak that code any further.

Ok.

> If the kernel picks one ID over another, that should be for the
> kernel's
> use. Letting the kernel decide which ID is best for userland is not a
> good approach.

Well, the kernel itself doesn't make any use of this property currently
(and user space doesn't much either, afaik).


> So I think my inclination would be to leave the current wwid as-is to
> avoid the risk of breaking things. And then export all ID descriptors
> reported in sysfs. Even though vpd83 is already exported in its
> entirety, I don't have any particular concerns about the individual
> values being exported separately. That makes many userland things so
> much easier. And I think the kernel is in a good position to
> disseminate
> information reported by the hardware.
> 
> This puts the prioritization entirely in the distro/udev/scripting
> domain. Taking the kernel out of the picture will make migration
> easier. And it allows a user to pick their descriptor of choice
> should a
> device report something completely unusable in type 8.

Hm, it sounds intriguing, but it has issues in its own right. For years
to come, user space will have to probe whether these attribute exist,
and fall back to the current ones ("wwid", "vpd_pg83") otherwise. So
user space can't be simplified any time soon. Speaking for an important
user space consumer of WWIDs (multipathd), I doubt that this would
improve matters for us. We'd be happy if the kernel could just pick the
"best" designator for us. But I understand that the kernel can't
guarantee a good choice (user space can't either).

What is your idea how these new sysfs attributes should be named? Just
enumerate, or name them by type somehow?

Thanks,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-16 23:28     ` [dm-devel] " Martin Wilck
@ 2021-04-22  2:46       ` Martin K. Petersen
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-22  2:46 UTC (permalink / raw)
  To: Martin Wilck
  Cc: martin.petersen, Hannes Reinecke, hch, dgilbert, dm-devel,
	linux-scsi, jejb, systemd-devel, bmarzins


Martin,

> Hm, it sounds intriguing, but it has issues in its own right. For
> years to come, user space will have to probe whether these attribute
> exist, and fall back to the current ones ("wwid", "vpd_pg83")
> otherwise. So user space can't be simplified any time soon. Speaking
> for an important user space consumer of WWIDs (multipathd), I doubt
> that this would improve matters for us. We'd be happy if the kernel
> could just pick the "best" designator for us. But I understand that
> the kernel can't guarantee a good choice (user space can't either).

But user space can be adapted at runtime to pick one designator over the
other (ha!).

We could do that in the kernel too, of course, but I'm afraid what the
resulting BLIST changes would end up looking like over time.

I am also very concerned about changing what the kernel currently
exports in a given variable like "wwid". A seemingly innocuous change to
the reported value could lead to a system no longer booting after
updating the kernel.

(Ignoring for a moment that some arrays will helpfully add a new ID
designator after a firmware upgrade and thus change what the kernel
reports. *sigh*)

> What is your idea how these new sysfs attributes should be named? Just
> enumerate, or name them by type somehow?

Up to you. Whatever you think would be easiest for userland to deal
with. I don't have a good feeling for how common vendor specific ones
are in practice. Things would obviously be easier if SCSI didn't have so
many choices :(

But taking a step back: Other than "it's not what userland currently
does", what specifically is the problem with designator_prio()? We've
picked the priority list once and for all. If we promise never to change
it, what is the issue?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-22  2:46       ` Martin K. Petersen
  0 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-22  2:46 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Reinecke, jejb, linux-scsi, martin.petersen, Hannes, dm-devel,
	dgilbert, systemd-devel, hch


Martin,

> Hm, it sounds intriguing, but it has issues in its own right. For
> years to come, user space will have to probe whether these attribute
> exist, and fall back to the current ones ("wwid", "vpd_pg83")
> otherwise. So user space can't be simplified any time soon. Speaking
> for an important user space consumer of WWIDs (multipathd), I doubt
> that this would improve matters for us. We'd be happy if the kernel
> could just pick the "best" designator for us. But I understand that
> the kernel can't guarantee a good choice (user space can't either).

But user space can be adapted at runtime to pick one designator over the
other (ha!).

We could do that in the kernel too, of course, but I'm afraid what the
resulting BLIST changes would end up looking like over time.

I am also very concerned about changing what the kernel currently
exports in a given variable like "wwid". A seemingly innocuous change to
the reported value could lead to a system no longer booting after
updating the kernel.

(Ignoring for a moment that some arrays will helpfully add a new ID
designator after a firmware upgrade and thus change what the kernel
reports. *sigh*)

> What is your idea how these new sysfs attributes should be named? Just
> enumerate, or name them by type somehow?

Up to you. Whatever you think would be easiest for userland to deal
with. I don't have a good feeling for how common vendor specific ones
are in practice. Things would obviously be easier if SCSI didn't have so
many choices :(

But taking a step back: Other than "it's not what userland currently
does", what specifically is the problem with designator_prio()? We've
picked the priority list once and for all. If we promise never to change
it, what is the issue?

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-22  2:46       ` [dm-devel] " Martin K. Petersen
@ 2021-04-22  9:07         ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-22  9:07 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > Hm, it sounds intriguing, but it has issues in its own right. For
> > years to come, user space will have to probe whether these attribute
> > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > otherwise. So user space can't be simplified any time soon. Speaking
> > for an important user space consumer of WWIDs (multipathd), I doubt
> > that this would improve matters for us. We'd be happy if the kernel
> > could just pick the "best" designator for us. But I understand that
> > the kernel can't guarantee a good choice (user space can't either).
> 
> But user space can be adapted at runtime to pick one designator over
> the
> other (ha!).

And that's exactly the problem. Effectively, all user space relies on
udev today, because that's where this "adaptation" is taking place. It
happens

 1) either in systemd's scsi_id built-in 
   (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
 2) or in the udev rules coming with sg3_utils 
   (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)

1) is just as opaque and un-"adaptable" as the kernel, and the logic is
suboptimal. 2) is of course "adaptable", but that's a problem in
practice, if udev fails to provide a WWID. multipath-tools go through
various twists for this case to figure out "fallback" WWIDs, guessing
whether that "fallback" matches what udev would have returned if it had
worked.

That's the gist of it - the general frustration about udev among some
of its heaviest users (talk to the LVM2 maintainers).

I suppose 99.9% of users never bother with customizing the udev rules.
IOW, these users might as well just use a kernel-provided value. But
the remaining 0.1% causes headaches for user-space applications, which
can't make solid assumptions about the rules. Thus, in a way, the
flexibility of the rules does more harm than it helps.

> We could do that in the kernel too, of course, but I'm afraid what
> the
> resulting BLIST changes would end up looking like over time.

That's something we want to avoid, sure.

But we can actually combine both approaches. If "wwid" yields a good
value most of the time (which is true IMO), we could make user space
rely on it by default, and make it possible to set an udev property
(e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
differently. User-space apps like multipath could check the ID_LEGACY
property to determine whether or not reading the "wwid" attribute would
be consistent with udev. That would simplify matters a lot for us (Ben,
do you agree?), without the need of adding endless BLIST entries.


> I am also very concerned about changing what the kernel currently
> exports in a given variable like "wwid". A seemingly innocuous change
> to
> the reported value could lead to a system no longer booting after
> updating the kernel.

AFAICT, no major distribution uses "wwid" for this purpose (yet). I
just recently realized that the kernel's ALUA code refers to it. (*)

In a recent discussion with Hannes, the idea came up that the priority
of "SCSI name string" designators should actually depend on their
subtype. "naa." name strings should map to the respective NAA
descriptors, and "eui.", likewise (only "iqn." descriptors have no
binary counterpart; we thought they should rather be put below NAA,
prio-wise).

I wonder if you'd agree with a change made that way for "wwid". I
suppose you don't. I'd then propose to add a new attribute following
this logic. It could simply be an additional attribute with a different
name. Or this new attribute could be a property of the block device
rather than the SCSI device, like NVMe does it
(/sys/block/nvme0n2/wwid).

I don't like the idea of having separate sysfs attributes for
designators of different types, that's impractical for user space.

> But taking a step back: Other than "it's not what userland currently
> does", what specifically is the problem with designator_prio()? We've
> picked the priority list once and for all. If we promise never to
> change
> it, what is the issue?

If the prioritization in kernel and user space was the same, we could
migrate away from udev more easily without risking boot failure.

Thanks,
Martin

(*) which is an argument for using "wwid" in user space too - just to
be consitent with the kernel's internal logic.

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-22  9:07         ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-22  9:07 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > Hm, it sounds intriguing, but it has issues in its own right. For
> > years to come, user space will have to probe whether these attribute
> > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > otherwise. So user space can't be simplified any time soon. Speaking
> > for an important user space consumer of WWIDs (multipathd), I doubt
> > that this would improve matters for us. We'd be happy if the kernel
> > could just pick the "best" designator for us. But I understand that
> > the kernel can't guarantee a good choice (user space can't either).
> 
> But user space can be adapted at runtime to pick one designator over
> the
> other (ha!).

And that's exactly the problem. Effectively, all user space relies on
udev today, because that's where this "adaptation" is taking place. It
happens

 1) either in systemd's scsi_id built-in 
   (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
 2) or in the udev rules coming with sg3_utils 
   (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)

1) is just as opaque and un-"adaptable" as the kernel, and the logic is
suboptimal. 2) is of course "adaptable", but that's a problem in
practice, if udev fails to provide a WWID. multipath-tools go through
various twists for this case to figure out "fallback" WWIDs, guessing
whether that "fallback" matches what udev would have returned if it had
worked.

That's the gist of it - the general frustration about udev among some
of its heaviest users (talk to the LVM2 maintainers).

I suppose 99.9% of users never bother with customizing the udev rules.
IOW, these users might as well just use a kernel-provided value. But
the remaining 0.1% causes headaches for user-space applications, which
can't make solid assumptions about the rules. Thus, in a way, the
flexibility of the rules does more harm than it helps.

> We could do that in the kernel too, of course, but I'm afraid what
> the
> resulting BLIST changes would end up looking like over time.

That's something we want to avoid, sure.

But we can actually combine both approaches. If "wwid" yields a good
value most of the time (which is true IMO), we could make user space
rely on it by default, and make it possible to set an udev property
(e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
differently. User-space apps like multipath could check the ID_LEGACY
property to determine whether or not reading the "wwid" attribute would
be consistent with udev. That would simplify matters a lot for us (Ben,
do you agree?), without the need of adding endless BLIST entries.


> I am also very concerned about changing what the kernel currently
> exports in a given variable like "wwid". A seemingly innocuous change
> to
> the reported value could lead to a system no longer booting after
> updating the kernel.

AFAICT, no major distribution uses "wwid" for this purpose (yet). I
just recently realized that the kernel's ALUA code refers to it. (*)

In a recent discussion with Hannes, the idea came up that the priority
of "SCSI name string" designators should actually depend on their
subtype. "naa." name strings should map to the respective NAA
descriptors, and "eui.", likewise (only "iqn." descriptors have no
binary counterpart; we thought they should rather be put below NAA,
prio-wise).

I wonder if you'd agree with a change made that way for "wwid". I
suppose you don't. I'd then propose to add a new attribute following
this logic. It could simply be an additional attribute with a different
name. Or this new attribute could be a property of the block device
rather than the SCSI device, like NVMe does it
(/sys/block/nvme0n2/wwid).

I don't like the idea of having separate sysfs attributes for
designators of different types, that's impractical for user space.

> But taking a step back: Other than "it's not what userland currently
> does", what specifically is the problem with designator_prio()? We've
> picked the priority list once and for all. If we promise never to
> change
> it, what is the issue?

If the prioritization in kernel and user space was the same, we could
migrate away from udev more easily without risking boot failure.

Thanks,
Martin

(*) which is an argument for using "wwid" in user space too - just to
be consitent with the kernel's internal logic.

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-22  9:07         ` [dm-devel] " Martin Wilck
@ 2021-04-22 16:14           ` Benjamin Marzinski
  -1 siblings, 0 replies; 50+ messages in thread
From: Benjamin Marzinski @ 2021-04-22 16:14 UTC (permalink / raw)
  To: Martin Wilck
  Cc: martin.petersen, Hannes Reinecke, hch, dgilbert, dm-devel,
	linux-scsi, jejb, systemd-devel

On Thu, Apr 22, 2021 at 09:07:15AM +0000, Martin Wilck wrote:
> On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> > 
> > Martin,
> > 
> > > Hm, it sounds intriguing, but it has issues in its own right. For
> > > years to come, user space will have to probe whether these attribute
> > > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > > otherwise. So user space can't be simplified any time soon. Speaking
> > > for an important user space consumer of WWIDs (multipathd), I doubt
> > > that this would improve matters for us. We'd be happy if the kernel
> > > could just pick the "best" designator for us. But I understand that
> > > the kernel can't guarantee a good choice (user space can't either).
> > 
> > But user space can be adapted at runtime to pick one designator over
> > the
> > other (ha!).
> 
> And that's exactly the problem. Effectively, all user space relies on
> udev today, because that's where this "adaptation" is taking place. It
> happens
> 
>  1) either in systemd's scsi_id built-in 
>    (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
>  2) or in the udev rules coming with sg3_utils 
>    (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)
> 
> 1) is just as opaque and un-"adaptable" as the kernel, and the logic is
> suboptimal. 2) is of course "adaptable", but that's a problem in
> practice, if udev fails to provide a WWID. multipath-tools go through
> various twists for this case to figure out "fallback" WWIDs, guessing
> whether that "fallback" matches what udev would have returned if it had
> worked.
> 
> That's the gist of it - the general frustration about udev among some
> of its heaviest users (talk to the LVM2 maintainers).
> 
> I suppose 99.9% of users never bother with customizing the udev rules.
> IOW, these users might as well just use a kernel-provided value. But
> the remaining 0.1% causes headaches for user-space applications, which
> can't make solid assumptions about the rules. Thus, in a way, the
> flexibility of the rules does more harm than it helps.
> 
> > We could do that in the kernel too, of course, but I'm afraid what
> > the
> > resulting BLIST changes would end up looking like over time.
> 
> That's something we want to avoid, sure.
> 
> But we can actually combine both approaches. If "wwid" yields a good
> value most of the time (which is true IMO), we could make user space
> rely on it by default, and make it possible to set an udev property
> (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> differently. User-space apps like multipath could check the ID_LEGACY
> property to determine whether or not reading the "wwid" attribute would
> be consistent with udev. That would simplify matters a lot for us (Ben,
> do you agree?), without the need of adding endless BLIST entries.
> 

Yeah, as long as ID_LEGACY was changed in a careful manner, so WWIDs
didn't simply change without warning because of an upgrade, a path out
of this complexity is a definitely helpful.

-Ben

> 
> > I am also very concerned about changing what the kernel currently
> > exports in a given variable like "wwid". A seemingly innocuous change
> > to
> > the reported value could lead to a system no longer booting after
> > updating the kernel.
> 
> AFAICT, no major distribution uses "wwid" for this purpose (yet). I
> just recently realized that the kernel's ALUA code refers to it. (*)
> 
> In a recent discussion with Hannes, the idea came up that the priority
> of "SCSI name string" designators should actually depend on their
> subtype. "naa." name strings should map to the respective NAA
> descriptors, and "eui.", likewise (only "iqn." descriptors have no
> binary counterpart; we thought they should rather be put below NAA,
> prio-wise).
> 
> I wonder if you'd agree with a change made that way for "wwid". I
> suppose you don't. I'd then propose to add a new attribute following
> this logic. It could simply be an additional attribute with a different
> name. Or this new attribute could be a property of the block device
> rather than the SCSI device, like NVMe does it
> (/sys/block/nvme0n2/wwid).
> 
> I don't like the idea of having separate sysfs attributes for
> designators of different types, that's impractical for user space.
> 
> > But taking a step back: Other than "it's not what userland currently
> > does", what specifically is the problem with designator_prio()? We've
> > picked the priority list once and for all. If we promise never to
> > change
> > it, what is the issue?
> 
> If the prioritization in kernel and user space was the same, we could
> migrate away from udev more easily without risking boot failure.
> 
> Thanks,
> Martin
> 
> (*) which is an argument for using "wwid" in user space too - just to
> be consitent with the kernel's internal logic.
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-22 16:14           ` Benjamin Marzinski
  0 siblings, 0 replies; 50+ messages in thread
From: Benjamin Marzinski @ 2021-04-22 16:14 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Hannes Reinecke, jejb, martin.petersen, linux-scsi, dm-devel,
	dgilbert, systemd-devel, hch

On Thu, Apr 22, 2021 at 09:07:15AM +0000, Martin Wilck wrote:
> On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> > 
> > Martin,
> > 
> > > Hm, it sounds intriguing, but it has issues in its own right. For
> > > years to come, user space will have to probe whether these attribute
> > > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > > otherwise. So user space can't be simplified any time soon. Speaking
> > > for an important user space consumer of WWIDs (multipathd), I doubt
> > > that this would improve matters for us. We'd be happy if the kernel
> > > could just pick the "best" designator for us. But I understand that
> > > the kernel can't guarantee a good choice (user space can't either).
> > 
> > But user space can be adapted at runtime to pick one designator over
> > the
> > other (ha!).
> 
> And that's exactly the problem. Effectively, all user space relies on
> udev today, because that's where this "adaptation" is taking place. It
> happens
> 
>  1) either in systemd's scsi_id built-in 
>    (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
>  2) or in the udev rules coming with sg3_utils 
>    (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)
> 
> 1) is just as opaque and un-"adaptable" as the kernel, and the logic is
> suboptimal. 2) is of course "adaptable", but that's a problem in
> practice, if udev fails to provide a WWID. multipath-tools go through
> various twists for this case to figure out "fallback" WWIDs, guessing
> whether that "fallback" matches what udev would have returned if it had
> worked.
> 
> That's the gist of it - the general frustration about udev among some
> of its heaviest users (talk to the LVM2 maintainers).
> 
> I suppose 99.9% of users never bother with customizing the udev rules.
> IOW, these users might as well just use a kernel-provided value. But
> the remaining 0.1% causes headaches for user-space applications, which
> can't make solid assumptions about the rules. Thus, in a way, the
> flexibility of the rules does more harm than it helps.
> 
> > We could do that in the kernel too, of course, but I'm afraid what
> > the
> > resulting BLIST changes would end up looking like over time.
> 
> That's something we want to avoid, sure.
> 
> But we can actually combine both approaches. If "wwid" yields a good
> value most of the time (which is true IMO), we could make user space
> rely on it by default, and make it possible to set an udev property
> (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> differently. User-space apps like multipath could check the ID_LEGACY
> property to determine whether or not reading the "wwid" attribute would
> be consistent with udev. That would simplify matters a lot for us (Ben,
> do you agree?), without the need of adding endless BLIST entries.
> 

Yeah, as long as ID_LEGACY was changed in a careful manner, so WWIDs
didn't simply change without warning because of an upgrade, a path out
of this complexity is a definitely helpful.

-Ben

> 
> > I am also very concerned about changing what the kernel currently
> > exports in a given variable like "wwid". A seemingly innocuous change
> > to
> > the reported value could lead to a system no longer booting after
> > updating the kernel.
> 
> AFAICT, no major distribution uses "wwid" for this purpose (yet). I
> just recently realized that the kernel's ALUA code refers to it. (*)
> 
> In a recent discussion with Hannes, the idea came up that the priority
> of "SCSI name string" designators should actually depend on their
> subtype. "naa." name strings should map to the respective NAA
> descriptors, and "eui.", likewise (only "iqn." descriptors have no
> binary counterpart; we thought they should rather be put below NAA,
> prio-wise).
> 
> I wonder if you'd agree with a change made that way for "wwid". I
> suppose you don't. I'd then propose to add a new attribute following
> this logic. It could simply be an additional attribute with a different
> name. Or this new attribute could be a property of the block device
> rather than the SCSI device, like NVMe does it
> (/sys/block/nvme0n2/wwid).
> 
> I don't like the idea of having separate sysfs attributes for
> designators of different types, that's impractical for user space.
> 
> > But taking a step back: Other than "it's not what userland currently
> > does", what specifically is the problem with designator_prio()? We've
> > picked the priority list once and for all. If we promise never to
> > change
> > it, what is the issue?
> 
> If the prioritization in kernel and user space was the same, we could
> migrate away from udev more easily without risking boot failure.
> 
> Thanks,
> Martin
> 
> (*) which is an argument for using "wwid" in user space too - just to
> be consitent with the kernel's internal logic.
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-22  9:07         ` [dm-devel] " Martin Wilck
@ 2021-04-23  1:40           ` Martin K. Petersen
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-23  1:40 UTC (permalink / raw)
  To: Martin Wilck
  Cc: martin.petersen, Hannes Reinecke, hch, dgilbert, dm-devel,
	linux-scsi, jejb, systemd-devel, bmarzins


Martin,

> I suppose 99.9% of users never bother with customizing the udev rules.

Except for the other 99.9%, perhaps? :) We definitely have many users
that tweak udev storage rules for a variety of reasons. Including being
able to use RII for LUN naming purposes.

> But we can actually combine both approaches. If "wwid" yields a good
> value most of the time (which is true IMO), we could make user space
> rely on it by default, and make it possible to set an udev property
> (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> differently. User-space apps like multipath could check the ID_LEGACY
> property to determine whether or not reading the "wwid" attribute would
> be consistent with udev. That would simplify matters a lot for us (Ben,
> do you agree?), without the need of adding endless BLIST entries.

That's fine with me.

> AFAICT, no major distribution uses "wwid" for this purpose (yet).

We definitely have users that currently rely on wwid, although probably
not through standard distro scripts.

> In a recent discussion with Hannes, the idea came up that the priority
> of "SCSI name string" designators should actually depend on their
> subtype. "naa." name strings should map to the respective NAA
> descriptors, and "eui.", likewise (only "iqn." descriptors have no
> binary counterpart; we thought they should rather be put below NAA,
> prio-wise).

I like what NVMe did wrt. to exporting eui, nguid, uuid separately from
the best-effort wwid. That's why I suggested separate sysfs files for
the various page 0x83 descriptors. I like the idea of being able to
explicitly ask for an eui if that's what I need. But that appears to be
somewhat orthogonal to your request.

> I wonder if you'd agree with a change made that way for "wwid". I
> suppose you don't. I'd then propose to add a new attribute following
> this logic. It could simply be an additional attribute with a different
> name. Or this new attribute could be a property of the block device
> rather than the SCSI device, like NVMe does it
> (/sys/block/nvme0n2/wwid).

That's fine. I am not a big fan of the idea that block/foo/wwid and
block/foo/device/wwid could end up being different. But I do think that
from a userland tooling perspective the consistency with NVMe is more
important.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-23  1:40           ` Martin K. Petersen
  0 siblings, 0 replies; 50+ messages in thread
From: Martin K. Petersen @ 2021-04-23  1:40 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Reinecke, jejb, linux-scsi, martin.petersen, Hannes, dm-devel,
	dgilbert, systemd-devel, hch


Martin,

> I suppose 99.9% of users never bother with customizing the udev rules.

Except for the other 99.9%, perhaps? :) We definitely have many users
that tweak udev storage rules for a variety of reasons. Including being
able to use RII for LUN naming purposes.

> But we can actually combine both approaches. If "wwid" yields a good
> value most of the time (which is true IMO), we could make user space
> rely on it by default, and make it possible to set an udev property
> (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> differently. User-space apps like multipath could check the ID_LEGACY
> property to determine whether or not reading the "wwid" attribute would
> be consistent with udev. That would simplify matters a lot for us (Ben,
> do you agree?), without the need of adding endless BLIST entries.

That's fine with me.

> AFAICT, no major distribution uses "wwid" for this purpose (yet).

We definitely have users that currently rely on wwid, although probably
not through standard distro scripts.

> In a recent discussion with Hannes, the idea came up that the priority
> of "SCSI name string" designators should actually depend on their
> subtype. "naa." name strings should map to the respective NAA
> descriptors, and "eui.", likewise (only "iqn." descriptors have no
> binary counterpart; we thought they should rather be put below NAA,
> prio-wise).

I like what NVMe did wrt. to exporting eui, nguid, uuid separately from
the best-effort wwid. That's why I suggested separate sysfs files for
the various page 0x83 descriptors. I like the idea of being able to
explicitly ask for an eui if that's what I need. But that appears to be
somewhat orthogonal to your request.

> I wonder if you'd agree with a change made that way for "wwid". I
> suppose you don't. I'd then propose to add a new attribute following
> this logic. It could simply be an additional attribute with a different
> name. Or this new attribute could be a property of the block device
> rather than the SCSI device, like NVMe does it
> (/sys/block/nvme0n2/wwid).

That's fine. I am not a big fan of the idea that block/foo/wwid and
block/foo/device/wwid could end up being different. But I do think that
from a userland tooling perspective the consistency with NVMe is more
important.

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-23  1:40           ` [dm-devel] " Martin K. Petersen
@ 2021-04-23 10:28             ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-23 10:28 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Thu, 2021-04-22 at 21:40 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > I suppose 99.9% of users never bother with customizing the udev
> > rules.
> 
> Except for the other 99.9%, perhaps? :) We definitely have many users
> that tweak udev storage rules for a variety of reasons. Including
> being
> able to use RII for LUN naming purposes.
> 
> > But we can actually combine both approaches. If "wwid" yields a
> > good
> > value most of the time (which is true IMO), we could make user
> > space
> > rely on it by default, and make it possible to set an udev property
> > (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> > differently. User-space apps like multipath could check the
> > ID_LEGACY
> > property to determine whether or not reading the "wwid" attribute
> > would
> > be consistent with udev. That would simplify matters a lot for us
> > (Ben,
> > do you agree?), without the need of adding endless BLIST entries.
> 
> That's fine with me.
> 
> > AFAICT, no major distribution uses "wwid" for this purpose (yet).
> 
> We definitely have users that currently rely on wwid, although
> probably
> not through standard distro scripts.
> 
> > In a recent discussion with Hannes, the idea came up that the
> > priority
> > of "SCSI name string" designators should actually depend on their
> > subtype. "naa." name strings should map to the respective NAA
> > descriptors, and "eui.", likewise (only "iqn." descriptors have no
> > binary counterpart; we thought they should rather be put below NAA,
> > prio-wise).
> 
> I like what NVMe did wrt. to exporting eui, nguid, uuid separately
> from
> the best-effort wwid. That's why I suggested separate sysfs files for
> the various page 0x83 descriptors. I like the idea of being able to
> explicitly ask for an eui if that's what I need. But that appears to
> be
> somewhat orthogonal to your request.
> 
> > I wonder if you'd agree with a change made that way for "wwid". I
> > suppose you don't. I'd then propose to add a new attribute
> > following
> > this logic. It could simply be an additional attribute with a
> > different
> > name. Or this new attribute could be a property of the block device
> > rather than the SCSI device, like NVMe does it
> > (/sys/block/nvme0n2/wwid).
> 
> That's fine. I am not a big fan of the idea that block/foo/wwid and
> block/foo/device/wwid could end up being different. But I do think
> that
> from a userland tooling perspective the consistency with NVMe is more
> important.

OK, then here's the plan: Change SCSI (block) device identification to
work similar to NVMe (in addition to what we have now).

 1. add a new sysfs attribute for SCSI block devices as
/sys/block/sd$X/wwid, the value derived similar to the current "wwid"
SCSI device attribute, but using the same prio for SCSI name strings as
for their binary counterparts, as described above.

 2. add "naa" and "eui" attributes, too, for user-space applications
that are interested in these specific attributes. 
Fixme: should we differentiate between different "naa" or eui subtypes,
like "naa_regext", "eui64" or similar? If the device defines multiple
"naa" designators, which one should we choose?

 3. Change udev rules such that they primarily look at the attribute in
1.) on new installments, and introduce a variable ID_LEGACY to tell the
rules to fall back to the current algorithm. I suppose it makes sense
to have at least ID_VENDOR and ID_PRODUCT available when making this
decision, so that it doesn't have to be a global setting on a given
host.

While we're at it, I'd like to mention another issue: WWID changes.

This is a big problem for multipathd. The gist is that the device
identification attributes in sysfs only change after rescanning the
device. Thus if a user changes LUN assignments on a storage system, 
it can happen that a direct INQUIRY returns a different WWID as in
sysfs, which is fatal. If we plan to rely more on sysfs for device
identification in the future, the problem gets worse. 

I wonder if there's a chance that future kernels would automatically
update the attributes if a corresponding UNIT ATTENTION condition such
as INQUIRY DATA HAS CHANGED is received (*), or if we can find some
other way to avoid data corruption resulting from writing to the wrong
device.

Regards,
Martin

(*) I've been told that WWID changes can happen even without receiving
an UA. But in that case I'm inclined to put the blame on the storage.

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-23 10:28             ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-23 10:28 UTC (permalink / raw)
  To: martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Thu, 2021-04-22 at 21:40 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > I suppose 99.9% of users never bother with customizing the udev
> > rules.
> 
> Except for the other 99.9%, perhaps? :) We definitely have many users
> that tweak udev storage rules for a variety of reasons. Including
> being
> able to use RII for LUN naming purposes.
> 
> > But we can actually combine both approaches. If "wwid" yields a
> > good
> > value most of the time (which is true IMO), we could make user
> > space
> > rely on it by default, and make it possible to set an udev property
> > (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
> > differently. User-space apps like multipath could check the
> > ID_LEGACY
> > property to determine whether or not reading the "wwid" attribute
> > would
> > be consistent with udev. That would simplify matters a lot for us
> > (Ben,
> > do you agree?), without the need of adding endless BLIST entries.
> 
> That's fine with me.
> 
> > AFAICT, no major distribution uses "wwid" for this purpose (yet).
> 
> We definitely have users that currently rely on wwid, although
> probably
> not through standard distro scripts.
> 
> > In a recent discussion with Hannes, the idea came up that the
> > priority
> > of "SCSI name string" designators should actually depend on their
> > subtype. "naa." name strings should map to the respective NAA
> > descriptors, and "eui.", likewise (only "iqn." descriptors have no
> > binary counterpart; we thought they should rather be put below NAA,
> > prio-wise).
> 
> I like what NVMe did wrt. to exporting eui, nguid, uuid separately
> from
> the best-effort wwid. That's why I suggested separate sysfs files for
> the various page 0x83 descriptors. I like the idea of being able to
> explicitly ask for an eui if that's what I need. But that appears to
> be
> somewhat orthogonal to your request.
> 
> > I wonder if you'd agree with a change made that way for "wwid". I
> > suppose you don't. I'd then propose to add a new attribute
> > following
> > this logic. It could simply be an additional attribute with a
> > different
> > name. Or this new attribute could be a property of the block device
> > rather than the SCSI device, like NVMe does it
> > (/sys/block/nvme0n2/wwid).
> 
> That's fine. I am not a big fan of the idea that block/foo/wwid and
> block/foo/device/wwid could end up being different. But I do think
> that
> from a userland tooling perspective the consistency with NVMe is more
> important.

OK, then here's the plan: Change SCSI (block) device identification to
work similar to NVMe (in addition to what we have now).

 1. add a new sysfs attribute for SCSI block devices as
/sys/block/sd$X/wwid, the value derived similar to the current "wwid"
SCSI device attribute, but using the same prio for SCSI name strings as
for their binary counterparts, as described above.

 2. add "naa" and "eui" attributes, too, for user-space applications
that are interested in these specific attributes. 
Fixme: should we differentiate between different "naa" or eui subtypes,
like "naa_regext", "eui64" or similar? If the device defines multiple
"naa" designators, which one should we choose?

 3. Change udev rules such that they primarily look at the attribute in
1.) on new installments, and introduce a variable ID_LEGACY to tell the
rules to fall back to the current algorithm. I suppose it makes sense
to have at least ID_VENDOR and ID_PRODUCT available when making this
decision, so that it doesn't have to be a global setting on a given
host.

While we're at it, I'd like to mention another issue: WWID changes.

This is a big problem for multipathd. The gist is that the device
identification attributes in sysfs only change after rescanning the
device. Thus if a user changes LUN assignments on a storage system, 
it can happen that a direct INQUIRY returns a different WWID as in
sysfs, which is fatal. If we plan to rely more on sysfs for device
identification in the future, the problem gets worse. 

I wonder if there's a chance that future kernels would automatically
update the attributes if a corresponding UNIT ATTENTION condition such
as INQUIRY DATA HAS CHANGED is received (*), or if we can find some
other way to avoid data corruption resulting from writing to the wrong
device.

Regards,
Martin

(*) I've been told that WWID changes can happen even without receiving
an UA. But in that case I'm inclined to put the blame on the storage.

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Antw: [EXT] Re: [systemd-devel] RFC: one more time: SCSI device identification
  2021-04-23 10:28             ` [dm-devel] " Martin Wilck
@ 2021-04-26 11:14               ` Ulrich Windl
  -1 siblings, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-26 11:14 UTC (permalink / raw)
  To: martin.petersen, martin.wilck
  Cc: dgilbert, jejb, systemd-devel, hch, bmarzins, dm-devel, hare, linux-scsi

>>> Martin Wilck <martin.wilck@suse.com> schrieb am 23.04.2021 um 12:28 in
Nachricht <e3184501cbf23ab0ae94d664725e72b693c64ba9.camel@suse.com>:
> On Thu, 2021‑04‑22 at 21:40 ‑0400, Martin K. Petersen wrote:
>> 
>> Martin,
>> 
>> > I suppose 99.9% of users never bother with customizing the udev
>> > rules.
>> 
>> Except for the other 99.9%, perhaps? :) We definitely have many users
>> that tweak udev storage rules for a variety of reasons. Including
>> being
>> able to use RII for LUN naming purposes.
>> 
>> > But we can actually combine both approaches. If "wwid" yields a
>> > good
>> > value most of the time (which is true IMO), we could make user
>> > space
>> > rely on it by default, and make it possible to set an udev property
>> > (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
>> > differently. User‑space apps like multipath could check the
>> > ID_LEGACY
>> > property to determine whether or not reading the "wwid" attribute
>> > would
>> > be consistent with udev. That would simplify matters a lot for us
>> > (Ben,
>> > do you agree?), without the need of adding endless BLIST entries.
>> 
>> That's fine with me.
>> 
>> > AFAICT, no major distribution uses "wwid" for this purpose (yet).
>> 
>> We definitely have users that currently rely on wwid, although
>> probably
>> not through standard distro scripts.
>> 
>> > In a recent discussion with Hannes, the idea came up that the
>> > priority
>> > of "SCSI name string" designators should actually depend on their
>> > subtype. "naa." name strings should map to the respective NAA
>> > descriptors, and "eui.", likewise (only "iqn." descriptors have no
>> > binary counterpart; we thought they should rather be put below NAA,
>> > prio‑wise).
>> 
>> I like what NVMe did wrt. to exporting eui, nguid, uuid separately
>> from
>> the best‑effort wwid. That's why I suggested separate sysfs files for
>> the various page 0x83 descriptors. I like the idea of being able to
>> explicitly ask for an eui if that's what I need. But that appears to
>> be
>> somewhat orthogonal to your request.
>> 
>> > I wonder if you'd agree with a change made that way for "wwid". I
>> > suppose you don't. I'd then propose to add a new attribute
>> > following
>> > this logic. It could simply be an additional attribute with a
>> > different
>> > name. Or this new attribute could be a property of the block device
>> > rather than the SCSI device, like NVMe does it
>> > (/sys/block/nvme0n2/wwid).
>> 
>> That's fine. I am not a big fan of the idea that block/foo/wwid and
>> block/foo/device/wwid could end up being different. But I do think
>> that
>> from a userland tooling perspective the consistency with NVMe is more
>> important.
> 
> OK, then here's the plan: Change SCSI (block) device identification to
> work similar to NVMe (in addition to what we have now).
> 
>  1. add a new sysfs attribute for SCSI block devices as
> /sys/block/sd$X/wwid, the value derived similar to the current "wwid"
> SCSI device attribute, but using the same prio for SCSI name strings as
> for their binary counterparts, as described above.
> 
>  2. add "naa" and "eui" attributes, too, for user‑space applications
> that are interested in these specific attributes. 
> Fixme: should we differentiate between different "naa" or eui subtypes,
> like "naa_regext", "eui64" or similar? If the device defines multiple
> "naa" designators, which one should we choose?
> 
>  3. Change udev rules such that they primarily look at the attribute in
> 1.) on new installments, and introduce a variable ID_LEGACY to tell the
> rules to fall back to the current algorithm. I suppose it makes sense
> to have at least ID_VENDOR and ID_PRODUCT available when making this
> decision, so that it doesn't have to be a global setting on a given
> host.
> 
> While we're at it, I'd like to mention another issue: WWID changes.
> 
> This is a big problem for multipathd. The gist is that the device
> identification attributes in sysfs only change after rescanning the
> device. Thus if a user changes LUN assignments on a storage system, 
> it can happen that a direct INQUIRY returns a different WWID as in
> sysfs, which is fatal. If we plan to rely more on sysfs for device
> identification in the future, the problem gets worse. 

I think many devices rely on the fact that they are identified by
Vendor/model/serial_nr, because in most professional SAN storage systems you
can pre-set the serial number to a custom value; so if you want a new disk
(maybe a snapshot) to be compatible with the old one, just assign the same
serial number. I guess that's the idea behind.

> 
> I wonder if there's a chance that future kernels would automatically
> update the attributes if a corresponding UNIT ATTENTION condition such
> as INQUIRY DATA HAS CHANGED is received (*), or if we can find some
> other way to avoid data corruption resulting from writing to the wrong
> device.
> 
> Regards,
> Martin
> 
> (*) I've been told that WWID changes can happen even without receiving
> an UA. But in that case I'm inclined to put the blame on the storage.
> 
> ‑‑ 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
> 
> 
> _______________________________________________
> systemd‑devel mailing list
> systemd‑devel@lists.freedesktop.org 
> https://lists.freedesktop.org/mailman/listinfo/systemd‑devel 




^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dm-devel] Antw: [EXT] Re: [systemd-devel] RFC: one more time: SCSI device identification
@ 2021-04-26 11:14               ` Ulrich Windl
  0 siblings, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-26 11:14 UTC (permalink / raw)
  To: martin.petersen, martin.wilck
  Cc: hare, systemd-devel, linux-scsi, dm-devel, dgilbert, jejb, hch

>>> Martin Wilck <martin.wilck@suse.com> schrieb am 23.04.2021 um 12:28 in
Nachricht <e3184501cbf23ab0ae94d664725e72b693c64ba9.camel@suse.com>:
> On Thu, 2021‑04‑22 at 21:40 ‑0400, Martin K. Petersen wrote:
>> 
>> Martin,
>> 
>> > I suppose 99.9% of users never bother with customizing the udev
>> > rules.
>> 
>> Except for the other 99.9%, perhaps? :) We definitely have many users
>> that tweak udev storage rules for a variety of reasons. Including
>> being
>> able to use RII for LUN naming purposes.
>> 
>> > But we can actually combine both approaches. If "wwid" yields a
>> > good
>> > value most of the time (which is true IMO), we could make user
>> > space
>> > rely on it by default, and make it possible to set an udev property
>> > (e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
>> > differently. User‑space apps like multipath could check the
>> > ID_LEGACY
>> > property to determine whether or not reading the "wwid" attribute
>> > would
>> > be consistent with udev. That would simplify matters a lot for us
>> > (Ben,
>> > do you agree?), without the need of adding endless BLIST entries.
>> 
>> That's fine with me.
>> 
>> > AFAICT, no major distribution uses "wwid" for this purpose (yet).
>> 
>> We definitely have users that currently rely on wwid, although
>> probably
>> not through standard distro scripts.
>> 
>> > In a recent discussion with Hannes, the idea came up that the
>> > priority
>> > of "SCSI name string" designators should actually depend on their
>> > subtype. "naa." name strings should map to the respective NAA
>> > descriptors, and "eui.", likewise (only "iqn." descriptors have no
>> > binary counterpart; we thought they should rather be put below NAA,
>> > prio‑wise).
>> 
>> I like what NVMe did wrt. to exporting eui, nguid, uuid separately
>> from
>> the best‑effort wwid. That's why I suggested separate sysfs files for
>> the various page 0x83 descriptors. I like the idea of being able to
>> explicitly ask for an eui if that's what I need. But that appears to
>> be
>> somewhat orthogonal to your request.
>> 
>> > I wonder if you'd agree with a change made that way for "wwid". I
>> > suppose you don't. I'd then propose to add a new attribute
>> > following
>> > this logic. It could simply be an additional attribute with a
>> > different
>> > name. Or this new attribute could be a property of the block device
>> > rather than the SCSI device, like NVMe does it
>> > (/sys/block/nvme0n2/wwid).
>> 
>> That's fine. I am not a big fan of the idea that block/foo/wwid and
>> block/foo/device/wwid could end up being different. But I do think
>> that
>> from a userland tooling perspective the consistency with NVMe is more
>> important.
> 
> OK, then here's the plan: Change SCSI (block) device identification to
> work similar to NVMe (in addition to what we have now).
> 
>  1. add a new sysfs attribute for SCSI block devices as
> /sys/block/sd$X/wwid, the value derived similar to the current "wwid"
> SCSI device attribute, but using the same prio for SCSI name strings as
> for their binary counterparts, as described above.
> 
>  2. add "naa" and "eui" attributes, too, for user‑space applications
> that are interested in these specific attributes. 
> Fixme: should we differentiate between different "naa" or eui subtypes,
> like "naa_regext", "eui64" or similar? If the device defines multiple
> "naa" designators, which one should we choose?
> 
>  3. Change udev rules such that they primarily look at the attribute in
> 1.) on new installments, and introduce a variable ID_LEGACY to tell the
> rules to fall back to the current algorithm. I suppose it makes sense
> to have at least ID_VENDOR and ID_PRODUCT available when making this
> decision, so that it doesn't have to be a global setting on a given
> host.
> 
> While we're at it, I'd like to mention another issue: WWID changes.
> 
> This is a big problem for multipathd. The gist is that the device
> identification attributes in sysfs only change after rescanning the
> device. Thus if a user changes LUN assignments on a storage system, 
> it can happen that a direct INQUIRY returns a different WWID as in
> sysfs, which is fatal. If we plan to rely more on sysfs for device
> identification in the future, the problem gets worse. 

I think many devices rely on the fact that they are identified by
Vendor/model/serial_nr, because in most professional SAN storage systems you
can pre-set the serial number to a custom value; so if you want a new disk
(maybe a snapshot) to be compatible with the old one, just assign the same
serial number. I guess that's the idea behind.

> 
> I wonder if there's a chance that future kernels would automatically
> update the attributes if a corresponding UNIT ATTENTION condition such
> as INQUIRY DATA HAS CHANGED is received (*), or if we can find some
> other way to avoid data corruption resulting from writing to the wrong
> device.
> 
> Regards,
> Martin
> 
> (*) I've been told that WWID changes can happen even without receiving
> an UA. But in that case I'm inclined to put the blame on the storage.
> 
> ‑‑ 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
> 
> 
> _______________________________________________
> systemd‑devel mailing list
> systemd‑devel@lists.freedesktop.org 
> https://lists.freedesktop.org/mailman/listinfo/systemd‑devel 




--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-26 11:14               ` [dm-devel] " Ulrich Windl
@ 2021-04-26 13:16                 ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-26 13:16 UTC (permalink / raw)
  To: Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > 
> > 
> > While we're at it, I'd like to mention another issue: WWID changes.
> > 
> > This is a big problem for multipathd. The gist is that the device
> > identification attributes in sysfs only change after rescanning the
> > device. Thus if a user changes LUN assignments on a storage system,
> > it can happen that a direct INQUIRY returns a different WWID as in
> > sysfs, which is fatal. If we plan to rely more on sysfs for device
> > identification in the future, the problem gets worse. 
> 
> I think many devices rely on the fact that they are identified by
> Vendor/model/serial_nr, because in most professional SAN storage
> systems you
> can pre-set the serial number to a custom value; so if you want a new
> disk
> (maybe a snapshot) to be compatible with the old one, just assign the
> same
> serial number. I guess that's the idea behind.

What you are saying sounds dangerous to me. If a snapshot has the same
WWID as the device it's a snapshot of, it must not be exposed to any
host(s) at the same time with its origin, otherwise the host may
happily combine it with the origin into one multipath map, and data
corruption will almost certainly result. 

My argument is about how the host is supposed to deal with a WWID
change if it happens. Here, "WWID change" means that a given H:C:T:L
suddenly exposes different device designators than it used to, while
this device is in use by a host. Here, too, data corruption is
imminent, and can happen in a blink of an eye. To avoid this, several
things are needed:

 1) the host needs to get notified about the change (likely by an UA of
some sort)
 2) the kernel needs to react to the notification immediately, e.g. by
blocking IO to the device,
 3) userspace tooling such as udev or multipathd need to figure out how
to  how to deal with the situation cleanly, and eventually unblock it.

Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
afaics.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-26 13:16                 ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-26 13:16 UTC (permalink / raw)
  To: Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > 
> > 
> > While we're at it, I'd like to mention another issue: WWID changes.
> > 
> > This is a big problem for multipathd. The gist is that the device
> > identification attributes in sysfs only change after rescanning the
> > device. Thus if a user changes LUN assignments on a storage system,
> > it can happen that a direct INQUIRY returns a different WWID as in
> > sysfs, which is fatal. If we plan to rely more on sysfs for device
> > identification in the future, the problem gets worse. 
> 
> I think many devices rely on the fact that they are identified by
> Vendor/model/serial_nr, because in most professional SAN storage
> systems you
> can pre-set the serial number to a custom value; so if you want a new
> disk
> (maybe a snapshot) to be compatible with the old one, just assign the
> same
> serial number. I guess that's the idea behind.

What you are saying sounds dangerous to me. If a snapshot has the same
WWID as the device it's a snapshot of, it must not be exposed to any
host(s) at the same time with its origin, otherwise the host may
happily combine it with the origin into one multipath map, and data
corruption will almost certainly result. 

My argument is about how the host is supposed to deal with a WWID
change if it happens. Here, "WWID change" means that a given H:C:T:L
suddenly exposes different device designators than it used to, while
this device is in use by a host. Here, too, data corruption is
imminent, and can happen in a blink of an eye. To avoid this, several
things are needed:

 1) the host needs to get notified about the change (likely by an UA of
some sort)
 2) the kernel needs to react to the notification immediately, e.g. by
blocking IO to the device,
 3) userspace tooling such as udev or multipathd need to figure out how
to  how to deal with the situation cleanly, and eventually unblock it.

Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
afaics.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-26 13:16                 ` [dm-devel] " Martin Wilck
  (?)
@ 2021-04-27  3:48                 ` Erwin van Londen
  2021-04-27  7:02                     ` [dm-devel] Antw: [EXT] " Ulrich Windl
  2021-04-27  8:10                     ` Martin Wilck
  -1 siblings, 2 replies; 50+ messages in thread
From: Erwin van Londen @ 2021-04-27  3:48 UTC (permalink / raw)
  To: Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch


[-- Attachment #1.1: Type: text/plain, Size: 3029 bytes --]



On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > 
> > > While we're at it, I'd like to mention another issue: WWID
> > > changes.
> > > 
> > > This is a big problem for multipathd. The gist is that the device
> > > identification attributes in sysfs only change after rescanning
> > > the
> > > device. Thus if a user changes LUN assignments on a storage
> > > system,
> > > it can happen that a direct INQUIRY returns a different WWID as
> > > in
> > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > device
> > > identification in the future, the problem gets worse. 
> > 
> > I think many devices rely on the fact that they are identified by
> > Vendor/model/serial_nr, because in most professional SAN storage
> > systems you
> > can pre-set the serial number to a custom value; so if you want a
> > new
> > disk
> > (maybe a snapshot) to be compatible with the old one, just assign
> > the
> > same
> > serial number. I guess that's the idea behind.
> 
> What you are saying sounds dangerous to me. If a snapshot has the
> same
> WWID as the device it's a snapshot of, it must not be exposed to any
> host(s) at the same time with its origin, otherwise the host may
> happily combine it with the origin into one multipath map, and data
> corruption will almost certainly result. 
> 
> My argument is about how the host is supposed to deal with a WWID
> change if it happens. Here, "WWID change" means that a given H:C:T:L
> suddenly exposes different device designators than it used to, while
> this device is in use by a host. Here, too, data corruption is
> imminent, and can happen in a blink of an eye. To avoid this, several
> things are needed:
> 
>  1) the host needs to get notified about the change (likely by an UA
> of
> some sort)
>  2) the kernel needs to react to the notification immediately, e.g.
> by
> blocking IO to the device,
>  3) userspace tooling such as udev or multipathd need to figure out
> how
> to  how to deal with the situation cleanly, and eventually unblock
> it.
> 
> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> afaics.
> 
In my view the WWID should never change. If a snapshot is created it
should either obtain a new WWID. An example out of a Hitachi array is

Device Identification VPD page:
Addressed logical unit:
designator type: T10 vendor identification, code set: ASCII
vendor id: HITACHI 
vendor specific: 50403B050709
designator type: NAA, code set: Binary
0x60060e80123b050050403b0500000709

The majority of the naa wwid is tied to the storage subsystem and
identifies the vendor oui, model, serial etc. The last 4 in this
example indicate the LDEV ID (Sorry mainframe heritage here..). When a
snapshot is taken these 4 will change as a new LDEV ID is assigned to
the snapshot. This sort of behaviour should be consistent across all
storage vendors imho.

> Martin
> 

[-- Attachment #1.2: Type: text/html, Size: 4244 bytes --]

[-- Attachment #2: Type: text/plain, Size: 97 bytes --]

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27  3:48                 ` Erwin van Londen
@ 2021-04-27  7:02                     ` Ulrich Windl
  2021-04-27  8:10                     ` Martin Wilck
  1 sibling, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-27  7:02 UTC (permalink / raw)
  To: erwin, martin.petersen, martin.wilck
  Cc: dgilbert, jejb, systemd-devel, hch, dm-devel, hare, linux-scsi

>>> Erwin van Londen <erwin@erwinvanlonden.net> schrieb am 27.04.2021 um 05:48 in
Nachricht
<b5f288fb43bc79e0206794a901aef5b1761813de.camel@erwinvanlonden.net>:

> 
> On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
>> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
>> > > > 
>> > > 
>> > > While we're at it, I'd like to mention another issue: WWID
>> > > changes.
>> > > 
>> > > This is a big problem for multipathd. The gist is that the device
>> > > identification attributes in sysfs only change after rescanning
>> > > the
>> > > device. Thus if a user changes LUN assignments on a storage
>> > > system,
>> > > it can happen that a direct INQUIRY returns a different WWID as
>> > > in
>> > > sysfs, which is fatal. If we plan to rely more on sysfs for
>> > > device
>> > > identification in the future, the problem gets worse. 
>> > 
>> > I think many devices rely on the fact that they are identified by
>> > Vendor/model/serial_nr, because in most professional SAN storage
>> > systems you
>> > can pre-set the serial number to a custom value; so if you want a
>> > new
>> > disk
>> > (maybe a snapshot) to be compatible with the old one, just assign
>> > the
>> > same
>> > serial number. I guess that's the idea behind.
>> 
>> What you are saying sounds dangerous to me. If a snapshot has the
>> same
>> WWID as the device it's a snapshot of, it must not be exposed to any
>> host(s) at the same time with its origin, otherwise the host may
>> happily combine it with the origin into one multipath map, and data
>> corruption will almost certainly result. 
>> 
>> My argument is about how the host is supposed to deal with a WWID
>> change if it happens. Here, "WWID change" means that a given H:C:T:L
>> suddenly exposes different device designators than it used to, while
>> this device is in use by a host. Here, too, data corruption is
>> imminent, and can happen in a blink of an eye. To avoid this, several
>> things are needed:
>> 
>>  1) the host needs to get notified about the change (likely by an UA
>> of
>> some sort)
>>  2) the kernel needs to react to the notification immediately, e.g.
>> by
>> blocking IO to the device,
>>  3) userspace tooling such as udev or multipathd need to figure out
>> how
>> to  how to deal with the situation cleanly, and eventually unblock
>> it.
>> 
>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>> afaics.
>> 
> In my view the WWID should never change. If a snapshot is created it
> should either obtain a new WWID. An example out of a Hitachi array is
> 
> Device Identification VPD page:
> Addressed logical unit:
> designator type: T10 vendor identification, code set: ASCII
> vendor id: HITACHI 
> vendor specific: 50403B050709
> designator type: NAA, code set: Binary
> 0x60060e80123b050050403b0500000709
> 
> The majority of the naa wwid is tied to the storage subsystem and
> identifies the vendor oui, model, serial etc. The last 4 in this
> example indicate the LDEV ID (Sorry mainframe heritage here..). When a
> snapshot is taken these 4 will change as a new LDEV ID is assigned to
> the snapshot. This sort of behaviour should be consistent across all
> storage vendors imho.

It's getting off-topic, but in automatic desaster recovery scenarios one might want that the "new disk" (maybe a snapshot of the original disk before it got corrupted) looks like the "old disk", so that the OS can boot without needing any adjustments.

Regards,
Ulrich

> 
>> Martin
>> 





^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dm-devel] Antw: [EXT] Re: RFC: one more time: SCSI device identification
@ 2021-04-27  7:02                     ` Ulrich Windl
  0 siblings, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-27  7:02 UTC (permalink / raw)
  To: erwin, martin.petersen, martin.wilck
  Cc: hare, systemd-devel, linux-scsi, dm-devel, dgilbert, jejb, hch

>>> Erwin van Londen <erwin@erwinvanlonden.net> schrieb am 27.04.2021 um 05:48 in
Nachricht
<b5f288fb43bc79e0206794a901aef5b1761813de.camel@erwinvanlonden.net>:

> 
> On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
>> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
>> > > > 
>> > > 
>> > > While we're at it, I'd like to mention another issue: WWID
>> > > changes.
>> > > 
>> > > This is a big problem for multipathd. The gist is that the device
>> > > identification attributes in sysfs only change after rescanning
>> > > the
>> > > device. Thus if a user changes LUN assignments on a storage
>> > > system,
>> > > it can happen that a direct INQUIRY returns a different WWID as
>> > > in
>> > > sysfs, which is fatal. If we plan to rely more on sysfs for
>> > > device
>> > > identification in the future, the problem gets worse. 
>> > 
>> > I think many devices rely on the fact that they are identified by
>> > Vendor/model/serial_nr, because in most professional SAN storage
>> > systems you
>> > can pre-set the serial number to a custom value; so if you want a
>> > new
>> > disk
>> > (maybe a snapshot) to be compatible with the old one, just assign
>> > the
>> > same
>> > serial number. I guess that's the idea behind.
>> 
>> What you are saying sounds dangerous to me. If a snapshot has the
>> same
>> WWID as the device it's a snapshot of, it must not be exposed to any
>> host(s) at the same time with its origin, otherwise the host may
>> happily combine it with the origin into one multipath map, and data
>> corruption will almost certainly result. 
>> 
>> My argument is about how the host is supposed to deal with a WWID
>> change if it happens. Here, "WWID change" means that a given H:C:T:L
>> suddenly exposes different device designators than it used to, while
>> this device is in use by a host. Here, too, data corruption is
>> imminent, and can happen in a blink of an eye. To avoid this, several
>> things are needed:
>> 
>>  1) the host needs to get notified about the change (likely by an UA
>> of
>> some sort)
>>  2) the kernel needs to react to the notification immediately, e.g.
>> by
>> blocking IO to the device,
>>  3) userspace tooling such as udev or multipathd need to figure out
>> how
>> to  how to deal with the situation cleanly, and eventually unblock
>> it.
>> 
>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>> afaics.
>> 
> In my view the WWID should never change. If a snapshot is created it
> should either obtain a new WWID. An example out of a Hitachi array is
> 
> Device Identification VPD page:
> Addressed logical unit:
> designator type: T10 vendor identification, code set: ASCII
> vendor id: HITACHI 
> vendor specific: 50403B050709
> designator type: NAA, code set: Binary
> 0x60060e80123b050050403b0500000709
> 
> The majority of the naa wwid is tied to the storage subsystem and
> identifies the vendor oui, model, serial etc. The last 4 in this
> example indicate the LDEV ID (Sorry mainframe heritage here..). When a
> snapshot is taken these 4 will change as a new LDEV ID is assigned to
> the snapshot. This sort of behaviour should be consistent across all
> storage vendors imho.

It's getting off-topic, but in automatic desaster recovery scenarios one might want that the "new disk" (maybe a snapshot of the original disk before it got corrupted) looks like the "old disk", so that the OS can boot without needing any adjustments.

Regards,
Ulrich

> 
>> Martin
>> 





--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27  3:48                 ` Erwin van Londen
@ 2021-04-27  8:10                     ` Martin Wilck
  2021-04-27  8:10                     ` Martin Wilck
  1 sibling, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-27  8:10 UTC (permalink / raw)
  To: erwin, Ulrich.Windl, martin.petersen
  Cc: jejb, linux-scsi, dgilbert, dm-devel, systemd-devel,
	Hannes Reinecke, hch

On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > 
> > Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> > afaics.
> > 
> In my view the WWID should never change. 

In an ideal world, perhaps not. But in the dm-multipath realm, we know
that WWID changes can happen with certain storage arrays. See 
https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
and follow-ups, for example.

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-27  8:10                     ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-27  8:10 UTC (permalink / raw)
  To: erwin, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch

On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > 
> > Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> > afaics.
> > 
> In my view the WWID should never change. 

In an ideal world, perhaps not. But in the dm-multipath realm, we know
that WWID changes can happen with certain storage arrays. See 
https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
and follow-ups, for example.

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27  8:10                     ` Martin Wilck
@ 2021-04-27  8:21                       ` Hannes Reinecke
  -1 siblings, 0 replies; 50+ messages in thread
From: Hannes Reinecke @ 2021-04-27  8:21 UTC (permalink / raw)
  To: Martin Wilck, erwin, Ulrich.Windl, martin.petersen
  Cc: jejb, linux-scsi, dgilbert, dm-devel, systemd-devel,
	Hannes Reinecke, hch

On 4/27/21 10:10 AM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>
>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>> afaics.
>>>
>> In my view the WWID should never change. 
> 
> In an ideal world, perhaps not. But in the dm-multipath realm, we know
> that WWID changes can happen with certain storage arrays. See 
> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
> and follow-ups, for example.
> 
And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one into the same LUN number than the old one.
If we didn't do I/O during that interval upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.

The only way out of it is to do a rescan for every POR sense code, and
disable the device eg via DID_NO_CONNECT whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.

I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
is correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-27  8:21                       ` Hannes Reinecke
  0 siblings, 0 replies; 50+ messages in thread
From: Hannes Reinecke @ 2021-04-27  8:21 UTC (permalink / raw)
  To: Martin Wilck, erwin, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch

On 4/27/21 10:10 AM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>
>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>> afaics.
>>>
>> In my view the WWID should never change. 
> 
> In an ideal world, perhaps not. But in the dm-multipath realm, we know
> that WWID changes can happen with certain storage arrays. See 
> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
> and follow-ups, for example.
> 
And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one into the same LUN number than the old one.
If we didn't do I/O during that interval upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.

The only way out of it is to do a rescan for every POR sense code, and
disable the device eg via DID_NO_CONNECT whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.

I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
is correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27  8:21                       ` Hannes Reinecke
@ 2021-04-27 10:52                         ` Ulrich Windl
  -1 siblings, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-27 10:52 UTC (permalink / raw)
  To: erwin, martin.petersen, martin.wilck, hare
  Cc: dgilbert, jejb, systemd-devel, hch, dm-devel, hare, linux-scsi

>>> Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21 in Nachricht
<2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
> On 4/27/21 10:10 AM, Martin Wilck wrote:
>> On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
>>>>
>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>> afaics.
>>>>
>>> In my view the WWID should never change. 
>> 
>> In an ideal world, perhaps not. But in the dm‑multipath realm, we know
>> that WWID changes can happen with certain storage arrays. See 
>> https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html 
>> and follow‑ups, for example.
>> 
> And it's actually something which might happen quite easily.
> The storage array can unmap a LUN, delete it, create a new one, and map
> that one into the same LUN number than the old one.
> If we didn't do I/O during that interval upon the next I/O we will be
> getting the dreaded 'Power‑On/Reset' sense code.
> _And nothing else_, due to the arcane rules for sense code generation in
> SAM.
> But we end up with a completely different device.
> 
> The only way out of it is to do a rescan for every POR sense code, and
> disable the device eg via DID_NO_CONNECT whenever we find that the
> identification has changed. We already have a copy of the original VPD
> page 0x83 at hand, so that should be reasonably easy.

I don't know the depth of the SCSI or FC protocol, but storage systems
typically signal such events, maybe either via some unit attention or some FC
event. Older kernels logged that there was a change, but a manual SCSI bus scan
is needed, while newer kernels find new devices "automagically" for some
products. The HP EVA 6000 series wored that way, a 3PAR SotorServ 8000 series
also seems to work that way, but not Pure Storage X70 R3. FOr the latter you
need something like a FC LIP to make the kernel detect the new devices (LUNs).
I'm unsure where the problem is, but in principle the kernel can be
notified...

> 
> I had a rather lengthy discussion with Fred Knight @ NetApp about
> Power‑On/Reset handling, what with him complaining that we don't handle
> is correctly. So this really is something we should be looking into,
> even independently of multipathing.
> 
> But actually I like the idea from Martin Petersen to expose the parsed
> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
> from the udev rules.

Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there was a
kernel change regarding trailing blanks in VPD data. That change blew up
several configurations being unable to re-recognize the devices. In one case
the software even had bound a license to a specific device with serial number,
and that software found "new" devices while missing the "old" ones...

Regards,
Ulrich

> 
> Cheers,
> 
> Hannes
> ‑‑ 
> Dr. Hannes Reinecke		        Kernel Storage Architect
> hare@suse.de			               +49 911 74053 688
> SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> GF: F. Imendörffer, HRB 36809 (AG Nürnberg)




^ permalink raw reply	[flat|nested] 50+ messages in thread

* [dm-devel] Antw: [EXT] Re: RFC: one more time: SCSI device identification
@ 2021-04-27 10:52                         ` Ulrich Windl
  0 siblings, 0 replies; 50+ messages in thread
From: Ulrich Windl @ 2021-04-27 10:52 UTC (permalink / raw)
  To: erwin, martin.petersen, martin.wilck, hare
  Cc: hare, systemd-devel, linux-scsi, dm-devel, dgilbert, jejb, hch

>>> Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21 in Nachricht
<2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
> On 4/27/21 10:10 AM, Martin Wilck wrote:
>> On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
>>>>
>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>> afaics.
>>>>
>>> In my view the WWID should never change. 
>> 
>> In an ideal world, perhaps not. But in the dm‑multipath realm, we know
>> that WWID changes can happen with certain storage arrays. See 
>> https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html 
>> and follow‑ups, for example.
>> 
> And it's actually something which might happen quite easily.
> The storage array can unmap a LUN, delete it, create a new one, and map
> that one into the same LUN number than the old one.
> If we didn't do I/O during that interval upon the next I/O we will be
> getting the dreaded 'Power‑On/Reset' sense code.
> _And nothing else_, due to the arcane rules for sense code generation in
> SAM.
> But we end up with a completely different device.
> 
> The only way out of it is to do a rescan for every POR sense code, and
> disable the device eg via DID_NO_CONNECT whenever we find that the
> identification has changed. We already have a copy of the original VPD
> page 0x83 at hand, so that should be reasonably easy.

I don't know the depth of the SCSI or FC protocol, but storage systems
typically signal such events, maybe either via some unit attention or some FC
event. Older kernels logged that there was a change, but a manual SCSI bus scan
is needed, while newer kernels find new devices "automagically" for some
products. The HP EVA 6000 series wored that way, a 3PAR SotorServ 8000 series
also seems to work that way, but not Pure Storage X70 R3. FOr the latter you
need something like a FC LIP to make the kernel detect the new devices (LUNs).
I'm unsure where the problem is, but in principle the kernel can be
notified...

> 
> I had a rather lengthy discussion with Fred Knight @ NetApp about
> Power‑On/Reset handling, what with him complaining that we don't handle
> is correctly. So this really is something we should be looking into,
> even independently of multipathing.
> 
> But actually I like the idea from Martin Petersen to expose the parsed
> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
> from the udev rules.

Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there was a
kernel change regarding trailing blanks in VPD data. That change blew up
several configurations being unable to re-recognize the devices. In one case
the software even had bound a license to a specific device with serial number,
and that software found "new" devices while missing the "old" ones...

Regards,
Ulrich

> 
> Cheers,
> 
> Hannes
> ‑‑ 
> Dr. Hannes Reinecke		        Kernel Storage Architect
> hare@suse.de			               +49 911 74053 688
> SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> GF: F. Imendörffer, HRB 36809 (AG Nürnberg)




--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27 10:52                         ` [dm-devel] Antw: [EXT] " Ulrich Windl
@ 2021-04-27 20:04                           ` Ewan D. Milne
  -1 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:04 UTC (permalink / raw)
  To: Ulrich Windl, erwin, martin.petersen, martin.wilck, hare
  Cc: dgilbert, jejb, systemd-devel, hch, dm-devel, hare, linux-scsi

On Tue, 2021-04-27 at 12:52 +0200, Ulrich Windl wrote:
> > > > Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21
> > > > in Nachricht
> 
> <2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
> > On 4/27/21 10:10 AM, Martin Wilck wrote:
> > > On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
> > > > > 
> > > > > Wrt 1), we can only hope that it's the case. But 2) and 3)
> > > > > need work,
> > > > > afaics.
> > > > > 
> > > > 
> > > > In my view the WWID should never change. 
> > > 
> > > In an ideal world, perhaps not. But in the dm‑multipath realm, we
> > > know
> > > that WWID changes can happen with certain storage arrays. See 
> > > 
https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
> > >  
> > > and follow‑ups, for example.
> > > 
> > 
> > And it's actually something which might happen quite easily.
> > The storage array can unmap a LUN, delete it, create a new one, and
> > map
> > that one into the same LUN number than the old one.
> > If we didn't do I/O during that interval upon the next I/O we will
> > be
> > getting the dreaded 'Power‑On/Reset' sense code.
> > _And nothing else_, due to the arcane rules for sense code
> > generation in
> > SAM.
> > But we end up with a completely different device.
> > 
> > The only way out of it is to do a rescan for every POR sense code,
> > and
> > disable the device eg via DID_NO_CONNECT whenever we find that the
> > identification has changed. We already have a copy of the original
> > VPD
> > page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage
> systems
> typically signal such events, maybe either via some unit attention or
> some FC
> event. Older kernels logged that there was a change, but a manual
> SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for
> some
> products. The HP EVA 6000 series wored that way, a 3PAR SotorServ
> 8000 series
> also seems to work that way, but not Pure Storage X70 R3. FOr the
> latter you
> need something like a FC LIP to make the kernel detect the new
> devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...

There has to be some command on which the Unit Attention status
can be returned.  (In a multipath configuration, the path checker
commands may do this).  In absence of a command, there is no
asynchronous mechanism in SCSI to report the status.

On FC things related to finding a remote port will trigger a rescan.

-Ewan

> 
> > 
> > I had a rather lengthy discussion with Fred Knight @ NetApp about
> > Power‑On/Reset handling, what with him complaining that we don't
> > handle
> > is correctly. So this really is something we should be looking
> > into,
> > even independently of multipathing.
> > 
> > But actually I like the idea from Martin Petersen to expose the
> > parsed
> > VPD identifiers to sysfs; that would allow us to drop sg_inq
> > completely
> > from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there
> was a
> kernel change regarding trailing blanks in VPD data. That change blew
> up
> several configurations being unable to re-recognize the devices. In
> one case
> the software even had bound a license to a specific device with
> serial number,
> and that software found "new" devices while missing the "old" ones...
> 
> Regards,
> Ulrich
> 
> > 
> > Cheers,
> > 
> > Hannes
> > ‑‑ 
> > Dr. Hannes Reinecke		        Kernel Storage Architect
> > hare@suse.de			               +49 911 74053
> > 688
> > SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> > GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
> 
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] Antw: [EXT] Re: RFC: one more time: SCSI device identification
@ 2021-04-27 20:04                           ` Ewan D. Milne
  0 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:04 UTC (permalink / raw)
  To: Ulrich Windl, erwin, martin.petersen, martin.wilck, hare
  Cc: hare, systemd-devel, linux-scsi, dm-devel, dgilbert, jejb, hch

On Tue, 2021-04-27 at 12:52 +0200, Ulrich Windl wrote:
> > > > Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21
> > > > in Nachricht
> 
> <2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
> > On 4/27/21 10:10 AM, Martin Wilck wrote:
> > > On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
> > > > > 
> > > > > Wrt 1), we can only hope that it's the case. But 2) and 3)
> > > > > need work,
> > > > > afaics.
> > > > > 
> > > > 
> > > > In my view the WWID should never change. 
> > > 
> > > In an ideal world, perhaps not. But in the dm‑multipath realm, we
> > > know
> > > that WWID changes can happen with certain storage arrays. See 
> > > 
https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
> > >  
> > > and follow‑ups, for example.
> > > 
> > 
> > And it's actually something which might happen quite easily.
> > The storage array can unmap a LUN, delete it, create a new one, and
> > map
> > that one into the same LUN number than the old one.
> > If we didn't do I/O during that interval upon the next I/O we will
> > be
> > getting the dreaded 'Power‑On/Reset' sense code.
> > _And nothing else_, due to the arcane rules for sense code
> > generation in
> > SAM.
> > But we end up with a completely different device.
> > 
> > The only way out of it is to do a rescan for every POR sense code,
> > and
> > disable the device eg via DID_NO_CONNECT whenever we find that the
> > identification has changed. We already have a copy of the original
> > VPD
> > page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage
> systems
> typically signal such events, maybe either via some unit attention or
> some FC
> event. Older kernels logged that there was a change, but a manual
> SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for
> some
> products. The HP EVA 6000 series wored that way, a 3PAR SotorServ
> 8000 series
> also seems to work that way, but not Pure Storage X70 R3. FOr the
> latter you
> need something like a FC LIP to make the kernel detect the new
> devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...

There has to be some command on which the Unit Attention status
can be returned.  (In a multipath configuration, the path checker
commands may do this).  In absence of a command, there is no
asynchronous mechanism in SCSI to report the status.

On FC things related to finding a remote port will trigger a rescan.

-Ewan

> 
> > 
> > I had a rather lengthy discussion with Fred Knight @ NetApp about
> > Power‑On/Reset handling, what with him complaining that we don't
> > handle
> > is correctly. So this really is something we should be looking
> > into,
> > even independently of multipathing.
> > 
> > But actually I like the idea from Martin Petersen to expose the
> > parsed
> > VPD identifiers to sysfs; that would allow us to drop sg_inq
> > completely
> > from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there
> was a
> kernel change regarding trailing blanks in VPD data. That change blew
> up
> several configurations being unable to re-recognize the devices. In
> one case
> the software even had bound a license to a specific device with
> serial number,
> and that software found "new" devices while missing the "old" ones...
> 
> Regards,
> Ulrich
> 
> > 
> > Cheers,
> > 
> > Hannes
> > ‑‑ 
> > Dr. Hannes Reinecke		        Kernel Storage Architect
> > hare@suse.de			               +49 911 74053
> > 688
> > SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> > GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
> 
> 
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-26 13:16                 ` [dm-devel] " Martin Wilck
@ 2021-04-27 20:14                   ` Ewan D. Milne
  -1 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:14 UTC (permalink / raw)
  To: Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > 
> > > While we're at it, I'd like to mention another issue: WWID
> > > changes.
> > > 
> > > This is a big problem for multipathd. The gist is that the device
> > > identification attributes in sysfs only change after rescanning
> > > the
> > > device. Thus if a user changes LUN assignments on a storage
> > > system,
> > > it can happen that a direct INQUIRY returns a different WWID as
> > > in
> > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > device
> > > identification in the future, the problem gets worse. 
> > 
> > I think many devices rely on the fact that they are identified by
> > Vendor/model/serial_nr, because in most professional SAN storage
> > systems you
> > can pre-set the serial number to a custom value; so if you want a
> > new
> > disk
> > (maybe a snapshot) to be compatible with the old one, just assign
> > the
> > same
> > serial number. I guess that's the idea behind.
> 
> What you are saying sounds dangerous to me. If a snapshot has the
> same
> WWID as the device it's a snapshot of, it must not be exposed to any
> host(s) at the same time with its origin, otherwise the host may
> happily combine it with the origin into one multipath map, and data
> corruption will almost certainly result. 
> 
> My argument is about how the host is supposed to deal with a WWID
> change if it happens. Here, "WWID change" means that a given H:C:T:L
> suddenly exposes different device designators than it used to, while
> this device is in use by a host. Here, too, data corruption is
> imminent, and can happen in a blink of an eye. To avoid this, several
> things are needed:
> 
>  1) the host needs to get notified about the change (likely by an UA
> of
> some sort)
>  2) the kernel needs to react to the notification immediately, e.g.
> by
> blocking IO to the device,

There's no way to do that, in principle.  Because there could be
other I/Os in flight.  You might (somehow) avoid retrying an I/O
that got a UA until you figured out if something changed, but other
I/Os can already have been sent to the target, or issued before you
get to look at the status.

-Ewan

>  3) userspace tooling such as udev or multipathd need to figure out
> how
> to  how to deal with the situation cleanly, and eventually unblock
> it.
> 
> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> afaics.
> 
> Martin
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-27 20:14                   ` Ewan D. Milne
  0 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:14 UTC (permalink / raw)
  To: Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > 
> > > While we're at it, I'd like to mention another issue: WWID
> > > changes.
> > > 
> > > This is a big problem for multipathd. The gist is that the device
> > > identification attributes in sysfs only change after rescanning
> > > the
> > > device. Thus if a user changes LUN assignments on a storage
> > > system,
> > > it can happen that a direct INQUIRY returns a different WWID as
> > > in
> > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > device
> > > identification in the future, the problem gets worse. 
> > 
> > I think many devices rely on the fact that they are identified by
> > Vendor/model/serial_nr, because in most professional SAN storage
> > systems you
> > can pre-set the serial number to a custom value; so if you want a
> > new
> > disk
> > (maybe a snapshot) to be compatible with the old one, just assign
> > the
> > same
> > serial number. I guess that's the idea behind.
> 
> What you are saying sounds dangerous to me. If a snapshot has the
> same
> WWID as the device it's a snapshot of, it must not be exposed to any
> host(s) at the same time with its origin, otherwise the host may
> happily combine it with the origin into one multipath map, and data
> corruption will almost certainly result. 
> 
> My argument is about how the host is supposed to deal with a WWID
> change if it happens. Here, "WWID change" means that a given H:C:T:L
> suddenly exposes different device designators than it used to, while
> this device is in use by a host. Here, too, data corruption is
> imminent, and can happen in a blink of an eye. To avoid this, several
> things are needed:
> 
>  1) the host needs to get notified about the change (likely by an UA
> of
> some sort)
>  2) the kernel needs to react to the notification immediately, e.g.
> by
> blocking IO to the device,

There's no way to do that, in principle.  Because there could be
other I/Os in flight.  You might (somehow) avoid retrying an I/O
that got a UA until you figured out if something changed, but other
I/Os can already have been sent to the target, or issued before you
get to look at the status.

-Ewan

>  3) userspace tooling such as udev or multipathd need to figure out
> how
> to  how to deal with the situation cleanly, and eventually unblock
> it.
> 
> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> afaics.
> 
> Martin
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-27 20:14                   ` [dm-devel] " Ewan D. Milne
@ 2021-04-27 20:33                     ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-27 20:33 UTC (permalink / raw)
  To: emilne, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> > On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > > 
> > > > 
> > > > While we're at it, I'd like to mention another issue: WWID
> > > > changes.
> > > > 
> > > > This is a big problem for multipathd. The gist is that the
> > > > device
> > > > identification attributes in sysfs only change after rescanning
> > > > the
> > > > device. Thus if a user changes LUN assignments on a storage
> > > > system,
> > > > it can happen that a direct INQUIRY returns a different WWID as
> > > > in
> > > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > > device
> > > > identification in the future, the problem gets worse. 
> > > 
> > > I think many devices rely on the fact that they are identified by
> > > Vendor/model/serial_nr, because in most professional SAN storage
> > > systems you
> > > can pre-set the serial number to a custom value; so if you want a
> > > new
> > > disk
> > > (maybe a snapshot) to be compatible with the old one, just assign
> > > the
> > > same
> > > serial number. I guess that's the idea behind.
> > 
> > What you are saying sounds dangerous to me. If a snapshot has the
> > same
> > WWID as the device it's a snapshot of, it must not be exposed to
> > any
> > host(s) at the same time with its origin, otherwise the host may
> > happily combine it with the origin into one multipath map, and data
> > corruption will almost certainly result. 
> > 
> > My argument is about how the host is supposed to deal with a WWID
> > change if it happens. Here, "WWID change" means that a given
> > H:C:T:L
> > suddenly exposes different device designators than it used to,
> > while
> > this device is in use by a host. Here, too, data corruption is
> > imminent, and can happen in a blink of an eye. To avoid this,
> > several
> > things are needed:
> > 
> >  1) the host needs to get notified about the change (likely by an
> > UA
> > of
> > some sort)
> >  2) the kernel needs to react to the notification immediately, e.g.
> > by
> > blocking IO to the device,
> 
> There's no way to do that, in principle.  Because there could be
> other I/Os in flight.  You might (somehow) avoid retrying an I/O
> that got a UA until you figured out if something changed, but other
> I/Os can already have been sent to the target, or issued before you
> get to look at the status.

Right. But in practice, a WWID change will hardly happen under full IO
load. The storage side will probably have to block IO while this
happens, at least for a short time period. So blocking and quiescing
the queue upon an UA might still work, most of the time. Even if we
were too late already, the sooner we stop the queue, the better.

The current algorithm in multipath-tools needs to detect a path going
down and being reinstated. The time interval during which a WWID change
will go unnoticed is one or more path checker intervals, typically on
the order of 5-30 seconds. If we could decrease this interval to a sub-
second or even millisecond range by blocking the queue in the kernel
quickly, we'd have made a big step forward.

Regards
Martin


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-27 20:33                     ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-27 20:33 UTC (permalink / raw)
  To: emilne, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> > On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > > 
> > > > 
> > > > While we're at it, I'd like to mention another issue: WWID
> > > > changes.
> > > > 
> > > > This is a big problem for multipathd. The gist is that the
> > > > device
> > > > identification attributes in sysfs only change after rescanning
> > > > the
> > > > device. Thus if a user changes LUN assignments on a storage
> > > > system,
> > > > it can happen that a direct INQUIRY returns a different WWID as
> > > > in
> > > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > > device
> > > > identification in the future, the problem gets worse. 
> > > 
> > > I think many devices rely on the fact that they are identified by
> > > Vendor/model/serial_nr, because in most professional SAN storage
> > > systems you
> > > can pre-set the serial number to a custom value; so if you want a
> > > new
> > > disk
> > > (maybe a snapshot) to be compatible with the old one, just assign
> > > the
> > > same
> > > serial number. I guess that's the idea behind.
> > 
> > What you are saying sounds dangerous to me. If a snapshot has the
> > same
> > WWID as the device it's a snapshot of, it must not be exposed to
> > any
> > host(s) at the same time with its origin, otherwise the host may
> > happily combine it with the origin into one multipath map, and data
> > corruption will almost certainly result. 
> > 
> > My argument is about how the host is supposed to deal with a WWID
> > change if it happens. Here, "WWID change" means that a given
> > H:C:T:L
> > suddenly exposes different device designators than it used to,
> > while
> > this device is in use by a host. Here, too, data corruption is
> > imminent, and can happen in a blink of an eye. To avoid this,
> > several
> > things are needed:
> > 
> >  1) the host needs to get notified about the change (likely by an
> > UA
> > of
> > some sort)
> >  2) the kernel needs to react to the notification immediately, e.g.
> > by
> > blocking IO to the device,
> 
> There's no way to do that, in principle.  Because there could be
> other I/Os in flight.  You might (somehow) avoid retrying an I/O
> that got a UA until you figured out if something changed, but other
> I/Os can already have been sent to the target, or issued before you
> get to look at the status.

Right. But in practice, a WWID change will hardly happen under full IO
load. The storage side will probably have to block IO while this
happens, at least for a short time period. So blocking and quiescing
the queue upon an UA might still work, most of the time. Even if we
were too late already, the sooner we stop the queue, the better.

The current algorithm in multipath-tools needs to detect a path going
down and being reinstated. The time interval during which a WWID change
will go unnoticed is one or more path checker intervals, typically on
the order of 5-30 seconds. If we could decrease this interval to a sub-
second or even millisecond range by blocking the queue in the kernel
quickly, we'd have made a big step forward.

Regards
Martin


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: RFC: one more time: SCSI device identification
  2021-04-27 20:33                     ` [dm-devel] " Martin Wilck
@ 2021-04-27 20:41                       ` Ewan D. Milne
  -1 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:41 UTC (permalink / raw)
  To: Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > 
> > There's no way to do that, in principle.  Because there could be
> > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > that got a UA until you figured out if something changed, but other
> > I/Os can already have been sent to the target, or issued before you
> > get to look at the status.
> 
> Right. But in practice, a WWID change will hardly happen under full
> IO
> load. The storage side will probably have to block IO while this
> happens, at least for a short time period. So blocking and quiescing
> the queue upon an UA might still work, most of the time. Even if we
> were too late already, the sooner we stop the queue, the better.
> 
> The current algorithm in multipath-tools needs to detect a path going
> down and being reinstated. The time interval during which a WWID
> change
> will go unnoticed is one or more path checker intervals, typically on
> the order of 5-30 seconds. If we could decrease this interval to a
> sub-
> second or even millisecond range by blocking the queue in the kernel
> quickly, we'd have made a big step forward.

Yes, and in many situations this may help.  But in the general case
we can't protect against a storage array misconfiguration,
where something like this can happen.  So I worry about people
believing the host software will protect them against a mistake,
when we can't really do that.

All it takes is one I/O (a discard) to make a thorough mess of the LUN.

-Ewan

> 
> Regards
> Martin
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-27 20:41                       ` Ewan D. Milne
  0 siblings, 0 replies; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-27 20:41 UTC (permalink / raw)
  To: Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > 
> > There's no way to do that, in principle.  Because there could be
> > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > that got a UA until you figured out if something changed, but other
> > I/Os can already have been sent to the target, or issued before you
> > get to look at the status.
> 
> Right. But in practice, a WWID change will hardly happen under full
> IO
> load. The storage side will probably have to block IO while this
> happens, at least for a short time period. So blocking and quiescing
> the queue upon an UA might still work, most of the time. Even if we
> were too late already, the sooner we stop the queue, the better.
> 
> The current algorithm in multipath-tools needs to detect a path going
> down and being reinstated. The time interval during which a WWID
> change
> will go unnoticed is one or more path checker intervals, typically on
> the order of 5-30 seconds. If we could decrease this interval to a
> sub-
> second or even millisecond range by blocking the queue in the kernel
> quickly, we'd have made a big step forward.

Yes, and in many situations this may help.  But in the general case
we can't protect against a storage array misconfiguration,
where something like this can happen.  So I worry about people
believing the host software will protect them against a mistake,
when we can't really do that.

All it takes is one I/O (a discard) to make a thorough mess of the LUN.

-Ewan

> 
> Regards
> Martin
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27 20:41                       ` [dm-devel] " Ewan D. Milne
  (?)
@ 2021-04-28  0:09                       ` Erwin van Londen
  2021-04-30 23:44                         ` Ewan D. Milne
  -1 siblings, 1 reply; 50+ messages in thread
From: Erwin van Londen @ 2021-04-28  0:09 UTC (permalink / raw)
  To: Ewan D. Milne, Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch


[-- Attachment #1.1: Type: text/plain, Size: 2698 bytes --]



On Tue, 2021-04-27 at 16:41 -0400, Ewan D. Milne wrote:
> On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> > On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > > 
> > > There's no way to do that, in principle.  Because there could be
> > > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > > that got a UA until you figured out if something changed, but
> > > other
> > > I/Os can already have been sent to the target, or issued before
> > > you
> > > get to look at the status.

If something happens on a storage side where a lun gets it's attributes
changed (any, doesn't matter which one) a UA should be sent. Also all
outstanding IO's on that lun should be returning an Abort as it can no
longer warrant the validity of any IO due to these changes. Especially
when parameters are involved like reservations (PR's) etc. If that does
not happen from an array side all bets are off as the only way to be
able to get back in business is to start from scratch.

> > 
> > 
> > Right. But in practice, a WWID change will hardly happen under full
> > IO
> > load. The storage side will probably have to block IO while this
> > happens, at least for a short time period. So blocking and
> > quiescing
> > the queue upon an UA might still work, most of the time. Even if we
> > were too late already, the sooner we stop the queue, the better.

I think in most cases when something happens on an array side you will
see IO's being aborted. That might be a good time to start doing TUR's
and if these come back OK do a new inquiry. From a host side there is
only so much you can do.

> > 
> > The current algorithm in multipath-tools needs to detect a path
> > going
> > down and being reinstated. The time interval during which a WWID
> > change
> > will go unnoticed is one or more path checker intervals, typically
> > on
> > the order of 5-30 seconds. If we could decrease this interval to a
> > sub-
> > second or even millisecond range by blocking the queue in the
> > kernel
> > quickly, we'd have made a big step forward.
> 
> Yes, and in many situations this may help.  But in the general case
> we can't protect against a storage array misconfiguration,
> where something like this can happen.  So I worry about people
> believing the host software will protect them against a mistake,
> when we can't really do that.

My thought exactly. 

> 
> All it takes is one I/O (a discard) to make a thorough mess of the
> LUN.
> 
> -Ewan
> 
> > 
> > Regards
> > Martin
> > 
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
> 

[-- Attachment #1.2: Type: text/html, Size: 4518 bytes --]

[-- Attachment #2: Type: text/plain, Size: 97 bytes --]

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27  8:21                       ` Hannes Reinecke
  (?)
  (?)
@ 2021-04-28  1:01                       ` Erwin van Londen
  2021-04-28  6:34                           ` Martin Wilck
  -1 siblings, 1 reply; 50+ messages in thread
From: Erwin van Londen @ 2021-04-28  1:01 UTC (permalink / raw)
  To: Hannes Reinecke, Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch


[-- Attachment #1.1: Type: text/plain, Size: 4873 bytes --]



On Tue, 2021-04-27 at 10:21 +0200, Hannes Reinecke wrote:
> On 4/27/21 10:10 AM, Martin Wilck wrote:
> > On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > > > 
> > > > Wrt 1), we can only hope that it's the case. But 2) and 3) need
> > > > work,
> > > > afaics.
> > > > 
> > > In my view the WWID should never change. 
> > 
> > In an ideal world, perhaps not. But in the dm-multipath realm, we
> > know
> > that WWID changes can happen with certain storage arrays. See 
> > https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html
> >  
> > and follow-ups, for example.
> > 
> And it's actually something which might happen quite easily.
> The storage array can unmap a LUN, delete it, create a new one, and
> map
> that one into the same LUN number than the old one.
> If we didn't do I/O during that interval upon the next I/O we will be
> getting the dreaded 'Power-On/Reset' sense code.
> _And nothing else_, due to the arcane rules for sense code generation
> in
> SAM.
> But we end up with a completely different device.
> 
> The only way out of it is to do a rescan for every POR sense code,
> and
> disable the device eg via DID_NO_CONNECT whenever we find that the
> identification has changed. We already have a copy of the original
> VPD
> page 0x83 at hand, so that should be reasonably easy.

The way out of this is to chuck the array in the bin. As I mentioned in
one of my other emails when a scenario happens as you described above
and the array does not inform the initiator it goes against the SAM-5
standard.

That standard shows:
5.14 Unit attention conditions
5.14.1 Unit attention conditions that are not coalesced
Each logical unit shall establish a unit attention condition whenever
one of the following events occurs:
 a) a power on (see 6.3.1), hard reset (see 6.3.2), logical unit
reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss expected
(see 6.3.5) occurs;
 b) commands received on this I_T nexus have been cleared by a
command or a task management function associated with another I_T nexus
and the TAS bit was set to zero in the Control mode page associated
with this I_T nexus (see 5.6);
 c) the portion of the logical unit inventory that consists of
administrative logical units and hierarchical logical units has been
changed (see 4.6.18.1); or
 d) any other event requiring the attention of the SCSI
initiator device.

Especially the I_T nexus loss under a is an important trigger.

---
6.3.4 I_T nexus loss
An I_T nexus loss is a SCSI device condition resulting from:

a) a hard reset condition (see 6.3.2);
b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
event notification (see 6.4);
c) indication that an I_T NEXUS RESET task management request (see 7.6)
has been processed; or
d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has been
processed.
An I_T nexus loss event is an indication from the SCSI transport
protocol to the SAL that an I_T nexus no
longer exists. SCSI transport protocols may define I_T nexus loss
events.

Each SCSI transport protocol standard that defines I_T nexus loss
events should specify when those events
result in the delivery of a Nexus Loss event notification to the SAL.

The I_T nexus loss condition applies to both SCSI initiator devices and
SCSI target devices.

If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
event notification shall be delivered to
each logical unit to which the I_T nexus has access.

In response to an I_T nexus loss condition a logical unit shall take
the following actions:
a) abort all commands received on the I_T nexus as described in 5.6;
b) abort all background third-party copy operations (see SPC-4) that
are using the I_T nexus;
c) terminate all task management functions received on the I_T nexus;
d) clear all ACA conditions (see 5.9.5) associated with the I_T nexus;
e) establish a unit attention condition for the SCSI initiator port
associated with the I_T nexus (see 5.14
and 6.2); and
f) perform any additional functions required by the applicable command
standards.
---

This does also mean that any underlying transport protocol issues like
on FC or TCP for iSCSI will very often trigger aborted commands or UA's
as well which will be picked up by the kernel/respected drivers.

> 
> I had a rather lengthy discussion with Fred Knight @ NetApp about
> Power-On/Reset handling, what with him complaining that we don't
> handle
> is correctly. So this really is something we should be looking into,
> even independently of multipathing.
> 
> But actually I like the idea from Martin Petersen to expose the
> parsed
> VPD identifiers to sysfs; that would allow us to drop sg_inq
> completely
> from the udev rules.
> 
> Cheers,
> 
> Hannes

[-- Attachment #1.2: Type: text/html, Size: 6694 bytes --]

[-- Attachment #2: Type: text/plain, Size: 97 bytes --]

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [systemd-devel] RFC: one more time: SCSI device identification
  2021-04-27 20:41                       ` [dm-devel] " Ewan D. Milne
@ 2021-04-28  6:30                         ` Martin Wilck
  -1 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-28  6:30 UTC (permalink / raw)
  To: emilne, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, hch, dgilbert, dm-devel, linux-scsi, jejb,
	systemd-devel, bmarzins

On Tue, 2021-04-27 at 16:41 -0400, Ewan D. Milne wrote:
> On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> > On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > > 
> > > There's no way to do that, in principle.  Because there could be
> > > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > > that got a UA until you figured out if something changed, but other
> > > I/Os can already have been sent to the target, or issued before you
> > > get to look at the status.
> > 
> > Right. But in practice, a WWID change will hardly happen under full
> > IO
> > load. The storage side will probably have to block IO while this
> > happens, at least for a short time period. So blocking and quiescing
> > the queue upon an UA might still work, most of the time. Even if we
> > were too late already, the sooner we stop the queue, the better.
> > 
> > The current algorithm in multipath-tools needs to detect a path going
> > down and being reinstated. The time interval during which a WWID
> > change
> > will go unnoticed is one or more path checker intervals, typically on
> > the order of 5-30 seconds. If we could decrease this interval to a
> > sub-
> > second or even millisecond range by blocking the queue in the kernel
> > quickly, we'd have made a big step forward.
> 
> Yes, and in many situations this may help.  But in the general case
> we can't protect against a storage array misconfiguration,
> where something like this can happen.  So I worry about people
> believing the host software will protect them against a mistake,
> when we can't really do that.

I agree. I expressed a similar notion in the following thread about
multipathd's WWID change detection capabilities in the face of really
bad mistakes on the administrator's (or storage array's, FTM)  part:
https://listman.redhat.com/archives/dm-devel/2021-February/msg00248.html
But others stressed that nonetheless we should try our best to
avoid customer data corruption (which I agree with, too), and thus we
settled on the current algorithm, which suited the needs at least of
the affected user(s) in that specific case.

Personally I think that the current "5-30s" time period for WWID change
detection in multipathd is unsafe both theoretically and practially,
and may lure users into a false feeling of safety. Therefore I'd
strongly welcome a kernel-side solution that might still not be safe
theoretically, but cover most practical problem scenarios much better
than we currently do.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] [systemd-devel] RFC: one more time: SCSI device identification
@ 2021-04-28  6:30                         ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-28  6:30 UTC (permalink / raw)
  To: emilne, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch

On Tue, 2021-04-27 at 16:41 -0400, Ewan D. Milne wrote:
> On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> > On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > > 
> > > There's no way to do that, in principle.  Because there could be
> > > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > > that got a UA until you figured out if something changed, but other
> > > I/Os can already have been sent to the target, or issued before you
> > > get to look at the status.
> > 
> > Right. But in practice, a WWID change will hardly happen under full
> > IO
> > load. The storage side will probably have to block IO while this
> > happens, at least for a short time period. So blocking and quiescing
> > the queue upon an UA might still work, most of the time. Even if we
> > were too late already, the sooner we stop the queue, the better.
> > 
> > The current algorithm in multipath-tools needs to detect a path going
> > down and being reinstated. The time interval during which a WWID
> > change
> > will go unnoticed is one or more path checker intervals, typically on
> > the order of 5-30 seconds. If we could decrease this interval to a
> > sub-
> > second or even millisecond range by blocking the queue in the kernel
> > quickly, we'd have made a big step forward.
> 
> Yes, and in many situations this may help.  But in the general case
> we can't protect against a storage array misconfiguration,
> where something like this can happen.  So I worry about people
> believing the host software will protect them against a mistake,
> when we can't really do that.

I agree. I expressed a similar notion in the following thread about
multipathd's WWID change detection capabilities in the face of really
bad mistakes on the administrator's (or storage array's, FTM)  part:
https://listman.redhat.com/archives/dm-devel/2021-February/msg00248.html
But others stressed that nonetheless we should try our best to
avoid customer data corruption (which I agree with, too), and thus we
settled on the current algorithm, which suited the needs at least of
the affected user(s) in that specific case.

Personally I think that the current "5-30s" time period for WWID change
detection in multipathd is unsafe both theoretically and practially,
and may lure users into a false feeling of safety. Therefore I'd
strongly welcome a kernel-side solution that might still not be safe
theoretically, but cover most practical problem scenarios much better
than we currently do.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-28  1:01                       ` [dm-devel] " Erwin van Londen
@ 2021-04-28  6:34                           ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-28  6:34 UTC (permalink / raw)
  To: erwin, hare, Ulrich.Windl, martin.petersen
  Cc: jejb, linux-scsi, dgilbert, dm-devel, systemd-devel,
	Hannes Reinecke, hch

On Wed, 2021-04-28 at 11:01 +1000, Erwin van Londen wrote:
> 
> The way out of this is to chuck the array in the bin. As I mentioned
> in one of my other emails when a scenario happens as you described
> above and the array does not inform the initiator it goes against the
> SAM-5 standard.
> 
> That standard shows:
> 5.14 Unit attention conditions
> 5.14.1 Unit attention conditions that are not coalesced
> Each logical unit shall establish a unit attention condition whenever
> one of the following events occurs:
> 	a) a power on (see 6.3.1), hard reset (see 6.3.2), logical
> unit reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss
> expected (see 6.3.5) occurs;
> 	b) commands received on this I_T nexus have been cleared by
> a command or a task management function associated with another I_T
> nexus and the TAS bit was set to zero in the Control mode page
> associated with this I_T nexus (see 5.6);
> 	c) the portion of the logical unit inventory that consists
> of administrative logical units and hierarchical logical units has
> been changed (see 4.6.18.1); or
> 	d) any other event requiring the attention of the SCSI
> initiator device.
> 
> Especially the I_T nexus loss under a is an important trigger.
> 
> ---
> 6.3.4 I_T nexus loss
> An I_T nexus loss is a SCSI device condition resulting from:
> 
>  a) a hard reset condition (see 6.3.2);
>  b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
> event notification (see 6.4);
>  c) indication that an I_T NEXUS RESET task management request (see
> 7.6) has been processed; or
>  d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has
> been processed.
> An I_T nexus loss event is an indication from the SCSI transport
> protocol to the SAL that an I_T nexus no
> longer exists. SCSI transport protocols may define I_T nexus loss
> events.
> 
> Each SCSI transport protocol standard that defines I_T nexus loss
> events should specify when those events
> result in the delivery of a Nexus Loss event notification to the SAL.
> 
> The I_T nexus loss condition applies to both SCSI initiator devices
> and SCSI target devices.
> 
> If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
> event notification shall be delivered to
> each logical unit to which the I_T nexus has access.
> 
> In response to an I_T nexus loss condition a logical unit shall take
> the following actions:
> a) abort all commands received on the I_T nexus as described in 5.6;
> b) abort all background third-party copy operations (see SPC-4) that
> are using the I_T nexus;
> c) terminate all task management functions received on the I_T nexus;
> d) clear all ACA conditions (see 5.9.5) associated with the I_T
> nexus;
> e) establish a unit attention condition for the SCSI initiator port
> associated with the I_T nexus (see 5.14
> and 6.2); and
> f) perform any additional functions required by the applicable
> command standards.
> ---
> 
> This does also mean that any underlying transport protocol issues
> like on FC or TCP for iSCSI will very often trigger aborted commands
> or UA's as well which will be picked up by the kernel/respected
> drivers.

Thanks a lot. I'm not quite certain which of these paragraphs would
apply to the situation I had in mind (administrator remapping an
existing LUN on a storage array to a different volume). That scenario
wouldn't necessarily involve transport-level errors, or an I_T nexus
loss. 5.14.1 c) or d) might apply, is that what you meant?

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-28  6:34                           ` Martin Wilck
  0 siblings, 0 replies; 50+ messages in thread
From: Martin Wilck @ 2021-04-28  6:34 UTC (permalink / raw)
  To: erwin, hare, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch

On Wed, 2021-04-28 at 11:01 +1000, Erwin van Londen wrote:
> 
> The way out of this is to chuck the array in the bin. As I mentioned
> in one of my other emails when a scenario happens as you described
> above and the array does not inform the initiator it goes against the
> SAM-5 standard.
> 
> That standard shows:
> 5.14 Unit attention conditions
> 5.14.1 Unit attention conditions that are not coalesced
> Each logical unit shall establish a unit attention condition whenever
> one of the following events occurs:
> 	a) a power on (see 6.3.1), hard reset (see 6.3.2), logical
> unit reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss
> expected (see 6.3.5) occurs;
> 	b) commands received on this I_T nexus have been cleared by
> a command or a task management function associated with another I_T
> nexus and the TAS bit was set to zero in the Control mode page
> associated with this I_T nexus (see 5.6);
> 	c) the portion of the logical unit inventory that consists
> of administrative logical units and hierarchical logical units has
> been changed (see 4.6.18.1); or
> 	d) any other event requiring the attention of the SCSI
> initiator device.
> 
> Especially the I_T nexus loss under a is an important trigger.
> 
> ---
> 6.3.4 I_T nexus loss
> An I_T nexus loss is a SCSI device condition resulting from:
> 
>  a) a hard reset condition (see 6.3.2);
>  b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
> event notification (see 6.4);
>  c) indication that an I_T NEXUS RESET task management request (see
> 7.6) has been processed; or
>  d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has
> been processed.
> An I_T nexus loss event is an indication from the SCSI transport
> protocol to the SAL that an I_T nexus no
> longer exists. SCSI transport protocols may define I_T nexus loss
> events.
> 
> Each SCSI transport protocol standard that defines I_T nexus loss
> events should specify when those events
> result in the delivery of a Nexus Loss event notification to the SAL.
> 
> The I_T nexus loss condition applies to both SCSI initiator devices
> and SCSI target devices.
> 
> If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
> event notification shall be delivered to
> each logical unit to which the I_T nexus has access.
> 
> In response to an I_T nexus loss condition a logical unit shall take
> the following actions:
> a) abort all commands received on the I_T nexus as described in 5.6;
> b) abort all background third-party copy operations (see SPC-4) that
> are using the I_T nexus;
> c) terminate all task management functions received on the I_T nexus;
> d) clear all ACA conditions (see 5.9.5) associated with the I_T
> nexus;
> e) establish a unit attention condition for the SCSI initiator port
> associated with the I_T nexus (see 5.14
> and 6.2); and
> f) perform any additional functions required by the applicable
> command standards.
> ---
> 
> This does also mean that any underlying transport protocol issues
> like on FC or TCP for iSCSI will very often trigger aborted commands
> or UA's as well which will be picked up by the kernel/respected
> drivers.

Thanks a lot. I'm not quite certain which of these paragraphs would
apply to the situation I had in mind (administrator remapping an
existing LUN on a storage array to a different volume). That scenario
wouldn't necessarily involve transport-level errors, or an I_T nexus
loss. 5.14.1 c) or d) might apply, is that what you meant?

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-28  6:34                           ` Martin Wilck
@ 2021-04-29 14:47                             ` Erwin van Londen
  -1 siblings, 0 replies; 50+ messages in thread
From: Erwin van Londen @ 2021-04-29 14:47 UTC (permalink / raw)
  To: Martin Wilck, hare, Ulrich.Windl, martin.petersen
  Cc: jejb, linux-scsi, dgilbert, dm-devel, systemd-devel,
	Hannes Reinecke, hch



On Wed, 2021-04-28 at 06:34 +0000, Martin Wilck wrote:
> On Wed, 2021-04-28 at 11:01 +1000, Erwin van Londen wrote:
> > 
> > The way out of this is to chuck the array in the bin. As I mentioned
> > in one of my other emails when a scenario happens as you described
> > above and the array does not inform the initiator it goes against the
> > SAM-5 standard.
> > 
> > That standard shows:
> > 5.14 Unit attention conditions
> > 5.14.1 Unit attention conditions that are not coalesced
> > Each logical unit shall establish a unit attention condition whenever
> > one of the following events occurs:
> >         a) a power on (see 6.3.1), hard reset (see 6.3.2), logical
> > unit reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss
> > expected (see 6.3.5) occurs;
> >         b) commands received on this I_T nexus have been cleared by
> > a command or a task management function associated with another I_T
> > nexus and the TAS bit was set to zero in the Control mode page
> > associated with this I_T nexus (see 5.6);
> >         c) the portion of the logical unit inventory that consists
> > of administrative logical units and hierarchical logical units has
> > been changed (see 4.6.18.1); or
> >         d) any other event requiring the attention of the SCSI
> > initiator device.
> > 
> > Especially the I_T nexus loss under a is an important trigger.
> > 
> > ---
> > 6.3.4 I_T nexus loss
> > An I_T nexus loss is a SCSI device condition resulting from:
> > 
> >  a) a hard reset condition (see 6.3.2);
> >  b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
> > event notification (see 6.4);
> >  c) indication that an I_T NEXUS RESET task management request (see
> > 7.6) has been processed; or
> >  d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has
> > been processed.
> > An I_T nexus loss event is an indication from the SCSI transport
> > protocol to the SAL that an I_T nexus no
> > longer exists. SCSI transport protocols may define I_T nexus loss
> > events.
> > 
> > Each SCSI transport protocol standard that defines I_T nexus loss
> > events should specify when those events
> > result in the delivery of a Nexus Loss event notification to the SAL.
> > 
> > The I_T nexus loss condition applies to both SCSI initiator devices
> > and SCSI target devices.
> > 
> > If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
> > event notification shall be delivered to
> > each logical unit to which the I_T nexus has access.
> > 
> > In response to an I_T nexus loss condition a logical unit shall take
> > the following actions:
> > a) abort all commands received on the I_T nexus as described in 5.6;
> > b) abort all background third-party copy operations (see SPC-4) that
> > are using the I_T nexus;
> > c) terminate all task management functions received on the I_T nexus;
> > d) clear all ACA conditions (see 5.9.5) associated with the I_T
> > nexus;
> > e) establish a unit attention condition for the SCSI initiator port
> > associated with the I_T nexus (see 5.14
> > and 6.2); and
> > f) perform any additional functions required by the applicable
> > command standards.
> > ---
> > 
> > This does also mean that any underlying transport protocol issues
> > like on FC or TCP for iSCSI will very often trigger aborted commands
> > or UA's as well which will be picked up by the kernel/respected
> > drivers.
> 
> Thanks a lot. I'm not quite certain which of these paragraphs would
> apply to the situation I had in mind (administrator remapping an
> existing LUN on a storage array to a different volume). That scenario
> wouldn't necessarily involve transport-level errors, or an I_T nexus
> loss. 5.14.1 c) or d) might apply, is that what you meant?

I was indeed mostly referring to:

 	c) the portion of the logical unit inventory that consists
 of administrative logical units and hierarchical logical units has
 been changed (see 4.6.18.1); or
 	d) any other event requiring the attention of the SCSI
 initiator device.

The IT nexus status itself might not have changed but if an abstraction
layer representing a totally different set of data that would most
definitely fall under d. I think swapping between a volume and one of
its snapshots also falls under this 


> 
> Regards
> Martin
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-04-29 14:47                             ` Erwin van Londen
  0 siblings, 0 replies; 50+ messages in thread
From: Erwin van Londen @ 2021-04-29 14:47 UTC (permalink / raw)
  To: Martin Wilck, hare, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch



On Wed, 2021-04-28 at 06:34 +0000, Martin Wilck wrote:
> On Wed, 2021-04-28 at 11:01 +1000, Erwin van Londen wrote:
> > 
> > The way out of this is to chuck the array in the bin. As I mentioned
> > in one of my other emails when a scenario happens as you described
> > above and the array does not inform the initiator it goes against the
> > SAM-5 standard.
> > 
> > That standard shows:
> > 5.14 Unit attention conditions
> > 5.14.1 Unit attention conditions that are not coalesced
> > Each logical unit shall establish a unit attention condition whenever
> > one of the following events occurs:
> >         a) a power on (see 6.3.1), hard reset (see 6.3.2), logical
> > unit reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss
> > expected (see 6.3.5) occurs;
> >         b) commands received on this I_T nexus have been cleared by
> > a command or a task management function associated with another I_T
> > nexus and the TAS bit was set to zero in the Control mode page
> > associated with this I_T nexus (see 5.6);
> >         c) the portion of the logical unit inventory that consists
> > of administrative logical units and hierarchical logical units has
> > been changed (see 4.6.18.1); or
> >         d) any other event requiring the attention of the SCSI
> > initiator device.
> > 
> > Especially the I_T nexus loss under a is an important trigger.
> > 
> > ---
> > 6.3.4 I_T nexus loss
> > An I_T nexus loss is a SCSI device condition resulting from:
> > 
> >  a) a hard reset condition (see 6.3.2);
> >  b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
> > event notification (see 6.4);
> >  c) indication that an I_T NEXUS RESET task management request (see
> > 7.6) has been processed; or
> >  d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has
> > been processed.
> > An I_T nexus loss event is an indication from the SCSI transport
> > protocol to the SAL that an I_T nexus no
> > longer exists. SCSI transport protocols may define I_T nexus loss
> > events.
> > 
> > Each SCSI transport protocol standard that defines I_T nexus loss
> > events should specify when those events
> > result in the delivery of a Nexus Loss event notification to the SAL.
> > 
> > The I_T nexus loss condition applies to both SCSI initiator devices
> > and SCSI target devices.
> > 
> > If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
> > event notification shall be delivered to
> > each logical unit to which the I_T nexus has access.
> > 
> > In response to an I_T nexus loss condition a logical unit shall take
> > the following actions:
> > a) abort all commands received on the I_T nexus as described in 5.6;
> > b) abort all background third-party copy operations (see SPC-4) that
> > are using the I_T nexus;
> > c) terminate all task management functions received on the I_T nexus;
> > d) clear all ACA conditions (see 5.9.5) associated with the I_T
> > nexus;
> > e) establish a unit attention condition for the SCSI initiator port
> > associated with the I_T nexus (see 5.14
> > and 6.2); and
> > f) perform any additional functions required by the applicable
> > command standards.
> > ---
> > 
> > This does also mean that any underlying transport protocol issues
> > like on FC or TCP for iSCSI will very often trigger aborted commands
> > or UA's as well which will be picked up by the kernel/respected
> > drivers.
> 
> Thanks a lot. I'm not quite certain which of these paragraphs would
> apply to the situation I had in mind (administrator remapping an
> existing LUN on a storage array to a different volume). That scenario
> wouldn't necessarily involve transport-level errors, or an I_T nexus
> loss. 5.14.1 c) or d) might apply, is that what you meant?

I was indeed mostly referring to:

 	c) the portion of the logical unit inventory that consists
 of administrative logical units and hierarchical logical units has
 been changed (see 4.6.18.1); or
 	d) any other event requiring the attention of the SCSI
 initiator device.

The IT nexus status itself might not have changed but if an abstraction
layer representing a totally different set of data that would most
definitely fall under d. I think swapping between a volume and one of
its snapshots also falls under this 


> 
> Regards
> Martin
> 


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-28  0:09                       ` Erwin van Londen
@ 2021-04-30 23:44                         ` Ewan D. Milne
  2021-05-03  2:34                             ` Erwin van Londen
  0 siblings, 1 reply; 50+ messages in thread
From: Ewan D. Milne @ 2021-04-30 23:44 UTC (permalink / raw)
  To: Erwin van Londen, Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch


[-- Attachment #1.1: Type: text/plain, Size: 3709 bytes --]

On Wed, 2021-04-28 at 10:09 +1000, Erwin van Londen wrote:
> 
> On Tue, 2021-04-27 at 16:41 -0400, Ewan D. Milne wrote:
> > On Tue, 2021-04-27 at 20:33 +0000, Martin Wilck wrote:
> > > On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > > > There's no way to do that, in principle.  Because there could
> > > > be
> > > > other I/Os in flight.  You might (somehow) avoid retrying an
> > > > I/O
> > > > that got a UA until you figured out if something changed, but
> > > > other
> > > > I/Os can already have been sent to the target, or issued before
> > > > you
> > > > get to look at the status.
> 
> If something happens on a storage side where a lun gets it's
> attributes changed (any, doesn't matter which one) a UA should be
> sent. Also all outstanding IO's on that lun should be returning an
> Abort as it can no longer warrant the validity of any IO due to these
> changes. Especially when parameters are involved like reservations
> (PR's) etc. If that does not happen from an array side all bets are
> off as the only way to be able to get back in business is to start
> from scratch.

Perhaps an array might abort I/Os it has received in the Device Server
whensomething changes.  I have no idea if most or any arrays actually
do that.
But, what about I/O that has already been queued from the host to
thehost bus adapter?  I don't see how we can abort those I/Os
properly.Most high-performance HBAs have a queue of commands and a
queueof responses, there could be lots of commands queued before
wemanage to notice an interesting status.  And AFAIK there is no
conditionalmechanism that could hold them off (and, they could be in-
flight on thewire anyway).
I get what you are saying about what SAM describes, I just don't see
howwe can guarantee we don't send any further commands after the
statuswith the UA is sent back, before we can understand what happened.
-Ewan
> > > 
> > > Right. But in practice, a WWID change will hardly happen under
> > > full
> > > IO
> > > load. The storage side will probably have to block IO while this
> > > happens, at least for a short time period. So blocking and
> > > quiescing
> > > the queue upon an UA might still work, most of the time. Even if
> > > we
> > > were too late already, the sooner we stop the queue, the better.
> 
> I think in most cases when something happens on an array side you
> will see IO's being aborted. That might be a good time to start doing
> TUR's and if these come back OK do a new inquiry. From a host side
> there is only so much you can do.
> 
> > > The current algorithm in multipath-tools needs to detect a path
> > > going
> > > down and being reinstated. The time interval during which a WWID
> > > change
> > > will go unnoticed is one or more path checker intervals,
> > > typically on
> > > the order of 5-30 seconds. If we could decrease this interval to
> > > a
> > > sub-
> > > second or even millisecond range by blocking the queue in the
> > > kernel
> > > quickly, we'd have made a big step forward.
> > 
> > Yes, and in many situations this may help.  But in the general case
> > we can't protect against a storage array misconfiguration,
> > where something like this can happen.  So I worry about people
> > believing the host software will protect them against a mistake,
> > when we can't really do that.
> 
> My thought exactly. 
> 
> > All it takes is one I/O (a discard) to make a thorough mess of the
> > LUN.
> > 
> > -Ewan
> > 
> > > Regards
> > > Martin
> > > 
> > 
> > --
> > dm-devel mailing list
> > dm-devel@redhat.com
> > https://listman.redhat.com/mailman/listinfo/dm-devel
> > 
> 
> --dm-devel mailing listdm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel

[-- Attachment #1.2: Type: text/html, Size: 6104 bytes --]

[-- Attachment #2: Type: text/plain, Size: 97 bytes --]

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-30 23:44                         ` Ewan D. Milne
@ 2021-05-03  2:34                             ` Erwin van Londen
  0 siblings, 0 replies; 50+ messages in thread
From: Erwin van Londen @ 2021-05-03  2:34 UTC (permalink / raw)
  To: Ewan D. Milne, Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, systemd-devel, linux-scsi, dm-devel, dgilbert,
	jejb, hch



On Fri, 2021-04-30 at 19:44 -0400, Ewan D. Milne wrote:
> On Wed, 2021-04-28 at 10:09 +1000, Erwin van Londen wrote:
> > > > 
> 
> Perhaps an array might abort I/Os it has received in the Device
> Server when
> something changes. I have no idea if most or any arrays actually do
> that.
> 
> But, what about I/O that has already been queued from the host to the
> host bus adapter? I don't see how we can abort those I/Os properly.
> Most high-performance HBAs have a queue of commands and a queue
> of responses, there could be lots of commands queued before we
> manage to notice an interesting status. And AFAIK there is no
> conditional
> mechanism that could hold them off (and, they could be in-flight on
> the
> wire anyway).
> 
> I get what you are saying about what SAM describes, I just don't see
> how
> we can guarantee we don't send any further commands after the status
> with the UA is sent back, before we can understand what happened.
> 
> -Ewan

I agree there is only so much we can do especially when IO's have been
dispatched to hardware queues. I think if anything happens to those,
too bad, these ones will incur an abort or status check as well. These
would just need to be identified and subsequent IO's then sent to a
different path but that is a different topic. 

My primary concern is that if anything happens on a lun that changes
its attributes or access characteristics a UA should be sent in order
to inform a host. It cannot be that an array shuffles a lun id onto a
different physical volume without the host knowing. This will for sure
cause data corruption. 

> 
> > > > 
> > > 
> > > 
> > --
> > dm-devel mailing list


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] RFC: one more time: SCSI device identification
@ 2021-05-03  2:34                             ` Erwin van Londen
  0 siblings, 0 replies; 50+ messages in thread
From: Erwin van Londen @ 2021-05-03  2:34 UTC (permalink / raw)
  To: Ewan D. Milne, Martin Wilck, Ulrich.Windl, martin.petersen
  Cc: Hannes Reinecke, jejb, linux-scsi, dm-devel, dgilbert,
	systemd-devel, hch



On Fri, 2021-04-30 at 19:44 -0400, Ewan D. Milne wrote:
> On Wed, 2021-04-28 at 10:09 +1000, Erwin van Londen wrote:
> > > > 
> 
> Perhaps an array might abort I/Os it has received in the Device
> Server when
> something changes. I have no idea if most or any arrays actually do
> that.
> 
> But, what about I/O that has already been queued from the host to the
> host bus adapter? I don't see how we can abort those I/Os properly.
> Most high-performance HBAs have a queue of commands and a queue
> of responses, there could be lots of commands queued before we
> manage to notice an interesting status. And AFAIK there is no
> conditional
> mechanism that could hold them off (and, they could be in-flight on
> the
> wire anyway).
> 
> I get what you are saying about what SAM describes, I just don't see
> how
> we can guarantee we don't send any further commands after the status
> with the UA is sent back, before we can understand what happened.
> 
> -Ewan

I agree there is only so much we can do especially when IO's have been
dispatched to hardware queues. I think if anything happens to those,
too bad, these ones will incur an abort or status check as well. These
would just need to be identified and subsequent IO's then sent to a
different path but that is a different topic. 

My primary concern is that if anything happens on a lun that changes
its attributes or access characteristics a UA should be sent in order
to inform a host. It cannot be that an array shuffles a lun id onto a
different physical volume without the host knowing. This will for sure
cause data corruption. 

> 
> > > > 
> > > 
> > > 
> > --
> > dm-devel mailing list

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
  2021-04-27 10:52                         ` [dm-devel] Antw: [EXT] " Ulrich Windl
@ 2021-05-04  7:32                           ` Hannes Reinecke
  -1 siblings, 0 replies; 50+ messages in thread
From: Hannes Reinecke @ 2021-05-04  7:32 UTC (permalink / raw)
  To: Ulrich Windl, erwin, martin.petersen, martin.wilck
  Cc: dgilbert, jejb, systemd-devel, hch, dm-devel, hare, linux-scsi

On 4/27/21 12:52 PM, Ulrich Windl wrote:
>>>> Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21 in Nachricht
> <2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
>> On 4/27/21 10:10 AM, Martin Wilck wrote:
>>> On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
>>>>>
>>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>>> afaics.
>>>>>
>>>> In my view the WWID should never change.
>>>
>>> In an ideal world, perhaps not. But in the dm‑multipath realm, we know
>>> that WWID changes can happen with certain storage arrays. See
>>> https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
>>> and follow‑ups, for example.
>>>
>> And it's actually something which might happen quite easily.
>> The storage array can unmap a LUN, delete it, create a new one, and map
>> that one into the same LUN number than the old one.
>> If we didn't do I/O during that interval upon the next I/O we will be
>> getting the dreaded 'Power‑On/Reset' sense code.
>> _And nothing else_, due to the arcane rules for sense code generation in
>> SAM.
>> But we end up with a completely different device.
>>
>> The only way out of it is to do a rescan for every POR sense code, and
>> disable the device eg via DID_NO_CONNECT whenever we find that the
>> identification has changed. We already have a copy of the original VPD
>> page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage systems
> typically signal such events, maybe either via some unit attention or some FC
> event. Older kernels logged that there was a change, but a manual SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for some
> products. The HP EVA 6000 series wored that way, a 3PAR SotorServ 8000 series
> also seems to work that way, but not Pure Storage X70 R3. FOr the latter you
> need something like a FC LIP to make the kernel detect the new devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...
> 
My point was that while there _is_ a unit attention with the sense code 
'INQUIRY DATA CHANGED' (and that indeed will generate a kernel message), 
it might be obscured by a subsequent unit attention with the sense code 
'Power-On/Reset', as per SCSI spec the latter might cause the previous 
ones to _not_ being sent.
So from that reasoning we will need to rescan the device upon 
'Power-on/Reset'.
But 'Power-On/Reset' is a sense code which we also get during initial 
device scan, so the problem is that we will be triggering a rescan while 
_doing_ a rescan, and as such it would need some really careful testing.

As for the PureStorage behaviour: The problem with changing the LUN 
mapping on the array is that it we might not _have_ a device to send 
unit attentions to.
If the array already exports LUNs to some other hosts, it doesn't need 
to re-initialize the FC port when starting to export LUNs to _this_ 
host. And as _this_ host doesn't have a LUN on which unit attentions can 
be sent, _and_ the FC port is already registered, there are no events 
whatsoever which would cause the host to initiate a rescan.
To resolve that the array would need to induce eg an RSCN, but that will 
only be triggered if a FC port is (re-)registered.
Which is what HPe arrays do; initiate a link-bounce on the attached 
ports, which will cause the attached hosts to initiate a rescan.
Of course, _all_ hosts will need to rescan (and thereby causing an 
interruption even on unrelated hosts), which is why this is not done by 
all vendors.

>>
>> I had a rather lengthy discussion with Fred Knight @ NetApp about
>> Power‑On/Reset handling, what with him complaining that we don't handle
>> is correctly. So this really is something we should be looking into,
>> even independently of multipathing.
>>
>> But actually I like the idea from Martin Petersen to expose the parsed
>> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
>> from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there was a
> kernel change regarding trailing blanks in VPD data. That change blew up
> several configurations being unable to re-recognize the devices. In one case
> the software even had bound a license to a specific device with serial number,
> and that software found "new" devices while missing the "old" ones...
> 
That's probably just for VPD page 0x80. Page 0x83 has pretty strict 
rules on how the entries are formatted, so chopping off trailing blanks 
is not easily done there.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [dm-devel] Antw: [EXT] Re: RFC: one more time: SCSI device identification
@ 2021-05-04  7:32                           ` Hannes Reinecke
  0 siblings, 0 replies; 50+ messages in thread
From: Hannes Reinecke @ 2021-05-04  7:32 UTC (permalink / raw)
  To: Ulrich Windl, erwin, martin.petersen, martin.wilck
  Cc: hare, systemd-devel, linux-scsi, dm-devel, dgilbert, jejb, hch

On 4/27/21 12:52 PM, Ulrich Windl wrote:
>>>> Hannes Reinecke <hare@suse.de> schrieb am 27.04.2021 um 10:21 in Nachricht
> <2a6903e4-ff2b-67d5-e772-6971db8448fb@suse.de>:
>> On 4/27/21 10:10 AM, Martin Wilck wrote:
>>> On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
>>>>>
>>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>>> afaics.
>>>>>
>>>> In my view the WWID should never change.
>>>
>>> In an ideal world, perhaps not. But in the dm‑multipath realm, we know
>>> that WWID changes can happen with certain storage arrays. See
>>> https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
>>> and follow‑ups, for example.
>>>
>> And it's actually something which might happen quite easily.
>> The storage array can unmap a LUN, delete it, create a new one, and map
>> that one into the same LUN number than the old one.
>> If we didn't do I/O during that interval upon the next I/O we will be
>> getting the dreaded 'Power‑On/Reset' sense code.
>> _And nothing else_, due to the arcane rules for sense code generation in
>> SAM.
>> But we end up with a completely different device.
>>
>> The only way out of it is to do a rescan for every POR sense code, and
>> disable the device eg via DID_NO_CONNECT whenever we find that the
>> identification has changed. We already have a copy of the original VPD
>> page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage systems
> typically signal such events, maybe either via some unit attention or some FC
> event. Older kernels logged that there was a change, but a manual SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for some
> products. The HP EVA 6000 series wored that way, a 3PAR SotorServ 8000 series
> also seems to work that way, but not Pure Storage X70 R3. FOr the latter you
> need something like a FC LIP to make the kernel detect the new devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...
> 
My point was that while there _is_ a unit attention with the sense code 
'INQUIRY DATA CHANGED' (and that indeed will generate a kernel message), 
it might be obscured by a subsequent unit attention with the sense code 
'Power-On/Reset', as per SCSI spec the latter might cause the previous 
ones to _not_ being sent.
So from that reasoning we will need to rescan the device upon 
'Power-on/Reset'.
But 'Power-On/Reset' is a sense code which we also get during initial 
device scan, so the problem is that we will be triggering a rescan while 
_doing_ a rescan, and as such it would need some really careful testing.

As for the PureStorage behaviour: The problem with changing the LUN 
mapping on the array is that it we might not _have_ a device to send 
unit attentions to.
If the array already exports LUNs to some other hosts, it doesn't need 
to re-initialize the FC port when starting to export LUNs to _this_ 
host. And as _this_ host doesn't have a LUN on which unit attentions can 
be sent, _and_ the FC port is already registered, there are no events 
whatsoever which would cause the host to initiate a rescan.
To resolve that the array would need to induce eg an RSCN, but that will 
only be triggered if a FC port is (re-)registered.
Which is what HPe arrays do; initiate a link-bounce on the attached 
ports, which will cause the attached hosts to initiate a rescan.
Of course, _all_ hosts will need to rescan (and thereby causing an 
interruption even on unrelated hosts), which is why this is not done by 
all vendors.

>>
>> I had a rather lengthy discussion with Fred Knight @ NetApp about
>> Power‑On/Reset handling, what with him complaining that we don't handle
>> is correctly. So this really is something we should be looking into,
>> even independently of multipathing.
>>
>> But actually I like the idea from Martin Petersen to expose the parsed
>> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
>> from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there was a
> kernel change regarding trailing blanks in VPD data. That change blew up
> several configurations being unable to re-recognize the devices. In one case
> the software even had bound a license to a specific device with serial number,
> and that software found "new" devices while missing the "old" ones...
> 
That's probably just for VPD page 0x80. Page 0x83 has pretty strict 
rules on how the entries are formatted, so chopping off trailing blanks 
is not easily done there.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2021-05-04  7:32 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-29  9:58 RFC: one more time: SCSI device identification Martin Wilck
2021-03-29  9:58 ` [dm-devel] " Martin Wilck
2021-04-06  4:47 ` Martin K. Petersen
2021-04-06  4:47   ` [dm-devel] " Martin K. Petersen
2021-04-16 23:28   ` Martin Wilck
2021-04-16 23:28     ` [dm-devel] " Martin Wilck
2021-04-22  2:46     ` Martin K. Petersen
2021-04-22  2:46       ` [dm-devel] " Martin K. Petersen
2021-04-22  9:07       ` Martin Wilck
2021-04-22  9:07         ` [dm-devel] " Martin Wilck
2021-04-22 16:14         ` Benjamin Marzinski
2021-04-22 16:14           ` [dm-devel] " Benjamin Marzinski
2021-04-23  1:40         ` Martin K. Petersen
2021-04-23  1:40           ` [dm-devel] " Martin K. Petersen
2021-04-23 10:28           ` Martin Wilck
2021-04-23 10:28             ` [dm-devel] " Martin Wilck
2021-04-26 11:14             ` Antw: [EXT] Re: [systemd-devel] " Ulrich Windl
2021-04-26 11:14               ` [dm-devel] " Ulrich Windl
2021-04-26 13:16               ` Martin Wilck
2021-04-26 13:16                 ` [dm-devel] " Martin Wilck
2021-04-27  3:48                 ` Erwin van Londen
2021-04-27  7:02                   ` Antw: [EXT] " Ulrich Windl
2021-04-27  7:02                     ` [dm-devel] Antw: [EXT] " Ulrich Windl
2021-04-27  8:10                   ` [dm-devel] " Martin Wilck
2021-04-27  8:10                     ` Martin Wilck
2021-04-27  8:21                     ` Hannes Reinecke
2021-04-27  8:21                       ` Hannes Reinecke
2021-04-27 10:52                       ` Antw: [EXT] " Ulrich Windl
2021-04-27 10:52                         ` [dm-devel] Antw: [EXT] " Ulrich Windl
2021-04-27 20:04                         ` Antw: [EXT] Re: [dm-devel] " Ewan D. Milne
2021-04-27 20:04                           ` [dm-devel] Antw: [EXT] " Ewan D. Milne
2021-05-04  7:32                         ` Antw: [EXT] Re: [dm-devel] " Hannes Reinecke
2021-05-04  7:32                           ` [dm-devel] Antw: [EXT] " Hannes Reinecke
2021-04-28  1:01                       ` [dm-devel] " Erwin van Londen
2021-04-28  6:34                         ` Martin Wilck
2021-04-28  6:34                           ` Martin Wilck
2021-04-29 14:47                           ` Erwin van Londen
2021-04-29 14:47                             ` Erwin van Londen
2021-04-27 20:14                 ` Ewan D. Milne
2021-04-27 20:14                   ` [dm-devel] " Ewan D. Milne
2021-04-27 20:33                   ` Martin Wilck
2021-04-27 20:33                     ` [dm-devel] " Martin Wilck
2021-04-27 20:41                     ` Ewan D. Milne
2021-04-27 20:41                       ` [dm-devel] " Ewan D. Milne
2021-04-28  0:09                       ` Erwin van Londen
2021-04-30 23:44                         ` Ewan D. Milne
2021-05-03  2:34                           ` Erwin van Londen
2021-05-03  2:34                             ` Erwin van Londen
2021-04-28  6:30                       ` [systemd-devel] " Martin Wilck
2021-04-28  6:30                         ` [dm-devel] " Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.