Linux-Block Archive on lore.kernel.org
From: Alibek Amaev <alibek.a@gmail.com>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: Block device naming
Date: Fri, 17 May 2019 14:24:24 +0300
Message-ID: <CA+TYKz14Y2D2as1xetEAL2C4tD8fuSmoz7dYnm8HxNpLZR5skQ@mail.gmail.com>
In-Reply-To: <680c9440-f1c9-68d1-cc5a-17aa9071fcc1@suse.de>

I understand that a block device name changing between reboots does not
matter in itself.
But as I understand it, in these cases the HCTL (Host:Channel:Target:LUN)
order of the devices changed as well. Unfortunately, I did not capture
the HCTL order before the failure, so I cannot provide evidence. Going
from memory, though, the HCTL order before the failure was different in
all the cases presented.
This is indirectly confirmed by how zpool status reports the state of
the pool, which seems to depend on how the device was added (by scsi-id
or by wwn-id).
With the device added by scsi-id (the case where dmesg had messages
about device changes), the failure was shown as follows:
---
# zpool status
  pool: pool
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: scrub repaired 0 in 1h39m with 0 errors on Sun Oct  8 02:03:34 2017
config:

    NAME                                      STATE     READ WRITE CKSUM
    pool                                      UNAVAIL      0     0     0  insufficient replicas
      scsi-3600144f0c7a5bc61000058d3b96d001d  FAULTED      3     0     0  too many errors

errors: 51 data errors, use '-v' for a list
---
Whereas in the normal state zpool status shows:
---
# zpool status
  pool: pool
 state: ONLINE
  scan: scrub repaired 0 in 1h39m with 0 errors on Sun Oct  8 02:03:34 2017
config:

    NAME                                      STATE     READ WRITE CKSUM
    pool                                      ONLINE       0     0     0
      scsi-3600144f0c7a5bc61000058d3b96d001d  ONLINE       0     0     0

errors: No known data errors
---
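
Since the HCTL order was not captured before the failure, here is a
minimal sketch of how it could be recorded ahead of time. It assumes the
usual sysfs layout, where /sys/block/sdX/device resolves to a directory
named after the H:C:T:L address, and the udev links in /dev/disk/by-id.
The function names snapshot_by_id and changed are hypothetical, and the
directory parameters exist only so the sketch can be pointed at a test
tree.

```python
import os


def snapshot_by_id(sys_block="/sys/block", by_id="/dev/disk/by-id"):
    """Map each by-id link name to (kernel_name, hctl) for sd* disks."""
    hctl_of = {}
    for name in os.listdir(sys_block):
        device_link = os.path.join(sys_block, name, "device")
        if name.startswith("sd") and os.path.exists(device_link):
            # /sys/block/sdX/device resolves to a directory whose last
            # path component is the H:C:T:L address of the SCSI device.
            hctl_of[name] = os.path.basename(os.path.realpath(device_link))
    snapshot = {}
    for link in os.listdir(by_id):
        # by-id entries are symlinks such as wwn-... -> ../../sdc
        target = os.path.realpath(os.path.join(by_id, link))
        kernel_name = os.path.basename(target)
        if kernel_name in hctl_of:
            snapshot[link] = (kernel_name, hctl_of[kernel_name])
    return snapshot


def changed(before, after):
    """Return by-id names whose (kernel_name, hctl) differs."""
    return {
        link: (before[link], after[link])
        for link in before.keys() & after.keys()
        if before[link] != after[link]
    }
```

A snapshot taken while the system is healthy can be kept and passed to
changed() together with one taken after a failure. Keying on the by-id
name rather than on sdX keeps the comparison meaningful even when the
kernel name moves.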

In the other case, when the LUN is imported by wwn-id (and there are no
errors in dmesg), zpool status in the error state is:
---
# zpool status
  pool: pool1
 state: ONLINE
  scan: scrub repaired 0B in 17h30m with 0 errors on Sun Apr 14 17:54:55 2019
config:

    NAME                                      STATE     READ WRITE CKSUM
    pool1                                     ONLINE       0     0     0
      sdc                                     ONLINE       0     0     0

errors: No known data errors
---
The status reports no errors, but it shows the block device name from
/dev. In the normal state, by contrast, zpool status shows the wwn-id
from /dev/disk/by-id instead of the device name from /dev:
---
root@lpr11a:~# zpool status
  pool: pool1
 state: ONLINE
  scan: scrub repaired 0B in 17h30m with 0 errors on Sun Apr 14 17:54:55 2019
config:

    NAME                                      STATE     READ WRITE CKSUM
    pool1                                     ONLINE       0     0     0
      wwn-0x600144f0b49c14d100005b7af8ee001c  ONLINE       0     0     0

errors: No known data errors
---
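
The difference between the two outputs above (a bare sdX name versus a
by-id name) can be checked mechanically. This is only a rough sketch
that assumes the plain-text layout of zpool status shown in this
message. The function name vdevs_by_kernel_name is hypothetical, and
the pattern only looks for sd* names, as in this thread.

```python
import re

# Matches bare SCSI disk kernel names such as "sdc" or "sdc1"
# (assumption: only sd* devices matter here).
KERNEL_NAME = re.compile(r"^sd[a-z]+\d*$")


def vdevs_by_kernel_name(status_text):
    """Return vdev names from the config section that are bare sdX names."""
    found = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "config:":
            in_config = True
        elif fields[0] == "errors:":
            in_config = False
        elif in_config and fields[0] != "NAME" and KERNEL_NAME.match(fields[0]):
            found.append(fields[0])
    return found
```

Fed the error-state output for pool1 it returns ['sdc']; fed the
normal-state output it returns an empty list.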

P.S. I would also like to note that the name /dev/disk does not reflect
reality: SSDs are not disks.

On Thu, May 16, 2019 at 5:07 PM Hannes Reinecke <hare@suse.de> wrote:
>
> On 5/16/19 3:49 PM, Alibek Amaev wrote:
> > I have another real-world example:
> > In Aug 2018 I brought up a server with storage attached over FC from
> > a ZS3 and a ZS5 (Oracle ZFS Storage Appliances, not NetApp, which
> > also export space as LUNs). The server uses one LUN from the ZS5.
> > Recently all I/O to this exported LUN stopped on the server and
> > iowait kept growing. dmesg showed no errors or FC-related changes,
> > and there were no errors in /var/log/kern.log* or
> > /var/log/syslog.log*, no throttling, no EDAC errors, nothing else.
> > Before the reboot I saw:
> > wwn-0x600144f0b49c14d100005b7af8ee001c -> ../../sdc
> > I tried running partprobe and copying some data from this block
> > device to /dev/null with dd; neither operation finished because I/O
> > was blocked.
> > After the reboot I saw:
> > wwn-0x600144f0b49c14d100005b7af8ee001c -> ../../sdd
> > And the server runs fine.
> >
> > I also have a LUN exported from the storage in shared mode and
> > accessible to all servers over FC. This LUN is currently not needed,
> > but now I doubt it is possible to remove it safely...
> >
> It's all a bit conjecture at this point.
> 'sdc' could show up as 'sdd' after the next reboot, with no
> side-effects whatsoever.
> At the same time, 'sdc' could have been blocked by a host of reasons,
> none of which are related to the additional device being exported.
>
> It doesn't really look like an issue with device naming; you would need
> to do proper investigation on your server to figure out why I/O stopped.
> Device renaming is typically the least likely cause here.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                Teamlead Storage & Networking
> hare@suse.de                                   +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
> HRB 21284 (AG Nürnberg)

