From: Benjamin Block <bblock@linux.ibm.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, Steffen Maier <maier@linux.ibm.com>,
	Alexander Egorenkov <egorenar@linux.ibm.com>
Subject: Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
Date: Thu, 21 Jul 2022 13:25:55 +0000
Message-ID: <YtlT41EUrt2gDncP@t480-pf1aa2c2.fritz.box>
In-Reply-To: <yq1v8rrkxwv.fsf@ca-mkp.ca.oracle.com>

On Wed, Jul 20, 2022 at 10:29:06PM -0400, Martin K. Petersen wrote:
> > This is one of the oldest storage boxes we have right now, and this
> > regression doesn't seem to happen on newer models as far as I can
> > see.
> 
> I have not had much luck reproducing your results today despite
> reporting the same parameters in the VPD pages as your device.
> 
> I would appreciate it if you could send me the output of:
> 
> # grep . /sys/block/{sd,dm}*/queue/write_zeroes_max_bytes /sys/block/sd*/device/scsi_disk/*/max_write_same_blocks
> 
> for the failing configuration.

Here it is, again taken at commit
1bd95bb98f83 ("scsi: sd: Move WRITE_ZEROES configuration to a separate function"):

In case it looks strange: the device names keep changing between my
mails because these are CI systems that are reset and re-installed every
night, and I don't get the same system every day.

  # lsblk -s -o '+KNAME' /dev/mapper/mpathd1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS KNAME
  mpathd1  251:2    0  20G  0 part              dm-2
  └─mpathd 251:0    0  20G  0 mpath             dm-0
    ├─sda    8:0    0  20G  0 disk              sda
    └─sde    8:64   0  20G  0 disk              sde
  
  # grep . /sys/block/{sda,sde,dm-0,dm-2}/queue/write_zeroes_max_bytes /sys/block/{sda,sde}/device/scsi_disk/*/max_write_same_blocks
  /sys/block/sda/queue/write_zeroes_max_bytes:33553920
  /sys/block/sde/queue/write_zeroes_max_bytes:33553920
  /sys/block/dm-0/queue/write_zeroes_max_bytes:33553920
  /sys/block/dm-2/queue/write_zeroes_max_bytes:33553920
  /sys/block/sda/device/scsi_disk/0:0:0:1079787648/max_write_same_blocks:65535
  /sys/block/sde/device/scsi_disk/1:0:0:1079787648/max_write_same_blocks:65535
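
As a quick cross-check of the numbers above (a sketch only, not taken from
the kernel source): with the 512-byte logical block size that sd reports
for these disks in the dmesg output further down, 65535 blocks is exactly
the write_zeroes_max_bytes value shown for all four queues. The helper
name blocks_to_bytes is made up for illustration.

```python
# Cross-check: max_write_same_blocks * logical block size should match
# the write_zeroes_max_bytes value reported in sysfs above.
LOGICAL_BLOCK_SIZE = 512        # from dmesg: "512-byte logical blocks"
MAX_WRITE_SAME_BLOCKS = 65535   # sysfs max_write_same_blocks (0xffff)


def blocks_to_bytes(blocks, block_size=LOGICAL_BLOCK_SIZE):
    """Convert a block count into a byte count (hypothetical helper)."""
    return blocks * block_size


limit = blocks_to_bytes(MAX_WRITE_SAME_BLOCKS)
print(limit)
```

So the queue limit on the sd devices and on both dm devices corresponds
to the largest block count that fits into a 16-bit field.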
  
  # for fl in /sys/block/sda/device/scsi_disk/*/device/inquiry /sys/block/sda/device/scsi_disk/*/device/vpd_*; do (set -x; xxd "${fl}"); done
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/inquiry
  00000000: 0000 0532 9f10 1002 4942 4d20 2020 2020  ...2....IBM
  00000010: 3231 3037 3930 3020 2020 2020 2020 2020  2107900
  00000020: 3130 3630 3735 444c 3234 3138 3035 4320  106075DL241805C
  00000030: 2020 2020 2020 2020 0000 0060 0da0 0a00          ...`....
  00000040: 0300 0320 0000 0000 0000 0000 0000 0000  ... ............
  00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000060: 0101 3037 3500 3034 3731 3400 0000 8800  ..075.04714.....
  00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  000000a0: 0002 0800                                ....
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg0
  00000000: 0000 000a 0080 8386 b0b1 b2c0 c1c2       ..............
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg80
  00000000: 0080 0010 3735 444c 3234 3138 3035 4320  ....75DL241805C
  00000010: 2020 2020
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pg83
  00000000: 0083 0024 0103 0010 6005 0763 07ff c5e3  ...$....`..c....
  00000010: 0000 0000 0000 805c 0114 0004 0000 0101  .......\........
  00000020: 0115 0004 0000 0000                      ........
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb0
  00000000: 00b0 003c 0001 0000 0000 0000 0000 0000  ...<............
  00000010: 0000 0000 ffff ffff 0000 0000 0020 0000  ............. ..
  00000020: 8000 0000 0000 0000 0000 0000 0000 0000  ................
  00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb1
  00000000: 00b1 003c 1c20 0000 0000 0000 0000 0000  ...<. ..........
  00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  + xxd /sys/block/sda/device/scsi_disk/0:0:0:1079787648/device/vpd_pgb2
  00000000: 00b2 0004 1540 0000                      .....@..
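
For readers who do not want to decode the hex by hand, here is a minimal
sketch that turns the vpd_pgb0 and vpd_pgb2 dumps above back into bytes
and extracts the fields relevant to WRITE SAME and UNMAP. The byte
offsets follow the Block Limits and Logical Block Provisioning VPD page
layouts as I understand them from SBC; the function and key names are
made up for illustration and are not kernel identifiers.

```python
import struct

# xxd output for vpd_pgb0 and vpd_pgb2, copied from the dump above.
VPD_B0_XXD = """\
00000000: 00b0 003c 0001 0000 0000 0000 0000 0000  ...<............
00000010: 0000 0000 ffff ffff 0000 0000 0020 0000  ............. ..
00000020: 8000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
"""
VPD_B2_XXD = "00000000: 00b2 0004 1540 0000                      .....@.."


def xxd_to_bytes(dump):
    """Turn plain xxd output back into the bytes it represents."""
    data = bytearray()
    for line in dump.splitlines():
        if ":" not in line:
            continue
        # The hex groups sit between the offset and the ASCII column;
        # xxd separates the ASCII column with at least two spaces.
        hex_part = line.split(":", 1)[1].strip().split("  ", 1)[0]
        data += bytes.fromhex(hex_part)
    return bytes(data)


def decode_block_limits(page):
    """Decode selected fields of a Block Limits (B0h) VPD page."""
    if page[1] != 0xB0:
        raise ValueError("not a Block Limits VPD page")
    max_lba, max_desc, gran, align = struct.unpack_from(">4I", page, 20)
    (max_write_same,) = struct.unpack_from(">Q", page, 36)
    return {
        "max_unmap_lba_count": max_lba,
        "max_unmap_block_desc_count": max_desc,
        "optimal_unmap_granularity": gran,
        "ugavalid": bool(align & 0x80000000),
        "unmap_granularity_alignment": align & 0x7FFFFFFF,
        "max_write_same_length": max_write_same,
    }


def decode_lb_provisioning(page):
    """Decode the LBPU/LBPWS/LBPWS10 bits of a B2h VPD page."""
    if page[1] != 0xB2:
        raise ValueError("not a Logical Block Provisioning VPD page")
    flags = page[5]
    return {
        "lbpu": bool(flags & 0x80),
        "lbpws": bool(flags & 0x40),
        "lbpws10": bool(flags & 0x20),
    }


b0 = decode_block_limits(xxd_to_bytes(VPD_B0_XXD))
b2 = decode_lb_provisioning(xxd_to_bytes(VPD_B2_XXD))
print(b0)
print(b2)
```

For this dump the sketch yields a MAXIMUM WRITE SAME LENGTH of 0, a
MAXIMUM UNMAP LBA COUNT of 0xffffffff with a descriptor count of 0, an
optimal unmap granularity of 0x200000 blocks, and only LBPWS set in page
B2. Note that the page itself reports no WRITE SAME limit, while sysfs
shows max_write_same_blocks as 65535.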

Maybe this also helps: I took a snapshot of dmesg from the point where
the devices are sensed:

  ...
  [    3.228175] SCSI subsystem initialized
  ...
  [    3.942878] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
  [    3.942912] device-mapper: uevent: version 1.0.3
  [    3.942981] device-mapper: ioctl: 4.46.0-ioctl (2022-02-22) initialised: dm-devel@redhat.com
  ...
  [   31.563828] zfcp 0.0.1900: qdio: ZFCP on SC 15b using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W
  [   32.688192] scsi host0: scsi_eh_0: sleeping
  [   32.688238] scsi host0: zfcp
  ...
  [   32.725595] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
  [   32.725921] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
  [   32.725931] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 164
  [   32.726133] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
  [   32.726139] scsi 0:0:0:0: scsi scan: peripheral device type of 31, no device added
  [   32.726560] scsi 0:0:0:0: scsi scan: Sending REPORT LUNS to (try 0)
  [   32.727220] scsi 0:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0
  [   32.727222] scsi 0:0:0:0: scsi scan: REPORT LUN scan
  [   32.727475] scsi 0:0:0:1079787648: scsi scan: INQUIRY pass 1 length 36
  [   32.727717] scsi 0:0:0:1079787648: scsi scan: INQUIRY successful with code 0x0
  [   32.727723] scsi 0:0:0:1079787648: scsi scan: INQUIRY pass 2 length 164
  [   32.727927] scsi 0:0:0:1079787648: scsi scan: INQUIRY successful with code 0x0
  [   32.727935] scsi 0:0:0:1079787648: Direct-Access     IBM      2107900          1060 PQ: 0 ANSI: 5
  [   32.729982] scsi 0:0:0:1079787648: sg_alloc: dev=0
  [   32.730027] sd 0:0:0:1079787648: Attached scsi generic sg0 type 0
  ...
  [   32.730502] sd 0:0:0:1079787648: Power-on or device reset occurred
  [   32.730508] sd 0:0:0:1079787648: [sda] tag#2176 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
  [   32.730511] sd 0:0:0:1079787648: [sda] tag#2176 CDB: Test Unit Ready 00 00 00 00 00 00
  [   32.730514] sd 0:0:0:1079787648: [sda] tag#2176 Sense Key : Unit Attention [current]
  [   32.730516] sd 0:0:0:1079787648: [sda] tag#2176 Add. Sense: Power on, reset, or bus device reset occurred
  ...
  [   32.730774] sd 0:0:0:1079787648: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
  [   32.730928] sd 0:0:0:1079787648: [sda] Write Protect is off
  [   32.730930] sd 0:0:0:1079787648: [sda] Mode Sense: ed 00 00 08
  [   32.731243] sd 0:0:0:1079787648: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  [   32.731376] sd 0:0:0:1079787648: [sda] tag#3781 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [   32.731379] sd 0:0:0:1079787648: [sda] tag#3781 CDB: Report supported operation codes a3 0c 01 12 00 00 00 00 00 0a 00 00
  [   32.731381] sd 0:0:0:1079787648: [sda] tag#3781 Sense Key : Illegal Request [current]
  [   32.731383] sd 0:0:0:1079787648: [sda] tag#3781 Add. Sense: Invalid field in cdb
  ...
  [   32.734915] sd 0:0:0:1079787648: [sda] tag#3784 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [   32.734917] sd 0:0:0:1079787648: [sda] tag#3784 CDB: Inquiry 12 01 b9 00 04 00
  [   32.734919] sd 0:0:0:1079787648: [sda] tag#3784 Sense Key : Illegal Request [current]
  [   32.734921] sd 0:0:0:1079787648: [sda] tag#3784 Add. Sense: Invalid field in cdb
  ...
  [   32.737032]  sda: sda1
  ...
  [   32.737185] sd 0:0:0:1079787648: [sda] Attached SCSI disk
  ...
  [   37.294939] alua: device handler registered
  [   37.296390] emc: device handler registered
  [   37.297819] rdac: device handler registered
  [   37.351199] device-mapper: multipath service-time: version 0.3.0 loaded
  [   37.351424] sd 0:0:0:1079787648: alua: supports implicit TPGS
  [   37.351427] sd 0:0:0:1079787648: alua: device naa.6005076307ffc5e3000000000000805c port group 0 rel port 101
  ...
  [   37.378201] sd 0:0:0:1079787648: alua: transition timeout set to 60 seconds
  [   37.378204] sd 0:0:0:1079787648: alua: port group 00 state A preferred supports tolusnA
  ...

Thanks for the support, Martin.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /         Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
