From: Benjamin Block <bblock@linux.ibm.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, Steffen Maier <maier@linux.ibm.com>,
	Alexander Egorenkov <egorenar@linux.ibm.com>
Subject: Re: regression next-20220714: mkfs.ext4 on multipath device over scsi disks causes 'lifelock' in block layer
Date: Tue, 19 Jul 2022 11:37:47 +0000
Message-ID: <YtaXi23TBli7F8Pz@t480-pf1aa2c2.fritz.box>
In-Reply-To: <yq1edyhrgpi.fsf@ca-mkp.ca.oracle.com>

On Mon, Jul 18, 2022 at 10:23:26PM -0400, Martin K. Petersen wrote:
> Please send the output of:
> 
> # grep . /sys/block/sdN/queue/discard_* /sys/block/sdN/device/scsi_disk/*/*_mode
> # sg_readcap -l /dev/sdN
> # sg_vpg -p bl /dev/sdN
> # sg_vpg -p lbpv /dev/sdN
> 
> Ideally (for the grep) before and after the offending commit.

Sure,

I assume by `sg_vpg` you mean `sg_vpd`.

1bd95bb98f83 ("scsi: sd: Move WRITE_ZEROES configuration to a separate function")
---------------------------------------------------------------------------------

This is the first bad commit.

  # lsblk -s /dev/mapper/mpathe1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
  mpathe1  251:5    0  20G  0 part
  └─mpathe 251:2    0  20G  0 mpath
    ├─sde    8:64   0  20G  0 disk
    └─sdi    8:128  0  20G  0 disk
  
  # ll /dev/mapper/{mpathe1,mpathe}
  lrwxrwxrwx. 1 root root 7 Jul 19 12:52 /dev/mapper/mpathe -> ../dm-2
  lrwxrwxrwx. 1 root root 7 Jul 19 12:52 /dev/mapper/mpathe1 -> ../dm-5
  
  # lsblk -st /dev/mapper/mpathe1
  NAME     ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
  mpathe1          0    512      0     512     512    1                 128 128    0B
  └─mpathe         0    512      0     512     512    1 mq-deadline     256 128    0B
    ├─sde          0    512      0     512     512    1 bfq             256 512    0B
    └─sdi          0    512      0     512     512    1 bfq             256 512    0B
  
  # lsblk -sD /dev/mapper/mpathe1
  NAME     DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
  mpathe1         0        1G      32M         0
  └─mpathe        0        1G      32M         0
    ├─sde         0        1G      32M         0
    └─sdi         0        1G      32M         0
  
  # grep -H . /sys/block/{sde,sdi,dm-2,dm-5}/queue/discard_* /sys/block/{sde,sdi}/device/scsi_disk/*/*_mode
  /sys/block/sde/queue/discard_granularity:1073741824
  /sys/block/sde/queue/discard_max_bytes:33553920
  /sys/block/sde/queue/discard_max_hw_bytes:33553920
  /sys/block/sde/queue/discard_zeroes_data:0
  /sys/block/sdi/queue/discard_granularity:1073741824
  /sys/block/sdi/queue/discard_max_bytes:33553920
  /sys/block/sdi/queue/discard_max_hw_bytes:33553920
  /sys/block/sdi/queue/discard_zeroes_data:0
  /sys/block/dm-2/queue/discard_granularity:1073741824
  /sys/block/dm-2/queue/discard_max_bytes:33553920
  /sys/block/dm-2/queue/discard_max_hw_bytes:33553920
  /sys/block/dm-2/queue/discard_zeroes_data:0
  /sys/block/dm-5/queue/discard_granularity:1073741824
  /sys/block/dm-5/queue/discard_max_bytes:33553920
  /sys/block/dm-5/queue/discard_max_hw_bytes:33553920
  /sys/block/dm-5/queue/discard_zeroes_data:0
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/protection_mode:none
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/zeroing_mode:writesame_16_unmap
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/protection_mode:none
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sdi/device/scsi_disk/3:0:0:1083719810/zeroing_mode:writesame_16_unmap
  
  # sg_readcap -l /dev/sde
  Read Capacity results:
     Protection: prot_en=0, p_type=0, p_i_exponent=0
     Logical block provisioning: lbpme=1, lbprz=1
     Last LBA=41943039 (0x27fffff), Number of logical blocks=41943040
     Logical block length=512 bytes
     Logical blocks per physical block exponent=0
     Lowest aligned LBA=0
  Hence:
     Device size: 21474836480 bytes, 20480.0 MiB, 21.47 GB
  
  # dmesg | tail -n 5
  [  111.308428] sd 2:0:0:1083719810: [sde] tag#2053 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [  111.308438] sd 2:0:0:1083719810: [sde] tag#2053 CDB: Inquiry 12 01 b9 00 04 00
  [  111.308441] sd 2:0:0:1083719810: [sde] tag#2053 Sense Key : Illegal Request [current]
  [  111.308444] sd 2:0:0:1083719810: [sde] tag#2053 Add. Sense: Invalid field in cdb
  [  111.311099]  sde: sde1
  
  # sg_readcap -l /dev/sdi
  Read Capacity results:
     Protection: prot_en=0, p_type=0, p_i_exponent=0
     Logical block provisioning: lbpme=1, lbprz=1
     Last LBA=41943039 (0x27fffff), Number of logical blocks=41943040
     Logical block length=512 bytes
     Logical blocks per physical block exponent=0
     Lowest aligned LBA=0
  Hence:
     Device size: 21474836480 bytes, 20480.0 MiB, 21.47 GB
  
  # dmesg | tail -n 5
  [  125.621343] sd 3:0:0:1083719810: [sdi] tag#2325 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK cmd_age=0s
  [  125.621352] sd 3:0:0:1083719810: [sdi] tag#2325 CDB: Inquiry 12 01 b9 00 04 00
  [  125.621355] sd 3:0:0:1083719810: [sdi] tag#2325 Sense Key : Illegal Request [current]
  [  125.621358] sd 3:0:0:1083719810: [sdi] tag#2325 Add. Sense: Invalid field in cdb
  [  125.623898]  sdi: sdi1
  
  # sg_vpd -p bl /dev/sde
  Block limits VPD page (SBC):
    Write same non-zero (WSNZ): 0
    Maximum compare and write length: 1 blocks
    Optimal transfer length granularity: 0 blocks [not reported]
    Maximum transfer length: 0 blocks [not reported]
    Optimal transfer length: 0 blocks [not reported]
    Maximum prefetch transfer length: 0 blocks [ignored]
    Maximum unmap LBA count: -1 [unbounded]
    Maximum unmap block descriptor count: 0 [Unmap command not implemented]
    Optimal unmap granularity: 2097152 blocks
    Unmap granularity alignment valid: true
    Unmap granularity alignment: 0
    Maximum write same length: 0 blocks [not reported]
    Maximum atomic transfer length: 0 blocks [not reported]
    Atomic alignment: 0 [unaligned atomic writes permitted]
    Atomic transfer length granularity: 0 [no granularity requirement
    Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
    Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]
  
  # sg_vpd -p lbpv /dev/sde
  Logical block provisioning VPD page (SBC):
    Unmap command supported (LBPU): 0
    Write same (16) with unmap bit supported (LBPWS): 1
    Write same (10) with unmap bit supported (LBPWS10): 0
    Logical block provisioning read zeros (LBPRZ): 0
    Anchored LBAs supported (ANC_SUP): 0
    Threshold exponent: 21
    Descriptor present (DP): 0
    Minimum percentage: 0 [not reported]
    Provisioning type: 0 (not known or fully provisioned)
    Threshold percentage: 0 [percentages not supported]
  
  # sg_vpd -p bl /dev/sdi
  Block limits VPD page (SBC):
    Write same non-zero (WSNZ): 0
    Maximum compare and write length: 1 blocks
    Optimal transfer length granularity: 0 blocks [not reported]
    Maximum transfer length: 0 blocks [not reported]
    Optimal transfer length: 0 blocks [not reported]
    Maximum prefetch transfer length: 0 blocks [ignored]
    Maximum unmap LBA count: -1 [unbounded]
    Maximum unmap block descriptor count: 0 [Unmap command not implemented]
    Optimal unmap granularity: 2097152 blocks
    Unmap granularity alignment valid: true
    Unmap granularity alignment: 0
    Maximum write same length: 0 blocks [not reported]
    Maximum atomic transfer length: 0 blocks [not reported]
    Atomic alignment: 0 [unaligned atomic writes permitted]
    Atomic transfer length granularity: 0 [no granularity requirement
    Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
    Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]
  
  # sg_vpd -p lbpv /dev/sdi
  Logical block provisioning VPD page (SBC):
    Unmap command supported (LBPU): 0
    Write same (16) with unmap bit supported (LBPWS): 1
    Write same (10) with unmap bit supported (LBPWS10): 0
    Logical block provisioning read zeros (LBPRZ): 0
    Anchored LBAs supported (ANC_SUP): 0
    Threshold exponent: 21
    Descriptor present (DP): 0
    Minimum percentage: 0 [not reported]
    Provisioning type: 0 (not known or fully provisioned)
    Threshold percentage: 0 [percentages not supported]
  
  # mkfs.ext4 -F /dev/mapper/mpathe1
  ...
  [  307.192885] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  307.192892] device-mapper: multipath: 251:2: Failing path 8:128.
  [  307.192938] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  307.192941] device-mapper: multipath: 251:2: Failing path 8:64.
  [  311.548555] device-mapper: multipath: 251:2: Reinstating path 8:128.
  [  311.548883] device-mapper: multipath: 251:2: Reinstating path 8:64.
  [  311.562499] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  311.562521] device-mapper: multipath: 251:2: Failing path 8:128.
  [  311.562553] blk_insert_cloned_request: over max size limit. (4194304 > 65535)
  [  311.562557] device-mapper: multipath: 251:2: Failing path 8:64.
  ...

5be0f08e9d95 ("scsi: sd: Fix discard errors during revalidate")
---------------------------------------------------------------

This is the last good commit.

  # lsblk -s /dev/mapper/mpathe1
  NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
  mpathe1  251:6    0  20G  0 part
  └─mpathe 251:2    0  20G  0 mpath
    ├─sde    8:64   0  20G  0 disk
    └─sdf    8:80   0  20G  0 disk
  
  # ll /dev/mapper/{mpathe1,mpathe}
  lrwxrwxrwx. 1 root root 7 Jul 19 12:29 /dev/mapper/mpathe -> ../dm-2
  lrwxrwxrwx. 1 root root 7 Jul 19 12:37 /dev/mapper/mpathe1 -> ../dm-6
  
  # lsblk -st /dev/mapper/mpathe1
  NAME     ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
  mpathe1          0    512      0     512     512    1                 128 128    0B
  └─mpathe         0    512      0     512     512    1 mq-deadline     256 128    0B
    ├─sde          0    512      0     512     512    1 bfq             256 512    0B
    └─sdf          0    512      0     512     512    1 bfq             256 512    0B
  
  # lsblk -sD /dev/mapper/mpathe1
  NAME     DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
  mpathe1         0        1G       4G         0
  └─mpathe        0        1G       4G         0
    ├─sde         0        1G       4G         0
    └─sdf         0        1G       4G         0
  
  # grep -H . /sys/block/{sde,sdf,dm-2,dm-6}/queue/discard_* /sys/block/{sde,sdf}/device/scsi_disk/*/*_mode
  /sys/block/sde/queue/discard_granularity:1073741824
  /sys/block/sde/queue/discard_max_bytes:4294966784
  /sys/block/sde/queue/discard_max_hw_bytes:4294966784
  /sys/block/sde/queue/discard_zeroes_data:0
  /sys/block/sdf/queue/discard_granularity:1073741824
  /sys/block/sdf/queue/discard_max_bytes:4294966784
  /sys/block/sdf/queue/discard_max_hw_bytes:4294966784
  /sys/block/sdf/queue/discard_zeroes_data:0
  /sys/block/dm-2/queue/discard_granularity:1073741824
  /sys/block/dm-2/queue/discard_max_bytes:4294966784
  /sys/block/dm-2/queue/discard_max_hw_bytes:4294966784
  /sys/block/dm-2/queue/discard_zeroes_data:0
  /sys/block/dm-6/queue/discard_granularity:1073741824
  /sys/block/dm-6/queue/discard_max_bytes:4294966784
  /sys/block/dm-6/queue/discard_max_hw_bytes:4294966784
  /sys/block/dm-6/queue/discard_zeroes_data:0
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/protection_mode:none
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sde/device/scsi_disk/2:0:0:1083719810/zeroing_mode:writesame_16_unmap
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/protection_mode:none
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/provisioning_mode:writesame_16
  /sys/block/sdf/device/scsi_disk/3:0:0:1083719810/zeroing_mode:writesame_16_unmap
  
  # mkfs.ext4 -F /dev/mapper/mpathe1
  mke2fs 1.46.5 (30-Dec-2021)
  Discarding device blocks: done
  Creating filesystem with 5242368 4k blocks and 1310720 inodes
  Filesystem UUID: 5d0dc4c2-445c-4a90-aaa1-0998459497c5
  Superblock backups stored on blocks:
  	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
  	4096000
  
  Allocating group tables: done
  Writing inode tables: done
  Creating journal (32768 blocks): done
  Writing superblocks and filesystem accounting information: done
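
As a reading aid for the two sets of values above, the following sketch
converts the sysfs byte counts into 512-byte logical blocks (the block
length reported by sg_readcap). It assumes that both numbers in the
blk_insert_cloned_request message are counts of 512-byte sectors; that
interpretation, and the helper name to_sectors, are assumptions and not
part of the original report.

```python
SECTOR = 512  # logical block length in bytes, as reported by sg_readcap


def to_sectors(nbytes: int) -> int:
    """Convert a byte count to whole 512-byte sectors (hypothetical helper)."""
    if nbytes % SECTOR:
        raise ValueError(f"{nbytes} is not a multiple of {SECTOR}")
    return nbytes // SECTOR


# discard_max_bytes as shown by sysfs for each kernel
bad_discard_max = to_sectors(33553920)     # first bad commit  -> 65535
good_discard_max = to_sectors(4294966784)  # last good commit  -> 8388607

# discard_granularity is identical on both kernels
granularity = to_sectors(1073741824)       # -> 2097152

# Values from "over max size limit. (4194304 > 65535)";
# ASSUMPTION: both are in 512-byte sectors.
request_sectors = 4194304
limit_sectors = 65535

print("bad  discard_max (sectors):", bad_discard_max)
print("good discard_max (sectors):", good_discard_max)
print("granularity      (sectors):", granularity)
print("request fits bad limit:    ", request_sectors <= bad_discard_max)
print("request fits good limit:   ", request_sectors <= good_discard_max)
```

Under that assumption, the limit of 65535 in the log message equals the
discard_max_bytes value seen on the first bad commit, the rejected
request (4194304 sectors, i.e. 2 GiB) would have fit within the
8388607-sector limit of the last good commit, and the granularity of
2097152 blocks matches the "Optimal unmap granularity" in the Block
limits VPD page.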

This is an IBM DS8870 (first announced in 2012):
https://www.ibm.com/common/ssi/rep_sm/4/877/ENUS2424-_h04/index.html

This is one of the oldest storage boxes we have right now, and as far
as I can see this regression doesn't happen on newer models.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /         Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
