* [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
@ 2023-10-17 20:47 Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 01/14] fs: Move enum rw_hint into a new header file Bart Van Assche
                   ` (14 more replies)
  0 siblings, 15 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche

Hi Jens,

UFS vendors need data lifetime information to achieve good performance.
Without this information, garbage collection causes significantly higher
write amplification. Hence this patch series, which adds support for data
lifetime information to F2FS and to the block layer. The SCSI disk (sd)
driver is modified so that it passes write hint information to SCSI devices
via the GROUP NUMBER field.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v2:
- Instead of storing data lifetime information in bi_ioprio, introduce the
  new struct bio member bi_lifetime and also the struct request member
  'lifetime'.
- Removed the bio_set_data_lifetime() and bio_get_data_lifetime() functions
  and replaced these with direct assignments.
- Dropped all changes related to I/O priority.
- Improved patch descriptions.

Changes compared to v1:
- Use six bits from the ioprio field for data lifetime information. The
  bio->bi_write_hint / req->write_hint / iocb->ki_hint members that were
  introduced in v1 have been removed again.
- The F_GET_FILE_RW_HINT and F_SET_FILE_RW_HINT fcntls have been removed.
- In the SCSI disk (sd) driver, query the stream status and check the PERM bit.
- The GET STREAM STATUS command has been implemented in the scsi_debug driver.

Bart Van Assche (14):
  fs: Move enum rw_hint into a new header file
  block: Restore data lifetime support in struct bio and struct request
  fs: Restore write hint support
  fs/f2fs: Restore data lifetime support
  scsi: core: Query the Block Limits Extension VPD page
  scsi_proto: Add structures and constants related to I/O groups and
    streams
  sd: Translate data lifetime information
  scsi_debug: Reduce code duplication
  scsi_debug: Support the block limits extension VPD page
  scsi_debug: Rework page code error handling
  scsi_debug: Rework subpage code error handling
  scsi_debug: Implement the IO Advice Hints Grouping mode page
  scsi_debug: Implement GET STREAM STATUS
  scsi_debug: Maintain write statistics per group number

 Documentation/filesystems/f2fs.rst |  70 ++++++++
 block/bio.c                        |   2 +
 block/blk-crypto-fallback.c        |   1 +
 block/blk-merge.c                  |   6 +
 block/blk-mq.c                     |   1 +
 block/bounce.c                     |   1 +
 block/fops.c                       |   3 +
 drivers/scsi/scsi.c                |   2 +
 drivers/scsi/scsi_debug.c          | 247 +++++++++++++++++++++--------
 drivers/scsi/scsi_sysfs.c          |  10 ++
 drivers/scsi/sd.c                  | 111 ++++++++++++-
 drivers/scsi/sd.h                  |   3 +
 fs/f2fs/data.c                     |   2 +
 fs/f2fs/f2fs.h                     |  10 ++
 fs/f2fs/segment.c                  |  95 +++++++++++
 fs/f2fs/super.c                    |  32 +++-
 fs/fcntl.c                         |   1 +
 fs/inode.c                         |   1 +
 fs/iomap/buffered-io.c             |   2 +
 fs/iomap/direct-io.c               |   1 +
 fs/mpage.c                         |   1 +
 include/linux/blk-mq.h             |   2 +
 include/linux/blk_types.h          |   2 +
 include/linux/fs.h                 |  16 +-
 include/linux/rw_hint.h            |  20 +++
 include/scsi/scsi_device.h         |   1 +
 include/scsi/scsi_proto.h          |  75 +++++++++
 27 files changed, 635 insertions(+), 83 deletions(-)
 create mode 100644 include/linux/rw_hint.h



* [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-30 11:11   ` Kanchan Joshi
  2023-10-17 20:47 ` [PATCH v3 02/14] block: Restore data lifetime support in struct bio and struct request Bart Van Assche
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Jan Kara, Christian Brauner,
	Jaegeuk Kim, Chao Yu, Alexander Viro, Jeff Layton, Chuck Lever

Move enum rw_hint into a new header file to prepare for using this data
type in the block layer. Add the __packed attribute to reduce the space
occupied by instances of this data type from four bytes to one byte.
Change the data type of i_write_hint from u8 to enum rw_hint. Replace
the RWH_* constants with literal values so that <uapi/linux/fcntl.h>
does not have to be included.

Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
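Note (illustration only, not part of the patch): the size reduction
described above can be checked in user space. The sketch below assumes a
GCC or Clang toolchain; the enum names rw_hint_unpacked and
rw_hint_packed and the helper rw_hint_bytes_saved() are hypothetical and
only mirror the shape of the enum added by this patch.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* size_t */

/* Hypothetical user-space copy of the enum without __packed. */
enum rw_hint_unpacked {
	U_WRITE_LIFE_NOT_SET	= 0,
	U_WRITE_LIFE_EXTREME	= 5,
};

/* Same value range, but packed as in include/linux/rw_hint.h. */
enum rw_hint_packed {
	P_WRITE_LIFE_NOT_SET	= 0,
	P_WRITE_LIFE_EXTREME	= 5,
} __attribute__((packed));

/* Mirrors the static_assert() in the new header file. */
static_assert(sizeof(enum rw_hint_packed) == 1,
	      "packed enum must fit in one byte");

/* Bytes saved per instance by packing (3 where the plain enum is int-sized). */
static inline size_t rw_hint_bytes_saved(void)
{
	return sizeof(enum rw_hint_unpacked) - sizeof(enum rw_hint_packed);
}
```

Whether the unpacked enum really occupies four bytes depends on the ABI
and on compiler options such as -fshort-enums, which is presumably why
the header asserts the packed size explicitly.
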
 fs/f2fs/f2fs.h          |  1 +
 fs/fcntl.c              |  1 +
 fs/inode.c              |  1 +
 include/linux/fs.h      | 16 ++--------------
 include/linux/rw_hint.h | 20 ++++++++++++++++++++
 5 files changed, 25 insertions(+), 14 deletions(-)
 create mode 100644 include/linux/rw_hint.h

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6d688e42d89c..56ee7fff55c7 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -24,6 +24,7 @@
 #include <linux/blkdev.h>
 #include <linux/quotaops.h>
 #include <linux/part_stat.h>
+#include <linux/rw_hint.h>
 #include <crypto/hash.h>
 
 #include <linux/fscrypt.h>
diff --git a/fs/fcntl.c b/fs/fcntl.c
index e871009f6c88..ed923640aecf 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -27,6 +27,7 @@
 #include <linux/memfd.h>
 #include <linux/compat.h>
 #include <linux/mount.h>
+#include <linux/rw_hint.h>
 
 #include <linux/poll.h>
 #include <asm/siginfo.h>
diff --git a/fs/inode.c b/fs/inode.c
index 84bc3c76e5cc..ebcc41ac9682 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -20,6 +20,7 @@
 #include <linux/ratelimit.h>
 #include <linux/list_lru.h>
 #include <linux/iversion.h>
+#include <linux/rw_hint.h>
 #include <trace/events/writeback.h>
 #include "internal.h"
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..971f0bafa782 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -43,6 +43,7 @@
 #include <linux/cred.h>
 #include <linux/mnt_idmapping.h>
 #include <linux/slab.h>
+#include <linux/rw_hint.h>
 
 #include <asm/byteorder.h>
 #include <uapi/linux/fs.h>
@@ -309,19 +310,6 @@ struct address_space;
 struct writeback_control;
 struct readahead_control;
 
-/*
- * Write life time hint values.
- * Stored in struct inode as u8.
- */
-enum rw_hint {
-	WRITE_LIFE_NOT_SET	= 0,
-	WRITE_LIFE_NONE		= RWH_WRITE_LIFE_NONE,
-	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
-	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
-	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
-	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
-};
-
 /* Match RWF_* bits to IOCB bits */
 #define IOCB_HIPRI		(__force int) RWF_HIPRI
 #define IOCB_DSYNC		(__force int) RWF_DSYNC
@@ -677,7 +665,7 @@ struct inode {
 	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
 	unsigned short          i_bytes;
 	u8			i_blkbits;
-	u8			i_write_hint;
+	enum rw_hint		i_write_hint;
 	blkcnt_t		i_blocks;
 
 #ifdef __NEED_I_SIZE_ORDERED
diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
new file mode 100644
index 000000000000..4a7d28945973
--- /dev/null
+++ b/include/linux/rw_hint.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_RW_HINT_H
+#define _LINUX_RW_HINT_H
+
+#include <linux/build_bug.h>
+#include <linux/compiler_attributes.h>
+
+/* Block storage write lifetime hint values. */
+enum rw_hint {
+	WRITE_LIFE_NOT_SET	= 0, /* RWH_WRITE_LIFE_NOT_SET */
+	WRITE_LIFE_NONE		= 1, /* RWH_WRITE_LIFE_NONE */
+	WRITE_LIFE_SHORT	= 2, /* RWH_WRITE_LIFE_SHORT */
+	WRITE_LIFE_MEDIUM	= 3, /* RWH_WRITE_LIFE_MEDIUM */
+	WRITE_LIFE_LONG		= 4, /* RWH_WRITE_LIFE_LONG */
+	WRITE_LIFE_EXTREME	= 5, /* RWH_WRITE_LIFE_EXTREME */
+} __packed;
+
+static_assert(sizeof(enum rw_hint) == 1);
+
+#endif /* _LINUX_RW_HINT_H */


* [PATCH v3 02/14] block: Restore data lifetime support in struct bio and struct request
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 01/14] fs: Move enum rw_hint into a new header file Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 03/14] fs: Restore write hint support Bart Van Assche
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Damien Le Moal

Provide a mechanism for filesystems to pass data lifetime information
to block drivers. Data lifetime information can be used by block devices
with append/erase storage technology (NAND flash) to reduce garbage
collection activity.

This patch restores a subset of the functionality that was removed by
commit c75e707fe1aa ("block: remove the per-bio/request write hint").

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
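Note (illustration only, not part of the patch): a minimal user-space
model of the two rules this patch adds, namely that a request inherits
the lifetime of its bio and that I/O with different lifetimes is not
merged. The toy_* types and functions are hypothetical stand-ins; the
real blk_rq_merge_ok() and attempt_merge() perform many more checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copy of the kernel enum, for this sketch only. */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE,
	WRITE_LIFE_SHORT,
	WRITE_LIFE_MEDIUM,
	WRITE_LIFE_LONG,
	WRITE_LIFE_EXTREME,
};

/* Toy stand-ins holding only the fields that matter here. */
struct toy_bio {
	unsigned short	bi_ioprio;
	enum rw_hint	bi_lifetime;
};

struct toy_request {
	unsigned short	ioprio;
	enum rw_hint	lifetime;
};

/* Copy the per-bio attributes into the request. */
static inline void toy_rq_bio_prep(struct toy_request *rq,
				   const struct toy_bio *bio)
{
	rq->ioprio = bio->bi_ioprio;
	rq->lifetime = bio->bi_lifetime;
}

/* A bio may only join a request with the same lifetime and priority. */
static inline bool toy_rq_merge_ok(const struct toy_request *rq,
				   const struct toy_bio *bio)
{
	if (rq->lifetime != bio->bi_lifetime)
		return false;
	if (rq->ioprio != bio->bi_ioprio)
		return false;
	return true;
}

/* Usage example: short-lived and long-lived data stay in separate requests. */
static inline void toy_merge_selftest(void)
{
	struct toy_bio hot = { .bi_ioprio = 0, .bi_lifetime = WRITE_LIFE_SHORT };
	struct toy_bio cold = { .bi_ioprio = 0, .bi_lifetime = WRITE_LIFE_LONG };
	struct toy_request rq;

	toy_rq_bio_prep(&rq, &hot);
	assert(toy_rq_merge_ok(&rq, &hot));
	assert(!toy_rq_merge_ok(&rq, &cold));
}
```
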
 block/bio.c                 | 2 ++
 block/blk-crypto-fallback.c | 1 +
 block/blk-merge.c           | 6 ++++++
 block/blk-mq.c              | 1 +
 block/bounce.c              | 1 +
 include/linux/blk-mq.h      | 2 ++
 include/linux/blk_types.h   | 2 ++
 7 files changed, 15 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 816d412c06e9..1a3733635079 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -251,6 +251,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table,
 	bio->bi_opf = opf;
 	bio->bi_flags = 0;
 	bio->bi_ioprio = 0;
+	bio->bi_lifetime = 0;
 	bio->bi_status = 0;
 	bio->bi_iter.bi_sector = 0;
 	bio->bi_iter.bi_size = 0;
@@ -813,6 +814,7 @@ static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp)
 {
 	bio_set_flag(bio, BIO_CLONED);
 	bio->bi_ioprio = bio_src->bi_ioprio;
+	bio->bi_lifetime = bio_src->bi_lifetime;
 	bio->bi_iter = bio_src->bi_iter;
 
 	if (bio->bi_bdev) {
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index e6468eab2681..e25a6d551594 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -172,6 +172,7 @@ static struct bio *blk_crypto_fallback_clone_bio(struct bio *bio_src)
 	if (bio_flagged(bio_src, BIO_REMAPPED))
 		bio_set_flag(bio, BIO_REMAPPED);
 	bio->bi_ioprio		= bio_src->bi_ioprio;
+	bio->bi_lifetime	= bio_src->bi_lifetime;
 	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 65e75efa9bd3..62718cc871bd 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -814,6 +814,9 @@ static struct request *attempt_merge(struct request_queue *q,
 	if (rq_data_dir(req) != rq_data_dir(next))
 		return NULL;
 
+	if (req->lifetime != next->lifetime)
+		return NULL;
+
 	if (req->ioprio != next->ioprio)
 		return NULL;
 
@@ -941,6 +944,9 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 	if (!bio_crypt_rq_ctx_compatible(rq, bio))
 		return false;
 
+	if (rq->lifetime != bio->bi_lifetime)
+		return false;
+
 	if (rq->ioprio != bio_prio(bio))
 		return false;
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a815403f375c..10540a3b3c49 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3148,6 +3148,7 @@ int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 	}
 	rq->nr_phys_segments = rq_src->nr_phys_segments;
 	rq->ioprio = rq_src->ioprio;
+	rq->lifetime = rq_src->lifetime;
 
 	if (rq->bio && blk_crypto_rq_bio_prep(rq, rq->bio, gfp_mask) < 0)
 		goto free_and_out;
diff --git a/block/bounce.c b/block/bounce.c
index 7cfcb242f9a1..b03e4944ace8 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -169,6 +169,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src)
 	if (bio_flagged(bio_src, BIO_REMAPPED))
 		bio_set_flag(bio, BIO_REMAPPED);
 	bio->bi_ioprio		= bio_src->bi_ioprio;
+	bio->bi_lifetime	= bio_src->bi_lifetime;
 	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1ab3081c82ed..1afd731432fe 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -136,6 +136,7 @@ struct request {
 #endif
 
 	unsigned short ioprio;
+	enum rw_hint lifetime;
 
 	enum mq_rq_state state;
 	atomic_t ref;
@@ -957,6 +958,7 @@ static inline void blk_rq_bio_prep(struct request *rq, struct bio *bio,
 	rq->__data_len = bio->bi_iter.bi_size;
 	rq->bio = rq->biotail = bio;
 	rq->ioprio = bio_prio(bio);
+	rq->lifetime = bio->bi_lifetime;
 }
 
 void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d5c5e59ddbd2..5e21f44141fb 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -10,6 +10,7 @@
 #include <linux/bvec.h>
 #include <linux/device.h>
 #include <linux/ktime.h>
+#include <linux/rw_hint.h>
 
 struct bio_set;
 struct bio;
@@ -269,6 +270,7 @@ struct bio {
 						 */
 	unsigned short		bi_flags;	/* BIO_* below */
 	unsigned short		bi_ioprio;
+	enum rw_hint		bi_lifetime;	/* data lifetime */
 	blk_status_t		bi_status;
 	atomic_t		__bi_remaining;
 


* [PATCH v3 03/14] fs: Restore write hint support
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 01/14] fs: Move enum rw_hint into a new header file Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 02/14] block: Restore data lifetime support in struct bio and struct request Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 04/14] fs/f2fs: Restore data lifetime support Bart Van Assche
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Jan Kara, Christian Brauner,
	Darrick J. Wong, Alexander Viro

Initialize the bio lifetime to the data lifetime information that is
available in struct inode. This patch reverts a small subset of commit
c75e707fe1aa ("block: remove the per-bio/request write hint").

Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
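Note (illustration only, not part of the patch): the lifetime stored in
struct inode is the value that user space sets with the existing
F_SET_RW_HINT fcntl, which takes a pointer to a uint64_t. A minimal
sketch follows; the helper names set_inode_write_hint() and
get_inode_write_hint() are hypothetical, and the fallback defines are
only there for libc headers that do not expose these constants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes F_SET_RW_HINT in glibc's <fcntl.h> */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>

/* Fallbacks for older libc headers; values as in <uapi/linux/fcntl.h>. */
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT		1035	/* F_LINUX_SPECIFIC_BASE + 11 */
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT		1036	/* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT	2
#endif

/*
 * Hypothetical helper: tag the inode behind @fd with a write lifetime hint.
 * Returns 0 on success or -1 with errno set, like fcntl() itself.
 */
static inline int set_inode_write_hint(int fd, uint64_t hint)
{
	return fcntl(fd, F_SET_RW_HINT, &hint);
}

/* Hypothetical helper: read the hint back into @hint. */
static inline int get_inode_write_hint(int fd, uint64_t *hint)
{
	return fcntl(fd, F_GET_RW_HINT, hint);
}
```

An application would call, for example,
set_inode_write_hint(fd, RWH_WRITE_LIFE_SHORT) before writing
short-lived data. With this patch applied, bios submitted for that inode
through the paths touched below carry the hint in bi_lifetime.
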
 block/fops.c           | 3 +++
 fs/iomap/buffered-io.c | 2 ++
 fs/iomap/direct-io.c   | 1 +
 fs/mpage.c             | 1 +
 4 files changed, 7 insertions(+)

diff --git a/block/fops.c b/block/fops.c
index acff3d5d22d4..c9ca9f0fd48f 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -74,6 +74,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 	}
 	bio.bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio.bi_ioprio = iocb->ki_ioprio;
+	bio.bi_lifetime	= iocb->ki_filp->f_inode->i_write_hint;
 
 	ret = bio_iov_iter_get_pages(&bio, iter);
 	if (unlikely(ret))
@@ -206,6 +207,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		bio->bi_private = dio;
 		bio->bi_end_io = blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
+		bio->bi_lifetime = iocb->ki_filp->f_inode->i_write_hint;
 
 		ret = bio_iov_iter_get_pages(bio, iter);
 		if (unlikely(ret)) {
@@ -323,6 +325,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 	bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio->bi_end_io = blkdev_bio_end_io_async;
 	bio->bi_ioprio = iocb->ki_ioprio;
+	bio->bi_lifetime = iocb->ki_filp->f_inode->i_write_hint;
 
 	if (iov_iter_is_bvec(iter)) {
 		/*
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 644479ccefbd..fe82c8882c62 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1660,6 +1660,7 @@ iomap_alloc_ioend(struct inode *inode, struct iomap_writepage_ctx *wpc,
 			       REQ_OP_WRITE | wbc_to_write_flags(wbc),
 			       GFP_NOFS, &iomap_ioend_bioset);
 	bio->bi_iter.bi_sector = sector;
+	bio->bi_lifetime = inode->i_write_hint;
 	wbc_init_bio(wbc, bio);
 
 	ioend = container_of(bio, struct iomap_ioend, io_inline_bio);
@@ -1690,6 +1691,7 @@ iomap_chain_bio(struct bio *prev)
 	new = bio_alloc(prev->bi_bdev, BIO_MAX_VECS, prev->bi_opf, GFP_NOFS);
 	bio_clone_blkg_association(new, prev);
 	new->bi_iter.bi_sector = bio_end_sector(prev);
+	new->bi_lifetime = prev->bi_lifetime;
 
 	bio_chain(prev, new);
 	bio_get(prev);		/* for iomap_finish_ioend */
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..df095b9700a7 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -381,6 +381,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 					  GFP_KERNEL);
 		bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
 		bio->bi_ioprio = dio->iocb->ki_ioprio;
+		bio->bi_lifetime = dio->iocb->ki_filp->f_inode->i_write_hint;
 		bio->bi_private = dio;
 		bio->bi_end_io = iomap_dio_bio_end_io;
 
diff --git a/fs/mpage.c b/fs/mpage.c
index 242e213ee064..19b7ced1a9aa 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -612,6 +612,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 				GFP_NOFS);
 		bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
 		wbc_init_bio(wbc, bio);
+		bio->bi_lifetime = inode->i_write_hint;
 	}
 
 	/*


* [PATCH v3 04/14] fs/f2fs: Restore data lifetime support
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (2 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 03/14] fs: Restore write hint support Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 05/14] scsi: core: Query the Block Limits Extension VPD page Bart Van Assche
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Avri Altman, Bean Huo, Jaegeuk Kim,
	Chao Yu, Jonathan Corbet

Restore support for the whint_mode mount option by reverting commit
930e2607638d ("f2fs: remove obsolete whint_mode"). Additionally, restore
the bio->bi_lifetime assignment in __bio_alloc() that was removed by
commit c75e707fe1aa ("block: remove the per-bio/request write hint").

Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Daejun Park <daejun7.park@samsung.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
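Note (illustration only, not part of the patch): the mapping implemented
by f2fs_io_type_to_rw_hint() can also be read as a lookup table. The
sketch below is a user-space model with hypothetical toy_* names; it
covers only the hints that function assigns to bios by page type and
temperature, not the direct I/O rows of the documentation tables.

```c
#include <assert.h>

/* Hypothetical copy of the kernel enum, for this sketch only. */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE,
	WRITE_LIFE_SHORT,
	WRITE_LIFE_MEDIUM,
	WRITE_LIFE_LONG,
	WRITE_LIFE_EXTREME,
};

enum toy_whint_mode { TOY_WHINT_OFF, TOY_WHINT_USER, TOY_WHINT_FS,
		      TOY_NR_WHINT_MODE };
enum toy_page_type { TOY_DATA, TOY_NODE, TOY_META, TOY_NR_PAGE_TYPE };
enum toy_temp_type { TOY_HOT, TOY_WARM, TOY_COLD, TOY_NR_TEMP_TYPE };

/* Entries not listed are zero-initialized, i.e. WRITE_LIFE_NOT_SET. */
static const enum rw_hint
toy_hint_table[TOY_NR_WHINT_MODE][TOY_NR_PAGE_TYPE][TOY_NR_TEMP_TYPE] = {
	[TOY_WHINT_USER] = {
		[TOY_DATA] = {
			[TOY_HOT]  = WRITE_LIFE_SHORT,
			[TOY_WARM] = WRITE_LIFE_NOT_SET,
			[TOY_COLD] = WRITE_LIFE_EXTREME,
		},
	},
	[TOY_WHINT_FS] = {
		[TOY_DATA] = {
			[TOY_HOT]  = WRITE_LIFE_SHORT,
			[TOY_WARM] = WRITE_LIFE_LONG,
			[TOY_COLD] = WRITE_LIFE_EXTREME,
		},
		[TOY_NODE] = {
			[TOY_HOT]  = WRITE_LIFE_NOT_SET,
			[TOY_WARM] = WRITE_LIFE_NOT_SET,
			[TOY_COLD] = WRITE_LIFE_NONE,
		},
		[TOY_META] = {
			[TOY_HOT]  = WRITE_LIFE_MEDIUM,
			[TOY_WARM] = WRITE_LIFE_MEDIUM,
			[TOY_COLD] = WRITE_LIFE_MEDIUM,
		},
	},
};

/* Look up the hint for one (mode, page type, temperature) combination. */
static inline enum rw_hint toy_io_type_to_rw_hint(enum toy_whint_mode mode,
						  enum toy_page_type type,
						  enum toy_temp_type temp)
{
	return toy_hint_table[mode][type][temp];
}
```

The if/else chain in the patch and this table should agree for every
combination; the table form just makes it easier to compare against the
documentation.
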
 Documentation/filesystems/f2fs.rst | 70 ++++++++++++++++++++++
 fs/f2fs/data.c                     |  2 +
 fs/f2fs/f2fs.h                     |  9 +++
 fs/f2fs/segment.c                  | 95 ++++++++++++++++++++++++++++++
 fs/f2fs/super.c                    | 32 +++++++++-
 5 files changed, 207 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
index d32c6209685d..de412ddebcc8 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -242,6 +242,12 @@ offgrpjquota		 Turn off group journalled quota.
 offprjjquota		 Turn off project journalled quota.
 quota			 Enable plain user disk quota accounting.
 noquota			 Disable all plain disk quota option.
+whint_mode=%s		 Control which write hints are passed down to the block
+			 layer. This supports "off", "user-based", and
+			 "fs-based". In "off" mode (default), f2fs does not pass
+			 down hints. In "user-based" mode, f2fs tries to pass
+			 down hints given by users. In "fs-based" mode, f2fs
+			 passes down hints according to its own policy.
 alloc_mode=%s		 Adjust block allocation policy, which supports "reuse"
 			 and "default".
 fsync_mode=%s		 Control the policy of fsync. Currently supports "posix",
@@ -776,6 +782,70 @@ In order to identify whether the data in the victim segment are valid or not,
 F2FS manages a bitmap. Each bit represents the validity of a block, and the
 bitmap is composed of a bit stream covering whole blocks in main area.
 
+Write-hint Policy
+-----------------
+
+1) whint_mode=off. F2FS only passes down WRITE_LIFE_NOT_SET.
+
+2) whint_mode=user-based. F2FS tries to pass down hints given by
+users.
+
+===================== ======================== ===================
+User                  F2FS                     Block
+===================== ======================== ===================
+N/A                   META                     WRITE_LIFE_NOT_SET
+N/A                   HOT_NODE                 "
+N/A                   WARM_NODE                "
+N/A                   COLD_NODE                "
+ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+extension list        "                        "
+
+-- buffered io
+WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+WRITE_LIFE_NONE       "                        "
+WRITE_LIFE_MEDIUM     "                        "
+WRITE_LIFE_LONG       "                        "
+
+-- direct io
+WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
+WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
+===================== ======================== ===================
+
+3) whint_mode=fs-based. F2FS passes down hints with its policy.
+
+===================== ======================== ===================
+User                  F2FS                     Block
+===================== ======================== ===================
+N/A                   META                     WRITE_LIFE_MEDIUM
+N/A                   HOT_NODE                 WRITE_LIFE_NOT_SET
+N/A                   WARM_NODE                "
+N/A                   COLD_NODE                WRITE_LIFE_NONE
+ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+extension list        "                        "
+
+-- buffered io
+WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_LONG
+WRITE_LIFE_NONE       "                        "
+WRITE_LIFE_MEDIUM     "                        "
+WRITE_LIFE_LONG       "                        "
+
+-- direct io
+WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
+WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
+===================== ======================== ===================
+
 Fallocate(2) Policy
 -------------------
 
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 916e317ac925..4a5edb9a1a1e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -478,6 +478,8 @@ static struct bio *__bio_alloc(struct f2fs_io_info *fio, int npages)
 	} else {
 		bio->bi_end_io = f2fs_write_end_io;
 		bio->bi_private = sbi;
+		bio->bi_lifetime = f2fs_io_type_to_rw_hint(sbi, fio->type,
+							   fio->temp);
 	}
 	iostat_alloc_and_bind_ctx(sbi, bio, NULL);
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 56ee7fff55c7..8d408afb044b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -158,6 +158,7 @@ struct f2fs_mount_info {
 	int s_jquota_fmt;			/* Format of quota to use */
 #endif
 	/* For which write hints are passed down to block layer */
+	int whint_mode;
 	int alloc_mode;			/* segment allocation policy */
 	int fsync_mode;			/* fsync policy */
 	int fs_mode;			/* fs mode: LFS or ADAPTIVE */
@@ -1344,6 +1345,12 @@ enum {
 	FS_MODE_FRAGMENT_BLK,		/* block fragmentation mode */
 };
 
+enum {
+	WHINT_MODE_OFF,		/* do not pass down write hints */
+	WHINT_MODE_USER,	/* try to pass down hints given by users */
+	WHINT_MODE_FS,		/* pass down hints with F2FS policy */
+};
+
 enum {
 	ALLOC_MODE_DEFAULT,	/* stay default */
 	ALLOC_MODE_REUSE,	/* reuse segments as much as possible */
@@ -3728,6 +3735,8 @@ void f2fs_destroy_segment_manager(struct f2fs_sb_info *sbi);
 int __init f2fs_create_segment_manager_caches(void);
 void f2fs_destroy_segment_manager_caches(void);
 int f2fs_rw_hint_to_seg_type(enum rw_hint hint);
+enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
+			enum page_type type, enum temp_type temp);
 unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi,
 			unsigned int segno);
 unsigned int f2fs_usable_blks_in_seg(struct f2fs_sb_info *sbi,
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index d05b41608fc0..38c0cb8d9571 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3290,6 +3290,101 @@ int f2fs_rw_hint_to_seg_type(enum rw_hint hint)
 	}
 }
 
+/* This returns write hints for each segment type. These hints will be
+ * passed down to block layer. There are mapping tables which depend on
+ * the mount option 'whint_mode'.
+ *
+ * 1) whint_mode=off. F2FS only passes down WRITE_LIFE_NOT_SET.
+ *
+ * 2) whint_mode=user-based. F2FS tries to pass down hints given by users.
+ *
+ * User                  F2FS                     Block
+ * ----                  ----                     -----
+ *                       META                     WRITE_LIFE_NOT_SET
+ *                       HOT_NODE                 "
+ *                       WARM_NODE                "
+ *                       COLD_NODE                "
+ * ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+ * extension list        "                        "
+ *
+ * -- buffered io
+ * WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+ * WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+ * WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+ * WRITE_LIFE_NONE       "                        "
+ * WRITE_LIFE_MEDIUM     "                        "
+ * WRITE_LIFE_LONG       "                        "
+ *
+ * -- direct io
+ * WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+ * WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+ * WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+ * WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+ * WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
+ * WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
+ *
+ * 3) whint_mode=fs-based. F2FS passes down hints with its policy.
+ *
+ * User                  F2FS                     Block
+ * ----                  ----                     -----
+ *                       META                     WRITE_LIFE_MEDIUM
+ *                       HOT_NODE                 WRITE_LIFE_NOT_SET
+ *                       WARM_NODE                "
+ *                       COLD_NODE                WRITE_LIFE_NONE
+ * ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
+ * extension list        "                        "
+ *
+ * -- buffered io
+ * WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+ * WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+ * WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_LONG
+ * WRITE_LIFE_NONE       "                        "
+ * WRITE_LIFE_MEDIUM     "                        "
+ * WRITE_LIFE_LONG       "                        "
+ *
+ * -- direct io
+ * WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
+ * WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
+ * WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
+ * WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
+ * WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
+ * WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
+ */
+
+enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
+				enum page_type type, enum temp_type temp)
+{
+	if (F2FS_OPTION(sbi).whint_mode == WHINT_MODE_USER) {
+		if (type == DATA) {
+			if (temp == WARM)
+				return WRITE_LIFE_NOT_SET;
+			else if (temp == HOT)
+				return WRITE_LIFE_SHORT;
+			else if (temp == COLD)
+				return WRITE_LIFE_EXTREME;
+		} else {
+			return WRITE_LIFE_NOT_SET;
+		}
+	} else if (F2FS_OPTION(sbi).whint_mode == WHINT_MODE_FS) {
+		if (type == DATA) {
+			if (temp == WARM)
+				return WRITE_LIFE_LONG;
+			else if (temp == HOT)
+				return WRITE_LIFE_SHORT;
+			else if (temp == COLD)
+				return WRITE_LIFE_EXTREME;
+		} else if (type == NODE) {
+			if (temp == WARM || temp == HOT)
+				return WRITE_LIFE_NOT_SET;
+			else if (temp == COLD)
+				return WRITE_LIFE_NONE;
+		} else if (type == META) {
+			return WRITE_LIFE_MEDIUM;
+		}
+	}
+	return WRITE_LIFE_NOT_SET;
+}
+
 static int __get_segment_type_2(struct f2fs_io_info *fio)
 {
 	if (fio->type == DATA)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a8c8232852bb..5bb062075acf 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -141,6 +141,7 @@ enum {
 	Opt_jqfmt_vfsold,
 	Opt_jqfmt_vfsv0,
 	Opt_jqfmt_vfsv1,
+	Opt_whint,
 	Opt_alloc,
 	Opt_fsync,
 	Opt_test_dummy_encryption,
@@ -220,6 +221,7 @@ static match_table_t f2fs_tokens = {
 	{Opt_jqfmt_vfsold, "jqfmt=vfsold"},
 	{Opt_jqfmt_vfsv0, "jqfmt=vfsv0"},
 	{Opt_jqfmt_vfsv1, "jqfmt=vfsv1"},
+	{Opt_whint, "whint_mode=%s"},
 	{Opt_alloc, "alloc_mode=%s"},
 	{Opt_fsync, "fsync_mode=%s"},
 	{Opt_test_dummy_encryption, "test_dummy_encryption=%s"},
@@ -988,6 +990,22 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
 			f2fs_info(sbi, "quota operations not supported");
 			break;
 #endif
+		case Opt_whint:
+			name = match_strdup(&args[0]);
+			if (!name)
+				return -ENOMEM;
+			if (!strcmp(name, "user-based")) {
+				F2FS_OPTION(sbi).whint_mode = WHINT_MODE_USER;
+			} else if (!strcmp(name, "off")) {
+				F2FS_OPTION(sbi).whint_mode = WHINT_MODE_OFF;
+			} else if (!strcmp(name, "fs-based")) {
+				F2FS_OPTION(sbi).whint_mode = WHINT_MODE_FS;
+			} else {
+				kfree(name);
+				return -EINVAL;
+			}
+			kfree(name);
+			break;
 		case Opt_alloc:
 			name = match_strdup(&args[0]);
 			if (!name)
@@ -1389,6 +1407,12 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
 		return -EINVAL;
 	}
 
+	/* Do not pass down write hints if the number of active logs is less
+	 * than NR_CURSEG_PERSIST_TYPE.
+	 */
+	if (F2FS_OPTION(sbi).active_logs != NR_CURSEG_PERSIST_TYPE)
+		F2FS_OPTION(sbi).whint_mode = WHINT_MODE_OFF;
+
 	if (f2fs_sb_has_readonly(sbi) && !f2fs_readonly(sbi->sb)) {
 		f2fs_err(sbi, "Allow to mount readonly mode only");
 		return -EROFS;
@@ -2060,6 +2084,10 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
 		seq_puts(seq, ",prjquota");
 #endif
 	f2fs_show_quota_options(seq, sbi->sb);
+	if (F2FS_OPTION(sbi).whint_mode == WHINT_MODE_USER)
+		seq_printf(seq, ",whint_mode=%s", "user-based");
+	else if (F2FS_OPTION(sbi).whint_mode == WHINT_MODE_FS)
+		seq_printf(seq, ",whint_mode=%s", "fs-based");
 
 	fscrypt_show_test_dummy_encryption(seq, ',', sbi->sb);
 
@@ -2129,6 +2157,7 @@ static void default_options(struct f2fs_sb_info *sbi, bool remount)
 		F2FS_OPTION(sbi).active_logs = NR_CURSEG_PERSIST_TYPE;
 
 	F2FS_OPTION(sbi).inline_xattr_size = DEFAULT_INLINE_XATTR_ADDRS;
+	F2FS_OPTION(sbi).whint_mode = WHINT_MODE_OFF;
 	if (le32_to_cpu(F2FS_RAW_SUPER(sbi)->segment_count_main) <=
 							SMALL_VOLUME_SEGMENTS)
 		F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_REUSE;
@@ -2443,7 +2472,8 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 		need_stop_gc = true;
 	}
 
-	if (*flags & SB_RDONLY) {
+	if (*flags & SB_RDONLY ||
+	    F2FS_OPTION(sbi).whint_mode != org_mount_opt.whint_mode) {
 		sync_inodes_sb(sb);
 
 		set_sbi_flag(sbi, SBI_IS_DIRTY);


* [PATCH v3 05/14] scsi: core: Query the Block Limits Extension VPD page
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (3 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 04/14] fs/f2fs: Restore data lifetime support Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 06/14] scsi_proto: Add structures and constants related to I/O groups and streams Bart Van Assche
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Avri Altman, James E.J. Bottomley

Parse the Reduced Stream Control Supported (RSCS) bit from the block
limits extension VPD page. The RSCS bit is defined in SBC-5 r05
(https://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc5r05.pdf).
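
As an illustration only, the RSCS lookup can be sketched in user-space C as
shown below. The helper name vpd_b7_rscs() is hypothetical and not part of
this patch. The sketch assumes that RSCS is bit 0 of byte 5 of the page
(matching the driver code in this patch) and that the buffer includes the
four-byte VPD header, so at least six bytes must be present before byte 5
is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return the RSCS bit from a
 * raw Block Limits Extension VPD page (0xb7), header included.
 * Assumption: RSCS is bit 0 of byte 5.
 */
static bool vpd_b7_rscs(const uint8_t *page, size_t len)
{
	/* Byte 5 is only valid if at least six bytes were returned. */
	if (!page || len < 6)
		return false;
	/* Byte 1 of a VPD page holds the page code. */
	if (page[1] != 0xb7)
		return false;
	return page[5] & 1;
}
```

A caller would pass the raw INQUIRY VPD response and its length; a short or
mismatched page is treated as "RSCS not supported".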

Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi.c        |  2 ++
 drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
 drivers/scsi/sd.c          | 13 +++++++++++++
 drivers/scsi/sd.h          |  1 +
 include/scsi/scsi_device.h |  1 +
 5 files changed, 27 insertions(+)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d0911bc28663..5ad291770806 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -499,6 +499,8 @@ void scsi_attach_vpd(struct scsi_device *sdev)
 			scsi_update_vpd_page(sdev, 0xb1, &sdev->vpd_pgb1);
 		if (vpd_buf->data[i] == 0xb2)
 			scsi_update_vpd_page(sdev, 0xb2, &sdev->vpd_pgb2);
+		if (vpd_buf->data[i] == 0xb7)
+			scsi_update_vpd_page(sdev, 0xb7, &sdev->vpd_pgb7);
 	}
 	kfree(vpd_buf);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 24f6eefb6803..93652a786a46 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -449,6 +449,7 @@ static void scsi_device_dev_release(struct device *dev)
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
 	struct scsi_vpd *vpd_pgb0 = NULL, *vpd_pgb1 = NULL, *vpd_pgb2 = NULL;
+	struct scsi_vpd *vpd_pgb7 = NULL;
 	unsigned long flags;
 
 	might_sleep();
@@ -494,6 +495,8 @@ static void scsi_device_dev_release(struct device *dev)
 				       lockdep_is_held(&sdev->inquiry_mutex));
 	vpd_pgb2 = rcu_replace_pointer(sdev->vpd_pgb2, vpd_pgb2,
 				       lockdep_is_held(&sdev->inquiry_mutex));
+	vpd_pgb7 = rcu_replace_pointer(sdev->vpd_pgb7, vpd_pgb7,
+				       lockdep_is_held(&sdev->inquiry_mutex));
 	mutex_unlock(&sdev->inquiry_mutex);
 
 	if (vpd_pg0)
@@ -510,6 +513,8 @@ static void scsi_device_dev_release(struct device *dev)
 		kfree_rcu(vpd_pgb1, rcu);
 	if (vpd_pgb2)
 		kfree_rcu(vpd_pgb2, rcu);
+	if (vpd_pgb7)
+		kfree_rcu(vpd_pgb7, rcu);
 	kfree(sdev->inquiry);
 	kfree(sdev);
 
@@ -921,6 +926,7 @@ sdev_vpd_pg_attr(pg89);
 sdev_vpd_pg_attr(pgb0);
 sdev_vpd_pg_attr(pgb1);
 sdev_vpd_pg_attr(pgb2);
+sdev_vpd_pg_attr(pgb7);
 sdev_vpd_pg_attr(pg0);
 
 static ssize_t show_inquiry(struct file *filep, struct kobject *kobj,
@@ -1295,6 +1301,9 @@ static umode_t scsi_sdev_bin_attr_is_visible(struct kobject *kobj,
 	if (attr == &dev_attr_vpd_pgb2 && !sdev->vpd_pgb2)
 		return 0;
 
+	if (attr == &dev_attr_vpd_pgb7 && !sdev->vpd_pgb7)
+		return 0;
+
 	return S_IRUGO;
 }
 
@@ -1347,6 +1356,7 @@ static struct bin_attribute *scsi_sdev_bin_attrs[] = {
 	&dev_attr_vpd_pgb0,
 	&dev_attr_vpd_pgb1,
 	&dev_attr_vpd_pgb2,
+	&dev_attr_vpd_pgb7,
 	&dev_attr_inquiry,
 	NULL
 };
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index c92a317ba547..879edbc1a065 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3019,6 +3019,18 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 	rcu_read_unlock();
 }
 
+/* Parse the Block Limits Extension VPD page (0xb7) */
+static void sd_read_block_limits_ext(struct scsi_disk *sdkp)
+{
+	struct scsi_vpd *vpd;
+
+	rcu_read_lock();
+	vpd = rcu_dereference(sdkp->device->vpd_pgb7);
+	if (vpd && vpd->len >= 6)
+		sdkp->rscs = vpd->data[5] & 1;
+	rcu_read_unlock();
+}
+
 /**
  * sd_read_block_characteristics - Query block dev. characteristics
  * @sdkp: disk to query
@@ -3373,6 +3385,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 		if (scsi_device_supports_vpd(sdp)) {
 			sd_read_block_provisioning(sdkp);
 			sd_read_block_limits(sdkp);
+			sd_read_block_limits_ext(sdkp);
 			sd_read_block_characteristics(sdkp);
 			sd_zbc_read_zones(sdkp, buffer);
 			sd_read_cpr(sdkp);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 5eea762f84d1..84685168b6e0 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -150,6 +150,7 @@ struct scsi_disk {
 	unsigned	urswrz : 1;
 	unsigned	security : 1;
 	unsigned	ignore_medium_access_errors : 1;
+	bool		rscs : 1; /* reduced stream control support */
 };
 #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev)
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index b9230b6add04..2dd96ae101e1 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -153,6 +153,7 @@ struct scsi_device {
 	struct scsi_vpd __rcu *vpd_pgb0;
 	struct scsi_vpd __rcu *vpd_pgb1;
 	struct scsi_vpd __rcu *vpd_pgb2;
+	struct scsi_vpd __rcu *vpd_pgb7;
 
 	struct scsi_target      *sdev_target;
 


* [PATCH v3 06/14] scsi_proto: Add structures and constants related to I/O groups and streams
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (4 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 05/14] scsi: core: Query the Block Limits Extension VPD page Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 07/14] sd: Translate data lifetime information Bart Van Assche
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, James E.J. Bottomley

Prepare for adding code that will query the I/O advice hints group
descriptors and for adding code that will retrieve the stream status.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/scsi/scsi_proto.h | 75 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index 07d65c1f59db..9ee4983c23b4 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -10,6 +10,7 @@
 #ifndef _SCSI_PROTO_H_
 #define _SCSI_PROTO_H_
 
+#include <linux/build_bug.h>
 #include <linux/types.h>
 
 /*
@@ -126,6 +127,7 @@
 #define	SAI_READ_CAPACITY_16  0x10
 #define SAI_GET_LBA_STATUS    0x12
 #define SAI_REPORT_REFERRALS  0x13
+#define SAI_GET_STREAM_STATUS 0x16
 /* values for maintenance in */
 #define MI_REPORT_IDENTIFYING_INFORMATION 0x05
 #define MI_REPORT_TARGET_PGS  0x0a
@@ -275,6 +277,79 @@ struct scsi_lun {
 	__u8 scsi_lun[8];
 };
 
+/* SBC-5 IO advice hints group descriptor */
+struct scsi_io_group_descriptor {
+#if defined(__BIG_ENDIAN)
+	u8 io_advice_hints_mode: 2;
+	u8 reserved1: 3;
+	u8 st_enble: 1;
+	u8 cs_enble: 1;
+	u8 ic_enable: 1;
+#elif defined(__LITTLE_ENDIAN)
+	u8 ic_enable: 1;
+	u8 cs_enble: 1;
+	u8 st_enble: 1;
+	u8 reserved1: 3;
+	u8 io_advice_hints_mode: 2;
+#else
+#error
+#endif
+	u8 reserved2[3];
+	/* Logical block markup descriptor */
+#if defined(__BIG_ENDIAN)
+	u8 acdlu: 1;
+	u8 reserved3: 1;
+	u8 rlbsr: 2;
+	u8 lbm_descriptor_type: 4;
+#elif defined(__LITTLE_ENDIAN)
+	u8 lbm_descriptor_type: 4;
+	u8 rlbsr: 2;
+	u8 reserved3: 1;
+	u8 acdlu: 1;
+#else
+#error
+#endif
+	u8 params[2];
+	u8 reserved4;
+	u8 reserved5[8];
+};
+
+static_assert(sizeof(struct scsi_io_group_descriptor) == 16);
+
+struct scsi_stream_status {
+#if defined(__BIG_ENDIAN)
+	u16 perm: 1;
+	u16 reserved1: 15;
+#elif defined(__LITTLE_ENDIAN)
+	u16 reserved1: 15;
+	u16 perm: 1;
+#else
+#error
+#endif
+	__be16 stream_identifier;
+#if defined(__BIG_ENDIAN)
+	u8 reserved2: 2;
+	u8 rel_lifetime: 6;
+#elif defined(__LITTLE_ENDIAN)
+	u8 rel_lifetime: 6;
+	u8 reserved2: 2;
+#else
+#error
+#endif
+	u8 reserved3[3];
+};
+
+static_assert(sizeof(struct scsi_stream_status) == 8);
+
+struct scsi_stream_status_header {
+	__be32 len;	/* length in bytes of stream_status[] array. */
+	u16 reserved;
+	__be16 number_of_open_streams;
+	DECLARE_FLEX_ARRAY(struct scsi_stream_status, stream_status);
+};
+
+static_assert(sizeof(struct scsi_stream_status_header) == 8);
+
 /* SPC asymmetric access states */
 #define SCSI_ACCESS_STATE_OPTIMAL     0x00
 #define SCSI_ACCESS_STATE_ACTIVE      0x01


* [PATCH v3 07/14] sd: Translate data lifetime information
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (5 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 06/14] scsi_proto: Add structures and constants related to I/O groups and streams Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 22:43   ` kernel test robot
  2023-10-17 20:47 ` [PATCH v3 08/14] scsi_debug: Reduce code duplication Bart Van Assche
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Damien Le Moal,
	James E.J. Bottomley

T10 recently standardized SBC constrained streams. This mechanism makes it
possible to pass data lifetime information to SCSI devices in the GROUP
NUMBER field. Add support for translating write hint information into a
permanent stream number in the sd driver. Use WRITE(10) instead of
WRITE(6) if data lifetime information is present because the WRITE(6)
command does not have a GROUP NUMBER field.
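
The translation can be sketched in user-space C as follows. This is a
minimal illustration, not the driver code: struct stream_caps,
group_number() and fill_write10_group() are hypothetical names, and the
sketch assumes the clamping rule used by sd_group_number() in this patch
(lifetime limited by the permanent stream count and by the six-bit field
width).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scsi_disk fields the computation reads. */
struct stream_caps {
	bool rscs;			/* reduced stream control supported */
	uint16_t permanent_stream_count;
};

/* Clamp the lifetime hint to the permanent stream count and to six bits. */
static uint8_t group_number(const struct stream_caps *caps, uint32_t lifetime)
{
	uint32_t n = lifetime;

	if (!caps->rscs)
		return 0;	/* 0 means "no relative lifetime" */
	if (n > caps->permanent_stream_count)
		n = caps->permanent_stream_count;
	if (n > 0x3f)
		n = 0x3f;
	return (uint8_t)n;
}

/* Fill only the WRITE(10) bytes relevant here: opcode and GROUP NUMBER. */
static void fill_write10_group(uint8_t cdb[10], const struct stream_caps *caps,
			       uint32_t lifetime)
{
	cdb[0] = 0x2a;				/* WRITE(10) opcode */
	cdb[6] = group_number(caps, lifetime);	/* byte 6: GROUP NUMBER */
}
```

With four permanent streams, a lifetime hint of 2 maps to group number 2
and a hint of 5 is clamped to 4. Without RSCS the field stays zero.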

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/sd.c | 98 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/scsi/sd.h |  2 +
 2 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 879edbc1a065..8b6e6e4d0f51 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -47,6 +47,7 @@
 #include <linux/blkpg.h>
 #include <linux/blk-pm.h>
 #include <linux/delay.h>
+#include <linux/rw_hint.h>
 #include <linux/major.h>
 #include <linux/mutex.h>
 #include <linux/string_helpers.h>
@@ -1001,12 +1002,38 @@ static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
 	return BLK_STS_OK;
 }
 
+/**
+ * sd_group_number() - Compute the GROUP NUMBER field
+ * @cmd: SCSI command for which to compute the value of the six-bit GROUP NUMBER
+ *	field.
+ *
+ * From SBC-5 r05 (https://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc5r05.pdf):
+ * 0: no relative lifetime.
+ * 1: shortest relative lifetime.
+ * 2: second shortest relative lifetime.
+ * 3 - 0x3d: intermediate relative lifetimes.
+ * 0x3e: second longest relative lifetime.
+ * 0x3f: longest relative lifetime.
+ */
+static u8 sd_group_number(struct scsi_cmnd *cmd)
+{
+	const struct request *rq = scsi_cmd_to_rq(cmd);
+	struct scsi_disk *sdkp = scsi_disk(rq->q->disk);
+
+	if (!sdkp->rscs)
+		return 0;
+
+	return min3((u32)rq->lifetime, (u32)sdkp->permanent_stream_count,
+		    0x3fu);
+}
+
 static blk_status_t sd_setup_rw32_cmnd(struct scsi_cmnd *cmd, bool write,
 				       sector_t lba, unsigned int nr_blocks,
 				       unsigned char flags, unsigned int dld)
 {
 	cmd->cmd_len = SD_EXT_CDB_SIZE;
 	cmd->cmnd[0]  = VARIABLE_LENGTH_CMD;
+	cmd->cmnd[6]  = sd_group_number(cmd);
 	cmd->cmnd[7]  = 0x18; /* Additional CDB len */
 	cmd->cmnd[9]  = write ? WRITE_32 : READ_32;
 	cmd->cmnd[10] = flags;
@@ -1025,7 +1052,7 @@ static blk_status_t sd_setup_rw16_cmnd(struct scsi_cmnd *cmd, bool write,
 	cmd->cmd_len  = 16;
 	cmd->cmnd[0]  = write ? WRITE_16 : READ_16;
 	cmd->cmnd[1]  = flags | ((dld >> 2) & 0x01);
-	cmd->cmnd[14] = (dld & 0x03) << 6;
+	cmd->cmnd[14] = ((dld & 0x03) << 6) | sd_group_number(cmd);
 	cmd->cmnd[15] = 0;
 	put_unaligned_be64(lba, &cmd->cmnd[2]);
 	put_unaligned_be32(nr_blocks, &cmd->cmnd[10]);
@@ -1040,7 +1067,7 @@ static blk_status_t sd_setup_rw10_cmnd(struct scsi_cmnd *cmd, bool write,
 	cmd->cmd_len = 10;
 	cmd->cmnd[0] = write ? WRITE_10 : READ_10;
 	cmd->cmnd[1] = flags;
-	cmd->cmnd[6] = 0;
+	cmd->cmnd[6] = sd_group_number(cmd);
 	cmd->cmnd[9] = 0;
 	put_unaligned_be32(lba, &cmd->cmnd[2]);
 	put_unaligned_be16(nr_blocks, &cmd->cmnd[7]);
@@ -1177,7 +1204,7 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
 					 protect | fua, dld);
 	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
-		   sdp->use_10_for_rw || protect) {
+		   sdp->use_10_for_rw || protect || rq->lifetime) {
 		ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks,
 					 protect | fua);
 	} else {
@@ -2912,6 +2939,70 @@ sd_read_cache_type(struct scsi_disk *sdkp, unsigned char *buffer)
 	sdkp->DPOFUA = 0;
 }
 
+static bool sd_is_perm_stream(struct scsi_disk *sdkp, unsigned stream_id)
+{
+	u8 cdb[16] = { SERVICE_ACTION_IN_16, SAI_GET_STREAM_STATUS };
+	struct {
+		struct scsi_stream_status_header h;
+		struct scsi_stream_status s;
+	} buf;
+	struct scsi_device *sdev = sdkp->device;
+	struct scsi_sense_hdr sshdr;
+	const struct scsi_exec_args exec_args = {
+		.sshdr = &sshdr,
+	};
+	int res;
+
+	put_unaligned_be16(stream_id, &cdb[4]);
+	put_unaligned_be32(sizeof(buf), &cdb[10]);
+
+	res = scsi_execute_cmd(sdev, cdb, REQ_OP_DRV_IN, &buf, sizeof(buf),
+			       SD_TIMEOUT, sdkp->max_retries, &exec_args);
+	if (res < 0)
+		return false;
+	if (scsi_status_is_check_condition(res) && scsi_sense_valid(&sshdr))
+		sd_print_sense_hdr(sdkp, &sshdr);
+	if (res)
+		return false;
+	if (get_unaligned_be32(&buf.h.len) < sizeof(struct scsi_stream_status))
+		return false;
+	return buf.h.stream_status[0].perm;
+}
+
+static void sd_read_io_hints(struct scsi_disk *sdkp, unsigned char *buffer)
+{
+	struct scsi_device *sdp = sdkp->device;
+	const struct scsi_io_group_descriptor *desc, *start, *end;
+	struct scsi_sense_hdr sshdr;
+	struct scsi_mode_data data;
+	int res;
+
+	res = scsi_mode_sense(sdp, /*dbd=*/0x8, /*modepage=*/0x0a,
+			      /*subpage=*/0x05, buffer, SD_BUF_SIZE,
+			      SD_TIMEOUT, sdkp->max_retries, &data, &sshdr);
+	if (res < 0)
+		return;
+	start = (void *)buffer + data.header_length + 16;
+	end = (void *)buffer + ALIGN_DOWN(data.header_length + data.length,
+					  sizeof(*end));
+	/*
+	 * From "SBC-5 Constrained Streams with Data Lifetimes": Device servers
+	 * should assign the lowest numbered stream identifiers to permanent
+	 * streams.
+	 */
+	for (desc = start; desc < end; desc++)
+		if (!desc->st_enble || !sd_is_perm_stream(sdkp, desc - start))
+			break;
+	sdkp->permanent_stream_count = desc - start;
+	if (sdkp->rscs && sdkp->permanent_stream_count < 2)
+		sd_printk(KERN_INFO, sdkp,
+			  "Unexpected: RSCS has been set and the permanent stream count is %u\n",
+			  sdkp->permanent_stream_count);
+	else if (sdkp->permanent_stream_count)
+		sd_printk(KERN_INFO, sdkp, "permanent stream count = %d\n",
+			  sdkp->permanent_stream_count);
+}
+
 /*
  * The ATO bit indicates whether the DIF application tag is available
  * for use by the operating system.
@@ -3395,6 +3486,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 
 		sd_read_write_protect_flag(sdkp, buffer);
 		sd_read_cache_type(sdkp, buffer);
+		sd_read_io_hints(sdkp, buffer);
 		sd_read_app_tag_own(sdkp, buffer);
 		sd_read_write_same(sdkp, buffer);
 		sd_read_security(sdkp, buffer);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 84685168b6e0..570d5a72749a 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -125,6 +125,8 @@ struct scsi_disk {
 	unsigned int	physical_block_size;
 	unsigned int	max_medium_access_timeouts;
 	unsigned int	medium_access_timed_out;
+			/* number of permanent streams */
+	u16		permanent_stream_count;
 	u8		media_present;
 	u8		write_prot;
 	u8		protection_type;/* Data Integrity Field */


* [PATCH v3 08/14] scsi_debug: Reduce code duplication
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (6 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 07/14] sd: Translate data lifetime information Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 09/14] scsi_debug: Support the block limits extension VPD page Bart Van Assche
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Avri Altman, Douglas Gilbert,
	James E.J. Bottomley

All VPD pages have the page code in byte one. Reduce code duplication by
storing the VPD page code once.

Reviewed-by: Avri Altman <avri.altman@wdc.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 9c0af50501f9..46eaa2f9e63b 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -1598,7 +1598,8 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		u32 len;
 		char lu_id_str[6];
 		int host_no = devip->sdbg_host->shost->host_no;
-		
+
+		arr[1] = cmd[2];
 		port_group_id = (((host_no + 1) & 0x7f) << 8) +
 		    (devip->channel & 0x7f);
 		if (sdebug_vpd_use_hostno == 0)
@@ -1609,7 +1610,6 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 				 (devip->target * 1000) - 3;
 		len = scnprintf(lu_id_str, 6, "%d", lu_id_num);
 		if (0 == cmd[2]) { /* supported vital product data pages */
-			arr[1] = cmd[2];	/*sanity */
 			n = 4;
 			arr[n++] = 0x0;   /* this page */
 			arr[n++] = 0x80;  /* unit serial number */
@@ -1630,23 +1630,18 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 			}
 			arr[3] = n - 4;	  /* number of supported VPD pages */
 		} else if (0x80 == cmd[2]) { /* unit serial number */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = len;
 			memcpy(&arr[4], lu_id_str, len);
 		} else if (0x83 == cmd[2]) { /* device identification */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = inquiry_vpd_83(&arr[4], port_group_id,
 						target_dev_id, lu_id_num,
 						lu_id_str, len,
 						&devip->lu_name);
 		} else if (0x84 == cmd[2]) { /* Software interface ident. */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = inquiry_vpd_84(&arr[4]);
 		} else if (0x85 == cmd[2]) { /* Management network addresses */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = inquiry_vpd_85(&arr[4]);
 		} else if (0x86 == cmd[2]) { /* extended inquiry */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = 0x3c;	/* number of following entries */
 			if (sdebug_dif == T10_PI_TYPE3_PROTECTION)
 				arr[4] = 0x4;	/* SPT: GRD_CHK:1 */
@@ -1656,30 +1651,23 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 				arr[4] = 0x0;   /* no protection stuff */
 			arr[5] = 0x7;   /* head of q, ordered + simple q's */
 		} else if (0x87 == cmd[2]) { /* mode page policy */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = 0x8;	/* number of following entries */
 			arr[4] = 0x2;	/* disconnect-reconnect mp */
 			arr[6] = 0x80;	/* mlus, shared */
 			arr[8] = 0x18;	 /* protocol specific lu */
 			arr[10] = 0x82;	 /* mlus, per initiator port */
 		} else if (0x88 == cmd[2]) { /* SCSI Ports */
-			arr[1] = cmd[2];	/*sanity */
 			arr[3] = inquiry_vpd_88(&arr[4], target_dev_id);
 		} else if (is_disk_zbc && 0x89 == cmd[2]) { /* ATA info */
-			arr[1] = cmd[2];        /*sanity */
 			n = inquiry_vpd_89(&arr[4]);
 			put_unaligned_be16(n, arr + 2);
 		} else if (is_disk_zbc && 0xb0 == cmd[2]) { /* Block limits */
-			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b0(&arr[4]);
 		} else if (is_disk_zbc && 0xb1 == cmd[2]) { /* Block char. */
-			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b1(devip, &arr[4]);
 		} else if (is_disk && 0xb2 == cmd[2]) { /* LB Prov. */
-			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b2(&arr[4]);
 		} else if (is_zbc && cmd[2] == 0xb6) { /* ZB dev. charact. */
-			arr[1] = cmd[2];        /*sanity */
 			arr[3] = inquiry_vpd_b6(devip, &arr[4]);
 		} else {
 			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, -1);


* [PATCH v3 09/14] scsi_debug: Support the block limits extension VPD page
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (7 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 08/14] scsi_debug: Reduce code duplication Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 10/14] scsi_debug: Rework page code error handling Bart Van Assche
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

From SBC-5 r05:

"Reduced stream control:
a) reduces the maximum number of streams that the device server supports;
   and
b) increases the number of write commands that are able to specify a stream
   to be written in any write command that contains the GROUP NUMBER field
   in its CDB.

If the RSCS bit (see 6.6.5) is set to one, then the device server shall:
a) support per group stream identifier usage as described in 4.32.2;
b) support the IO Advice Hints Grouping mode page (see 6.5.7); and
c) set the MAXIMUM NUMBER OF STREAMS field (see 6.6.5) to a value that is
   less than 64.

Device servers that set the RSCS bit to one may support other features
(e.g., permanent streams (see 4.32.4)).

4.32.4 Permanent streams

A permanent stream is a stream for which the device server does not allow
closing or otherwise modifying the configuration of that stream. The PERM
bit (see 5.9.2.3) indicates whether a stream is a permanent stream. If a
STREAM CONTROL command (see 5.32) specifies the closing of a permanent
stream, the device server terminates that command with CHECK CONDITION
status instead of closing the specified stream. A permanent stream is always
an open stream. Device servers should assign the lowest numbered stream
identifiers to permanent streams."

Report that reduced stream control is supported.
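
For illustration, the response that this patch makes scsi_debug return can
be sketched as a stand-alone C function. build_vpd_b7() is a hypothetical
name. The byte offsets (page code in byte 1, page length in byte 3, RSCS
in bit 0 of byte 5) are taken from the diff below and are otherwise
assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build a minimal Block Limits Extension VPD page
 * that reports RSCS = 1. Returns the number of bytes written, or 0 if
 * the buffer is too small.
 */
static size_t build_vpd_b7(uint8_t *arr, size_t size)
{
	if (size < 6)
		return 0;
	memset(arr, 0, 6);
	arr[1] = 0xb7;	/* page code */
	arr[3] = 2;	/* page length: bytes following the 4-byte header */
	arr[5] = 1;	/* reduced stream control supported (RSCS) */
	return 6;
}
```

An initiator that reads this page back sees a two-byte payload whose last
byte has bit 0 set.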

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 46eaa2f9e63b..88cba9374166 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -1627,6 +1627,7 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 					arr[n++] = 0xb2;  /* LB Provisioning */
 				if (is_zbc)
 					arr[n++] = 0xb6;  /* ZB dev. char. */
+				arr[n++] = 0xb7;  /* Block limits extension */
 			}
 			arr[3] = n - 4;	  /* number of supported VPD pages */
 		} else if (0x80 == cmd[2]) { /* unit serial number */
@@ -1669,6 +1670,9 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 			arr[3] = inquiry_vpd_b2(&arr[4]);
 		} else if (is_zbc && cmd[2] == 0xb6) { /* ZB dev. charact. */
 			arr[3] = inquiry_vpd_b6(devip, &arr[4]);
+		} else if (cmd[2] == 0xb7) { /* block limits extension page */
+			arr[3] = 2; /* page length */
+			arr[5] = 1; /* Reduced stream control support (RSCS) */
 		} else {
 			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, -1);
 			kfree(arr);


* [PATCH v3 10/14] scsi_debug: Rework page code error handling
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (8 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 09/14] scsi_debug: Support the block limits extension VPD page Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 11/14] scsi_debug: Rework subpage " Bart Van Assche
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

Instead of tracking whether or not the page code is valid in a boolean
variable, jump to error handling code if an unsupported page code is
encountered.
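
The pattern can be sketched outside the driver as shown below. The
function name, page lengths and return values are made up for this
example; only the control flow (a shared error label instead of a boolean
flag) mirrors the change.

```c
#include <stdbool.h>

/*
 * Hypothetical example: return a page length for supported page codes and
 * -1 for unsupported ones. Every rejection jumps to one shared label.
 */
static int page_len_or_error(int pcode, bool is_disk)
{
	int len;

	switch (pcode) {
	case 0x1:		/* supported for all device types */
		len = 12;
		break;
	case 0x3:		/* supported for disks only */
		if (!is_disk)
			goto bad_pcode;
		len = 24;
		break;
	default:
		goto bad_pcode;
	}
	return len;

bad_pcode:
	/* Single place to build the error response. */
	return -1;
}
```

Keeping the error response in one place means a new case only has to add
a goto, not another assignment to a flag that is tested later.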

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 88cba9374166..6b87d267c9c5 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2327,7 +2327,7 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 	unsigned char *ap;
 	unsigned char arr[SDEBUG_MAX_MSENSE_SZ];
 	unsigned char *cmd = scp->cmnd;
-	bool dbd, llbaa, msense_6, is_disk, is_zbc, bad_pcode;
+	bool dbd, llbaa, msense_6, is_disk, is_zbc;
 
 	dbd = !!(cmd[1] & 0x8);		/* disable block descriptors */
 	pcontrol = (cmd[2] & 0xc0) >> 6;
@@ -2391,7 +2391,6 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
 		return check_condition_result;
 	}
-	bad_pcode = false;
 
 	switch (pcode) {
 	case 0x1:	/* Read-Write error recovery page, direct access */
@@ -2406,15 +2405,17 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		if (is_disk) {
 			len = resp_format_pg(ap, pcontrol, target);
 			offset += len;
-		} else
-			bad_pcode = true;
+		} else {
+			goto bad_pcode;
+		}
 		break;
 	case 0x8:	/* Caching page, direct access */
 		if (is_disk || is_zbc) {
 			len = resp_caching_pg(ap, pcontrol, target);
 			offset += len;
-		} else
-			bad_pcode = true;
+		} else {
+			goto bad_pcode;
+		}
 		break;
 	case 0xa:	/* Control Mode page, all devices */
 		len = resp_ctrl_m_pg(ap, pcontrol, target);
@@ -2467,18 +2468,17 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		}
 		break;
 	default:
-		bad_pcode = true;
-		break;
-	}
-	if (bad_pcode) {
-		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, 5);
-		return check_condition_result;
+		goto bad_pcode;
 	}
 	if (msense_6)
 		arr[0] = offset - 1;
 	else
 		put_unaligned_be16((offset - 2), arr + 0);
 	return fill_from_dev_buffer(scp, arr, min_t(u32, alloc_len, offset));
+
+bad_pcode:
+	mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, 5);
+	return check_condition_result;
 }
 
 #define SDEBUG_MAX_MSELECT_SZ 512


* [PATCH v3 11/14] scsi_debug: Rework subpage code error handling
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (9 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 10/14] scsi_debug: Rework page code error handling Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 23:05   ` kernel test robot
  2023-10-17 20:47 ` [PATCH v3 12/14] scsi_debug: Implement the IO Advice Hints Grouping mode page Bart Van Assche
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

Move the subpage code checks into the switch statement to make it easier
to add support for new page code / subpage code combinations.
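
A reduced sketch of the resulting structure is shown below. The helper
names and return codes are hypothetical. The subpage ranges for page codes
0x0a and 0x19 follow the checks in the diff, where subpage 0xff selects
all subpages and is always accepted.

```c
#include <stdbool.h>

/* True if subpcode is neither within 0..max_supported nor 0xff. */
static bool subpage_unsupported(unsigned int subpcode,
				unsigned int max_supported)
{
	return subpcode > max_supported && subpcode < 0xff;
}

/* Returns 0 if accepted, -1 for a bad page code, -2 for a bad subpage. */
static int check_mode_page(unsigned int pcode, unsigned int subpcode)
{
	switch (pcode) {
	case 0x0a:	/* only subpage 0 (or 0xff) accepted */
		if (subpage_unsupported(subpcode, 0x0))
			goto bad_subpcode;
		break;
	case 0x19:	/* subpages 0..2 (or 0xff) accepted */
		if (subpage_unsupported(subpcode, 0x2))
			goto bad_subpcode;
		break;
	default:
		return -1;
	}
	return 0;

bad_subpcode:
	return -2;
}
```

Because each case owns its subpage check, accepting a new page code and
subpage code combination only changes that one case.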

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 70 ++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 6b87d267c9c5..a96eb0d10346 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2386,22 +2386,22 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		ap = arr + offset;
 	}
 
-	if ((subpcode > 0x0) && (subpcode < 0xff) && (0x19 != pcode)) {
-		/* TODO: Control Extension page */
-		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
-		return check_condition_result;
-	}
-
 	switch (pcode) {
 	case 0x1:	/* Read-Write error recovery page, direct access */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		len = resp_err_recov_pg(ap, pcontrol, target);
 		offset += len;
 		break;
 	case 0x2:	/* Disconnect-Reconnect page, all devices */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		len = resp_disconnect_pg(ap, pcontrol, target);
 		offset += len;
 		break;
 	case 0x3:       /* Format device page, direct access */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		if (is_disk) {
 			len = resp_format_pg(ap, pcontrol, target);
 			offset += len;
@@ -2410,6 +2410,8 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		}
 		break;
 	case 0x8:	/* Caching page, direct access */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		if (is_disk || is_zbc) {
 			len = resp_caching_pg(ap, pcontrol, target);
 			offset += len;
@@ -2418,14 +2420,14 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		}
 		break;
 	case 0xa:	/* Control Mode page, all devices */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		len = resp_ctrl_m_pg(ap, pcontrol, target);
 		offset += len;
 		break;
 	case 0x19:	/* if spc==1 then sas phy, control+discover */
-		if ((subpcode > 0x2) && (subpcode < 0xff)) {
-			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
-			return check_condition_result;
-		}
+		if (subpcode > 0x2 && subpcode < 0xff)
+			goto bad_subpcode;
 		len = 0;
 		if ((0x0 == subpcode) || (0xff == subpcode))
 			len += resp_sas_sf_m_pg(ap + len, pcontrol, target);
@@ -2437,35 +2439,31 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		offset += len;
 		break;
 	case 0x1c:	/* Informational Exceptions Mode page, all devices */
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
 		len = resp_iec_m_pg(ap, pcontrol, target);
 		offset += len;
 		break;
 	case 0x3f:	/* Read all Mode pages */
-		if ((0 == subpcode) || (0xff == subpcode)) {
-			len = resp_err_recov_pg(ap, pcontrol, target);
-			len += resp_disconnect_pg(ap + len, pcontrol, target);
-			if (is_disk) {
-				len += resp_format_pg(ap + len, pcontrol,
-						      target);
-				len += resp_caching_pg(ap + len, pcontrol,
-						       target);
-			} else if (is_zbc) {
-				len += resp_caching_pg(ap + len, pcontrol,
-						       target);
-			}
-			len += resp_ctrl_m_pg(ap + len, pcontrol, target);
-			len += resp_sas_sf_m_pg(ap + len, pcontrol, target);
-			if (0xff == subpcode) {
-				len += resp_sas_pcd_m_spg(ap + len, pcontrol,
-						  target, target_dev_id);
-				len += resp_sas_sha_m_spg(ap + len, pcontrol);
-			}
-			len += resp_iec_m_pg(ap + len, pcontrol, target);
-			offset += len;
-		} else {
-			mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
-			return check_condition_result;
+		if (subpcode > 0x0 && subpcode < 0xff)
+			goto bad_subpcode;
+		len = resp_err_recov_pg(ap, pcontrol, target);
+		len += resp_disconnect_pg(ap + len, pcontrol, target);
+		if (is_disk) {
+			len += resp_format_pg(ap + len, pcontrol, target);
+			len += resp_caching_pg(ap + len, pcontrol, target);
+		} else if (is_zbc) {
+			len += resp_caching_pg(ap + len, pcontrol, target);
+		}
+		len += resp_ctrl_m_pg(ap + len, pcontrol, target);
+		len += resp_sas_sf_m_pg(ap + len, pcontrol, target);
+		if (0xff == subpcode) {
+			len += resp_sas_pcd_m_spg(ap + len, pcontrol, target,
+						  target_dev_id);
+			len += resp_sas_sha_m_spg(ap + len, pcontrol);
 		}
+		len += resp_iec_m_pg(ap + len, pcontrol, target);
+		offset += len;
 		break;
 	default:
 		goto bad_pcode;
@@ -2479,6 +2477,10 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 bad_pcode:
 	mk_sense_invalid_fld(scp, SDEB_IN_CDB, 2, 5);
 	return check_condition_result;
+
+bad_subpcode:
+	mk_sense_invalid_fld(scp, SDEB_IN_CDB, 3, -1);
+	return check_condition_result;
 }
 
 #define SDEBUG_MAX_MSELECT_SZ 512

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 12/14] scsi_debug: Implement the IO Advice Hints Grouping mode page
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (10 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 11/14] scsi_debug: Rework subpage " Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 13/14] scsi_debug: Implement GET STREAM STATUS Bart Van Assche
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

Implement an IO Advice Hints Grouping mode page with three permanent
streams. A permanent stream is a stream for which the device server does
not allow closing or otherwise modifying the configuration of that
stream. The stream identifier enable (ST_ENBLE) bit specifies whether
the stream identifier may be used in the GROUP NUMBER field of SCSI
WRITE commands.
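
The size arithmetic behind this page can be sketched in userspace C as
shown below. This is a simplified model under stated assumptions: the
helper build_grouping_page() is hypothetical, and ST_ENBLE_FLAG is a
placeholder value that marks "enabled" in the first byte of a descriptor,
not the on-wire bit position defined by SBC-5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	MAX_STREAMS	= 4,	/* descriptors in the page, as in the patch */
	PERM_STREAMS	= 3,	/* descriptors with the enable flag set */
	HEADER_SIZE	= 16,	/* 4-byte page header + 12 reserved bytes */
	DESCR_SIZE	= 16,	/* size of one group descriptor */
	PAGE_BYTES	= HEADER_SIZE + MAX_STREAMS * DESCR_SIZE, /* 80 */
	ST_ENBLE_FLAG	= 0x01	/* placeholder, NOT the real bit layout */
};

/* Fill buf with the modelled page; returns the number of bytes written. */
static size_t build_grouping_page(uint8_t *buf, size_t buflen)
{
	/* PAGE LENGTH excludes the first four bytes of the page. */
	const uint16_t page_length = PAGE_BYTES - 4;
	int i;

	assert(buflen >= PAGE_BYTES);
	memset(buf, 0, PAGE_BYTES);
	buf[0] = 0x0a;			/* page code, as in the patch */
	buf[1] = 0x05;			/* subpage code */
	buf[2] = page_length >> 8;	/* big-endian page length */
	buf[3] = page_length & 0xff;
	for (i = 0; i < MAX_STREAMS; i++)
		buf[HEADER_SIZE + i * DESCR_SIZE] =
			i < PERM_STREAMS ? ST_ENBLE_FLAG : 0;
	return PAGE_BYTES;
}
```

The 80-byte total matches the 16 + 4 * 16 size that the patch checks with
BUILD_BUG_ON(), and the page length field therefore holds 76.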

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 42 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index a96eb0d10346..d56989e94c4a 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2241,6 +2241,36 @@ static int resp_ctrl_m_pg(unsigned char *p, int pcontrol, int target)
 	return sizeof(ctrl_m_pg);
 }
 
+enum { MAXIMUM_NUMBER_OF_STREAMS = 4 };
+
+/* IO Advice Hints Grouping mode page */
+static int resp_grouping_m_pg(unsigned char *p, int pcontrol, int target)
+{
+	/* IO Advice Hints Grouping mode page */
+	struct grouping_m_pg {
+		u8 page_code;
+		u8 subpage_code;
+		__be16 page_length;
+		u8 reserved[12];
+		struct scsi_io_group_descriptor
+			descr[MAXIMUM_NUMBER_OF_STREAMS];
+	};
+	static const struct grouping_m_pg gr_m_pg = {
+		.page_code = 0xa,
+		.subpage_code = 5,
+		.page_length = cpu_to_be16(sizeof(gr_m_pg) - 4),
+		.descr = {
+			{ .st_enble = 1 },
+			{ .st_enble = 1 },
+			{ .st_enble = 1 },
+			{ .st_enble = 0 },
+		}
+	};
+
+	BUILD_BUG_ON(sizeof(struct grouping_m_pg) != 16 + 4 * 16);
+	memcpy(p, &gr_m_pg, sizeof(gr_m_pg));
+	return sizeof(gr_m_pg);
+}
 
 static int resp_iec_m_pg(unsigned char *p, int pcontrol, int target)
 {	/* Informational Exceptions control mode page for mode_sense */
@@ -2420,9 +2450,17 @@ static int resp_mode_sense(struct scsi_cmnd *scp,
 		}
 		break;
 	case 0xa:	/* Control Mode page, all devices */
-		if (subpcode > 0x0 && subpcode < 0xff)
+		switch (subpcode) {
+		case 0:
+		case 0xff:
+			len = resp_ctrl_m_pg(ap, pcontrol, target);
+			break;
+		case 0x05:
+			len = resp_grouping_m_pg(ap, pcontrol, target);
+			break;
+		default:
 			goto bad_subpcode;
-		len = resp_ctrl_m_pg(ap, pcontrol, target);
+		}
 		offset += len;
 		break;
 	case 0x19:	/* if spc==1 then sas phy, control+discover */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 13/14] scsi_debug: Implement GET STREAM STATUS
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (11 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 12/14] scsi_debug: Implement the IO Advice Hints Grouping mode page Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-17 20:47 ` [PATCH v3 14/14] scsi_debug: Maintain write statistics per group number Bart Van Assche
  2023-10-18 19:09 ` [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Jens Axboe
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

Implement the GET STREAM STATUS SCSI command. Report that the first
three stream indexes correspond to permanent streams.
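
A userspace sketch of how such a response could be assembled follows. It
is a model, not the driver code: build_stream_status() and the put_be*()
helpers are hypothetical, and the descriptor byte offsets (PERM in bit 0
of byte 0, stream identifier in bytes 2-3, relative lifetime in byte 4)
are assumptions about the descriptor layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_STREAMS = 4, PERM_STREAMS = 3 };

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = v >> 8;
	p[1] = v & 0xff;
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = v >> 24;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}

/*
 * Build an 8-byte header followed by one 8-byte descriptor per stream,
 * starting at first_id. Returns the number of valid bytes in arr.
 */
static uint32_t build_stream_status(uint8_t *arr, uint32_t arr_len,
				    uint16_t first_id, uint32_t alloc_len)
{
	uint32_t limit = alloc_len < arr_len ? alloc_len : arr_len;
	uint32_t offset = 8;
	uint16_t id;

	assert(arr_len >= 8);
	memset(arr, 0, arr_len);	/* reserved bytes must read as zero */

	for (id = first_id; id < MAX_STREAMS && offset + 8 <= limit;
	     id++, offset += 8) {
		uint8_t *d = arr + offset;

		d[0] = id < PERM_STREAMS;	/* assumed PERM position */
		put_be16(d + 2, id);		/* stream identifier */
		d[4] = (uint8_t)(id + 1);	/* relative lifetime */
	}
	put_be32(arr, offset - 8);	/* PARAMETER DATA LENGTH */
	return offset;
}
```

For first_id = 0 and a 256-byte allocation length this yields four
descriptors, of which the first three are marked permanent.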

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 44 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index d56989e94c4a..801448570960 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -481,6 +481,8 @@ static int resp_write_scat(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_start_stop(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_readcap16(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_get_lba_status(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_get_stream_status(struct scsi_cmnd *scp,
+				  struct sdebug_dev_info *devip);
 static int resp_report_tgtpgs(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_unmap(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_rsup_opcodes(struct scsi_cmnd *, struct sdebug_dev_info *);
@@ -555,6 +557,9 @@ static const struct opcode_info_t sa_in_16_iarr[] = {
 	{0, 0x9e, 0x12, F_SA_LOW | F_D_IN, resp_get_lba_status, NULL,
 	    {16,  0x12, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
 	     0xff, 0xff, 0xff, 0, 0xc7} },	/* GET LBA STATUS(16) */
+	{0, 0x9e, 0x16, F_SA_LOW | F_D_IN, resp_get_stream_status, NULL,
+	    {16, 0x16, 0, 0, 0xff, 0xff, 0, 0, 0, 0, 0xff, 0xff, 0xff, 0xff,
+	     0, 0} },	/* GET STREAM STATUS */
 };
 
 static const struct opcode_info_t vl_iarr[] = {	/* VARIABLE LENGTH */
@@ -2241,7 +2246,7 @@ static int resp_ctrl_m_pg(unsigned char *p, int pcontrol, int target)
 	return sizeof(ctrl_m_pg);
 }
 
-enum { MAXIMUM_NUMBER_OF_STREAMS = 4 };
+enum { MAXIMUM_NUMBER_OF_STREAMS = 4, PERMANENT_STREAM_COUNT = 3 };
 
 /* IO Advice Hints Grouping mode page */
 static int resp_grouping_m_pg(unsigned char *p, int pcontrol, int target)
@@ -4236,6 +4241,43 @@ static int resp_get_lba_status(struct scsi_cmnd *scp,
 	return fill_from_dev_buffer(scp, arr, SDEBUG_GET_LBA_STATUS_LEN);
 }
 
+static int resp_get_stream_status(struct scsi_cmnd *scp,
+				  struct sdebug_dev_info *devip)
+{
+	u16 starting_stream_id, stream_id;
+	const u8 *cmd = scp->cmnd;
+	u32 alloc_len, offset;
+	u8 arr[256] = {};
+
+	starting_stream_id = get_unaligned_be16(cmd + 4);
+	alloc_len = get_unaligned_be32(cmd + 10);
+
+	if (alloc_len < 8) {
+		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 10, -1);
+		return check_condition_result;
+	}
+
+	if (starting_stream_id >= MAXIMUM_NUMBER_OF_STREAMS) {
+		mk_sense_invalid_fld(scp, SDEB_IN_CDB, 4, -1);
+		return check_condition_result;
+	}
+
+	for (offset = 8, stream_id = starting_stream_id;
+	     offset + 8 <= min_t(u32, alloc_len, sizeof(arr)) &&
+		     stream_id < MAXIMUM_NUMBER_OF_STREAMS;
+	     offset += 8, stream_id++) {
+		struct scsi_stream_status *stream_status = (void *)arr + offset;
+
+		stream_status->perm = stream_id < PERMANENT_STREAM_COUNT;
+		put_unaligned_be16(stream_id,
+				   &stream_status->stream_identifier);
+		stream_status->rel_lifetime = stream_id + 1;
+	}
+	put_unaligned_be32(offset - 8, arr + 0); /* PARAMETER DATA LENGTH */
+
+	return fill_from_dev_buffer(scp, arr, min(offset, alloc_len));
+}
+
 static int resp_sync_cache(struct scsi_cmnd *scp,
 			   struct sdebug_dev_info *devip)
 {

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 14/14] scsi_debug: Maintain write statistics per group number
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (12 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 13/14] scsi_debug: Implement GET STREAM STATUS Bart Van Assche
@ 2023-10-17 20:47 ` Bart Van Assche
  2023-10-18 19:09 ` [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Jens Axboe
  14 siblings, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-17 20:47 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Bart Van Assche, Douglas Gilbert,
	James E.J. Bottomley

Track per GROUP NUMBER how many write commands have been processed. Make
this information available in sysfs. Reset these statistics if any data
is written into the sysfs attribute.

Note: SCSI devices should only interpret the information in the GROUP
NUMBER field as a stream identifier if the ST_ENBLE bit has been set to
one. This patch follows a simpler approach: count the number of writes
per GROUP NUMBER whether or not the group number represents a stream
identifier.
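
The counting scheme can be modelled in userspace C along the following
lines. This is a sketch under assumptions: the helper names are
hypothetical, C11 atomics stand in for the kernel's atomic_long_t, and
only WRITE(10) and WRITE(16) are decoded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { NUM_GROUPS = 64 };	/* the GROUP NUMBER field is six bits wide */

/* Static storage, so all counters start at zero. */
static atomic_long writes_by_group[NUM_GROUPS];

/* Extract the GROUP NUMBER field; unknown opcodes map to group 0. */
static uint8_t group_number_from_cdb(const uint8_t *cdb)
{
	switch (cdb[0]) {
	case 0x2a:	/* WRITE(10): group number in byte 6 */
		return cdb[6] & 0x3f;
	case 0x8a:	/* WRITE(16): group number in byte 14 */
		return cdb[14] & 0x3f;
	default:
		return 0;
	}
}

/* Count one write, whether or not the group number denotes a stream. */
static void account_write(const uint8_t *cdb)
{
	atomic_fetch_add(&writes_by_group[group_number_from_cdb(cdb)], 1);
}

static long writes_for_group(unsigned int group)
{
	assert(group < NUM_GROUPS);
	return atomic_load(&writes_by_group[group]);
}

/* Models a write to the sysfs attribute: clear every counter. */
static void reset_write_stats(void)
{
	int i;

	for (i = 0; i < NUM_GROUPS; i++)
		atomic_store(&writes_by_group[i], 0);
}
```

Masking with 0x3f keeps the index below 64, so the array access needs no
further bounds check in this model.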

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 51 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 801448570960..c2102c0046ad 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -846,6 +846,8 @@ static int sdeb_zbc_nr_conv = DEF_ZBC_NR_CONV_ZONES;
 static int submit_queues = DEF_SUBMIT_QUEUES;  /* > 1 for multi-queue (mq) */
 static int poll_queues; /* iouring iopoll interface.*/
 
+static atomic_long_t writes_by_group_number[64];
+
 static char sdebug_proc_name[] = MY_NAME;
 static const char *my_name = MY_NAME;
 
@@ -3040,7 +3042,8 @@ static inline struct sdeb_store_info *devip2sip(struct sdebug_dev_info *devip,
 
 /* Returns number of bytes copied or -1 if error. */
 static int do_device_access(struct sdeb_store_info *sip, struct scsi_cmnd *scp,
-			    u32 sg_skip, u64 lba, u32 num, bool do_write)
+			    u32 sg_skip, u64 lba, u32 num, bool do_write,
+			    u8 group_number)
 {
 	int ret;
 	u64 block, rest = 0;
@@ -3059,6 +3062,10 @@ static int do_device_access(struct sdeb_store_info *sip, struct scsi_cmnd *scp,
 		return 0;
 	if (scp->sc_data_direction != dir)
 		return -1;
+
+	if (do_write && group_number < ARRAY_SIZE(writes_by_group_number))
+		atomic_long_inc(&writes_by_group_number[group_number]);
+
 	fsp = sip->storep;
 
 	block = do_div(lba, sdebug_store_sectors);
@@ -3432,7 +3439,7 @@ static int resp_read_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		}
 	}
 
-	ret = do_device_access(sip, scp, 0, lba, num, false);
+	ret = do_device_access(sip, scp, 0, lba, num, false, 0);
 	sdeb_read_unlock(sip);
 	if (unlikely(ret == -1))
 		return DID_ERROR << 16;
@@ -3617,6 +3624,7 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 {
 	bool check_prot;
 	u32 num;
+	u8 group = 0;
 	u32 ei_lba;
 	int ret;
 	u64 lba;
@@ -3628,11 +3636,13 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		ei_lba = 0;
 		lba = get_unaligned_be64(cmd + 2);
 		num = get_unaligned_be32(cmd + 10);
+		group = cmd[14] & 0x3f;
 		check_prot = true;
 		break;
 	case WRITE_10:
 		ei_lba = 0;
 		lba = get_unaligned_be32(cmd + 2);
+		group = cmd[6] & 0x3f;
 		num = get_unaligned_be16(cmd + 7);
 		check_prot = true;
 		break;
@@ -3647,15 +3657,18 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		ei_lba = 0;
 		lba = get_unaligned_be32(cmd + 2);
 		num = get_unaligned_be32(cmd + 6);
+		group = cmd[10] & 0x3f;
 		check_prot = true;
 		break;
 	case 0x53:	/* XDWRITEREAD(10) */
 		ei_lba = 0;
 		lba = get_unaligned_be32(cmd + 2);
+		group = cmd[6] & 0x1f;
 		num = get_unaligned_be16(cmd + 7);
 		check_prot = false;
 		break;
 	default:	/* assume WRITE(32) */
+		group = cmd[6] & 0x3f;
 		lba = get_unaligned_be64(cmd + 12);
 		ei_lba = get_unaligned_be32(cmd + 20);
 		num = get_unaligned_be32(cmd + 28);
@@ -3710,7 +3723,7 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 		}
 	}
 
-	ret = do_device_access(sip, scp, 0, lba, num, true);
+	ret = do_device_access(sip, scp, 0, lba, num, true, group);
 	if (unlikely(scsi_debug_lbp()))
 		map_region(sip, lba, num);
 	/* If ZBC zone then bump its write pointer */
@@ -3762,12 +3775,14 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 	u32 lb_size = sdebug_sector_size;
 	u32 ei_lba;
 	u64 lba;
+	u8 group;
 	int ret, res;
 	bool is_16;
 	static const u32 lrd_size = 32; /* + parameter list header size */
 
 	if (cmd[0] == VARIABLE_LENGTH_CMD) {
 		is_16 = false;
+		group = cmd[6] & 0x3f;
 		wrprotect = (cmd[10] >> 5) & 0x7;
 		lbdof = get_unaligned_be16(cmd + 12);
 		num_lrd = get_unaligned_be16(cmd + 16);
@@ -3778,6 +3793,7 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 		lbdof = get_unaligned_be16(cmd + 4);
 		num_lrd = get_unaligned_be16(cmd + 8);
 		bt_len = get_unaligned_be32(cmd + 10);
+		group = cmd[14] & 0x3f;
 		if (unlikely(have_dif_prot)) {
 			if (sdebug_dif == T10_PI_TYPE2_PROTECTION &&
 			    wrprotect) {
@@ -3866,7 +3882,8 @@ static int resp_write_scat(struct scsi_cmnd *scp,
 			}
 		}
 
-		ret = do_device_access(sip, scp, sg_off, lba, num, true);
+		ret = do_device_access(sip, scp, sg_off, lba, num, true,
+				       group);
 		/* If ZBC zone then bump its write pointer */
 		if (sdebug_dev_is_zoned(devip))
 			zbc_inc_wp(devip, lba, num);
@@ -6828,6 +6845,31 @@ static ssize_t tur_ms_to_ready_show(struct device_driver *ddp, char *buf)
 }
 static DRIVER_ATTR_RO(tur_ms_to_ready);
 
+static ssize_t group_number_stats_show(struct device_driver *ddp, char *buf)
+{
+	char *p = buf, *end = buf + PAGE_SIZE;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(writes_by_group_number); i++)
+		p += scnprintf(p, end - p, "%d %ld\n", i,
+			       atomic_long_read(&writes_by_group_number[i]));
+
+	return p - buf;
+}
+
+static ssize_t group_number_stats_store(struct device_driver *ddp,
+					const char *buf,
+					size_t count)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(writes_by_group_number); i++)
+		atomic_long_set(&writes_by_group_number[i], 0);
+
+	return count;
+}
+static DRIVER_ATTR_RW(group_number_stats);
+
 /* Note: The following array creates attribute files in the
    /sys/bus/pseudo/drivers/scsi_debug directory. The advantage of these
    files (over those found in the /sys/module/scsi_debug/parameters
@@ -6874,6 +6916,7 @@ static struct attribute *sdebug_drv_attrs[] = {
 	&driver_attr_cdb_len.attr,
 	&driver_attr_tur_ms_to_ready.attr,
 	&driver_attr_zbc.attr,
+	&driver_attr_group_number_stats.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(sdebug_drv);

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 07/14] sd: Translate data lifetime information
  2023-10-17 20:47 ` [PATCH v3 07/14] sd: Translate data lifetime information Bart Van Assche
@ 2023-10-17 22:43   ` kernel test robot
  0 siblings, 0 replies; 29+ messages in thread
From: kernel test robot @ 2023-10-17 22:43 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: oe-kbuild-all

Hi Bart,

kernel test robot noticed the following build warnings:

[auto build test WARNING on jejb-scsi/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/fs-Move-enum-rw_hint-into-a-new-header-file/20231018-045042
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20231017204739.3409052-8-bvanassche%40acm.org
patch subject: [PATCH v3 07/14] sd: Translate data lifetime information
reproduce: (https://download.01.org/0day-ci/archive/20231018/202310180631.1XPMVNAI-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310180631.1XPMVNAI-lkp@intel.com/

# many are suggestions rather than must-fix

WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#104: FILE: drivers/scsi/sd.c:2944:
+static bool sd_is_perm_stream(struct scsi_disk *sdkp, unsigned stream_id)

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 11/14] scsi_debug: Rework subpage code error handling
  2023-10-17 20:47 ` [PATCH v3 11/14] scsi_debug: Rework subpage " Bart Van Assche
@ 2023-10-17 23:05   ` kernel test robot
  0 siblings, 0 replies; 29+ messages in thread
From: kernel test robot @ 2023-10-17 23:05 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: oe-kbuild-all

Hi Bart,

kernel test robot noticed the following build warnings:

[auto build test WARNING on jejb-scsi/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/fs-Move-enum-rw_hint-into-a-new-header-file/20231018-045042
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20231017204739.3409052-12-bvanassche%40acm.org
patch subject: [PATCH v3 11/14] scsi_debug: Rework subpage code error handling
reproduce: (https://download.01.org/0day-ci/archive/20231018/202310180625.WS8qZdmF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310180625.WS8qZdmF-lkp@intel.com/

# many are suggestions rather than must-fix

WARNING:CONSTANT_COMPARISON: Comparisons should place the constant on the right side of the test
#123: FILE: drivers/scsi/scsi_debug.c:2460:
+		if (0xff == subpcode) {

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
                   ` (13 preceding siblings ...)
  2023-10-17 20:47 ` [PATCH v3 14/14] scsi_debug: Maintain write statistics per group number Bart Van Assche
@ 2023-10-18 19:09 ` Jens Axboe
  2023-10-18 19:34   ` Bart Van Assche
  2023-10-20 20:45   ` Bart Van Assche
  14 siblings, 2 replies; 29+ messages in thread
From: Jens Axboe @ 2023-10-18 19:09 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/17/23 2:47 PM, Bart Van Assche wrote:
> Hi Jens,
> 
> UFS vendors need the data lifetime information to achieve good performance.
> Without this information there is significantly higher write amplification due
> to garbage collection. Hence this patch series that add support in F2FS and
> also in the block layer for data lifetime information. The SCSI disk (sd)
> driver is modified such that it passes write hint information to SCSI devices
> via the GROUP NUMBER field.
> 
> Please consider this patch series for the next merge window.

My main hesitation with this is that there's a big gap between what
makes theoretical sense and practical sense. When we previously tried
this, turns out devices retained the data temperature on media, as
expected, but tossed it out when data was GC'ed. That made it more of a
benchmarking case than anything else. How do we know that things are
better now? In previous postings I've seen you point at some papers, but
I'm mostly concerned with practical use cases and devices. Are there any
results, at all, from that? Or is this a case of vendors asking for
something to check some marketing boxes or have value add?

I can take a closer look once this is fully understood. Not adding
something like this without proper justification.

I'm also really against growing struct bio just for this. Why is patch 2
not just using the ioprio field at least?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-18 19:09 ` [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Jens Axboe
@ 2023-10-18 19:34   ` Bart Van Assche
  2023-10-19  0:33     ` Damien Le Moal
  2023-10-20 20:45   ` Bart Van Assche
  1 sibling, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2023-10-18 19:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park


On 10/18/23 12:09, Jens Axboe wrote:
> My main hesitation with this is that there's a big gap between what
> makes theoretical sense and practical sense. When we previously tried
> this, turns out devices retained the data temperature on media, as
> expected, but tossed it out when data was GC'ed. That made it more of a
> benchmarking case than anything else. How do we know that things are
> better now? In previous postings I've seen you point at some papers, but
> I'm mostly concerned with practical use cases and devices. Are there any
> results, at all, from that? Or is this a case of vendors asking for
> something to check some marketing boxes or have value add?

Hi Jens,

Multiple UFS vendors made it clear to me that this feature is essential 
for their UFS devices to perform well. I will reach out to some of these
vendors off-list and will ask them to share performance numbers.

A note: permanent stream support was added only recently, in the latest
SCSI SBC-5 draft. This specification change allows SCSI device vendors to
interpret the GROUP NUMBER field as a data lifetime. UFS device vendors
have interpreted the GROUP NUMBER field as a data lifetime for a long
time, well before the SCSI standards allowed this. See also the
"ContextID" feature in the UFS specification. That feature is mentioned
in every version of the UFS specification I have access to, the oldest
being version 2.2, published in 2016
(https://www.jedec.org/system/files/docs/JESD220C-2_2.pdf). This document
is available free of charge after creating an account on the JEDEC
website.

> I'm also really against growing struct bio just for this. Why is patch 2
> not just using the ioprio field at least?

Hmm ... shouldn't the bits in the ioprio field in struct bio have the
same meaning as in the ioprio fields used in interfaces between user
space and the kernel? Damien Le Moal asked me not to use any of the
ioprio bits for passing data lifetime information from user space to the kernel.

Is it clear that the size of struct bio has not been changed because the
new bi_lifetime member fills a hole in struct bio?
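
To illustrate the hole argument, here is a minimal sketch with made-up
structs (these are not struct bio). It assumes an ABI where uint32_t is
4-byte aligned and a compiler that keeps members in declaration order;
the struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two bytes of padding follow 'flags' so that 'status' is aligned. */
struct demo_before {
	uint32_t opf;
	uint16_t flags;
	uint32_t status;
};

/* The new 16-bit member occupies what used to be padding. */
struct demo_after {
	uint32_t opf;
	uint16_t flags;
	uint16_t lifetime;
	uint32_t status;
};

/* Number of padding bytes between 'flags' and 'status'. */
static size_t hole_after_flags(void)
{
	return offsetof(struct demo_before, status) -
	       (offsetof(struct demo_before, flags) + sizeof(uint16_t));
}

/* Nonzero if adding the member left the struct size unchanged. */
static int size_unchanged(void)
{
	return sizeof(struct demo_before) == sizeof(struct demo_after);
}
```

The result depends on the member order being fixed; with a different
order or alignment the hole may not exist.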

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-18 19:34   ` Bart Van Assche
@ 2023-10-19  0:33     ` Damien Le Moal
  2023-10-19 16:48       ` Bart Van Assche
  0 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2023-10-19  0:33 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/19/23 04:34, Bart Van Assche wrote:
> 
> On 10/18/23 12:09, Jens Axboe wrote:
>> My main hesitation with this is that there's a big gap between what
>> makes theoretical sense and practical sense. When we previously tried
>> this, turns out devices retained the data temperature on media, as
>> expected, but tossed it out when data was GC'ed. That made it more of a
>> benchmarking case than anything else. How do we know that things are
>> better now? In previous postings I've seen you point at some papers, but
>> I'm mostly concerned with practical use cases and devices. Are there any
>> results, at all, from that? Or is this a case of vendors asking for
>> something to check some marketing boxes or have value add?
> 
> Hi Jens,
> 
> Multiple UFS vendors made it clear to me that this feature is essential 
> for their UFS devices to perform well. I will reach out to some of these
> vendors off-list and will ask them to share performance numbers.
> 
> A note: permanent stream support was added only recently, in the latest
> SCSI SBC-5 draft. This specification change allows SCSI device vendors to
> interpret the GROUP NUMBER field as a data lifetime. UFS device vendors
> have interpreted the GROUP NUMBER field as a data lifetime for a long
> time, well before the SCSI standards allowed this. See also the
> "ContextID" feature in the UFS specification. That feature is mentioned
> in every version of the UFS specification I have access to, the oldest
> being version 2.2, published in 2016
> (https://www.jedec.org/system/files/docs/JESD220C-2_2.pdf). This document
> is available free of charge after creating an account on the JEDEC
> website.
> 
>> I'm also really against growing struct bio just for this. Why is patch 2
>> not just using the ioprio field at least?
> 
> Hmm ... shouldn't the bits in the ioprio field in struct bio have the
> same meaning as in the ioprio fields used in interfaces between user
> space and the kernel? Damien Le Moal asked me not to use any of the
> ioprio bits for passing data lifetime information from user space to the kernel.

I said so in the context that if lifetime is a per-inode property, then ioprio
is the wrong interface since the ioprio API is per process or per IO. There is a
mismatch.

One version of your patch series used fcntl() to set the lifetime per inode,
which is fine, and then used the BIO ioprio to pass the lifetime down to the
device driver. That is in theory a nice trick, but it creates conflicts with
the userspace ioprio API if the user uses that at the same time.

So maybe we should change bio ioprio from int to u16 and use the freed-up u16
for lifetime. With that, things are cleanly separated without growing struct bio.

> 
> Is it clear that the size of struct bio has not been changed because the
> new bi_lifetime member fills a hole in struct bio?

When the struct is randomized, holes move or disappear. Don't count on that...

> 
> Thanks,
> 
> Bart.
> 
> 

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-19  0:33     ` Damien Le Moal
@ 2023-10-19 16:48       ` Bart Van Assche
  2023-10-19 22:40         ` Damien Le Moal
  0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2023-10-19 16:48 UTC (permalink / raw)
  To: Damien Le Moal, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/18/23 17:33, Damien Le Moal wrote:
> On 10/19/23 04:34, Bart Van Assche wrote:
>> On 10/18/23 12:09, Jens Axboe wrote:
>>> I'm also really against growing struct bio just for this. Why is patch 2
>>> not just using the ioprio field at least?
>>
>> Hmm ... shouldn't the bits in the ioprio field in struct bio have the
>> same meaning as in the ioprio fields used in interfaces between user
>> space and the kernel? Damien Le Moal asked me not to use any of the
>> ioprio bits for passing data lifetime information from user space to the kernel.
> 
> I said so in the context that if lifetime is a per-inode property, then ioprio
> is the wrong interface since the ioprio API is per process or per IO. There is a
> mismatch.
> 
> One version of your patch series used fcntl() to set the lifetime per inode,
> which is fine, and then used the BIO ioprio to pass the lifetime down to the
> device driver. That is in theory a nice trick, but it creates conflicts with
> the userspace ioprio API if the user uses that at the same time.
> 
> So maybe we should change bio ioprio from int to u16 and use the freed-up u16
> for lifetime. With that, things are cleanly separated without growing struct bio.

Hmm ... I think that bi_ioprio has been 16 bits wide since the 
introduction of that data structure member in 2016?

>> Is it clear that the size of struct bio has not been changed because the
>> new bi_lifetime member fills a hole in struct bio?
> 
> When the struct is randomized, holes move or disappear. Don't count on that...

We should aim to maximize performance for users who do not use data 
structure layout randomization.

Additionally, I doubt that anyone is using full structure layout 
randomization for SCSI devices. No SCSI driver has any 
__no_randomize_layout / __randomize_layout annotations although I'm sure 
there are plenty of data structures in SCSI drivers for which the layout 
matters.
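
The hole argument can be illustrated with a minimal sketch. The two structs
below are hypothetical stand-ins and not the real struct bio layout; they only
assume a 1-byte status member followed by a 4-byte aligned member, which leaves
padding that a 16-bit lifetime member can occupy without growing the struct on
common ABIs with natural alignment and no layout randomization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a few struct bio members (not the real layout). */
struct demo_bio_before {
	uint32_t opf;       /* offset 0 */
	uint16_t flags;     /* offset 4 */
	uint16_t ioprio;    /* offset 6 */
	uint8_t  status;    /* offset 8, followed by padding */
	uint32_t remaining; /* 4-byte aligned, so offset 12 */
};

/* Same members plus a 16-bit lifetime placed in the padding. */
struct demo_bio_after {
	uint32_t opf;
	uint16_t flags;
	uint16_t ioprio;
	uint8_t  status;
	uint16_t lifetime;  /* 2-byte aligned, lands at offset 10 */
	uint32_t remaining;
};

/* Holds on common ABIs with natural alignment; this is an assumption. */
static_assert(sizeof(struct demo_bio_before) == sizeof(struct demo_bio_after),
	      "lifetime is expected to fit in existing padding");

/* Number of padding bytes between status and remaining in the original. */
static inline size_t demo_hole_bytes(void)
{
	return offsetof(struct demo_bio_before, remaining) -
	       (offsetof(struct demo_bio_before, status) + sizeof(uint8_t));
}
```

With structure layout randomization the member order is no longer fixed, so
neither the offsets nor the size equality above can be relied upon.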

Thanks,

Bart.




* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-19 16:48       ` Bart Van Assche
@ 2023-10-19 22:40         ` Damien Le Moal
  2023-10-19 23:00           ` Damien Le Moal
  0 siblings, 1 reply; 29+ messages in thread
From: Damien Le Moal @ 2023-10-19 22:40 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/20/23 01:48, Bart Van Assche wrote:
> On 10/18/23 17:33, Damien Le Moal wrote:
>> On 10/19/23 04:34, Bart Van Assche wrote:
>>> On 10/18/23 12:09, Jens Axboe wrote:
>>>> I'm also really against growing struct bio just for this. Why is patch 2
>>>> not just using the ioprio field at least?
>>>
>>> Hmm ... shouldn't the bits in the ioprio field in struct bio have the
>>> same meaning as in the ioprio fields used in interfaces between user
>>> space and the kernel? Damien Le Moal asked me not to use any of the
>>> ioprio bits passing data lifetime information from user space to the kernel.
>>
>> I said so in the context that if lifetime is a per-inode property, then ioprio
>> is the wrong interface since the ioprio API is per process or per IO. There is a
>> mismatch.
>>
>> One version of your patch series used fcntl() to set the lifetime per inode,
>> which is fine, and then used the BIO ioprio to pass the lifetime down to the
>> device driver. That is in theory a nice trick, but that creates conflicts with
>> the userspace ioprio API if the user uses that at the same time.
>>
>> So maybe we should change bio ioprio from int to u16 and use the freed-up u16
>> for lifetime. With that, things are cleanly separated without growing struct bio.
> 
> Hmm ... I think that bi_ioprio has been 16 bits wide since the 
> introduction of that data structure member in 2016?

My bad. struct bio->bi_ioprio is an unsigned short. I got confused with the user
API and kernel functions using an int in many places. We really should change
the kernel functions to use unsigned short for ioprio everywhere.

>>> Is it clear that the size of struct bio has not been changed because the
>>> new bi_lifetime member fills a hole in struct bio?
>>
>> When the struct is randomized, holes move or disappear. Don't count on that...
> 
> We should aim to maximize performance for users who do not use data 
> structure layout randomization.
> 
> Additionally, I doubt that anyone is using full structure layout 
> randomization for SCSI devices. No SCSI driver has any 
> __no_randomize_layout / __randomize_layout annotations although I'm sure 
> there are plenty of data structures in SCSI drivers for which the layout 
> matters.

Well, if Jens is OK with adding another "unsigned short bi_lifetime" in a hole
in struct bio, that's fine with me. Otherwise, we are back to discussing how to
pack bi_ioprio in a sensible manner so that we do not create a mess between the
use cases and APIs:
1) inode based lifetime with FS setting up the bi_ioprio field
2) Direct IOs to files of an FS with lifetime set by user per IO (e.g.
aio/io_uring/ioprio_set()) and/or fcntl()
3) Direct IOs to raw block devices with lifetime set by user per IO (e.g.
aio/io_uring/ioprio_set())

Any of the above cases should also allow using ioprio class/level and CDL hint.

I think the most problematic part is (2) when lifetimes are set with both fcntl()
and per IO: which lifetime is the valid one? The one set with fcntl() or the
one specified for the IO? I think the former is the one we want here.

If we can clarify that, then I guess using 3 or 4 bits from the 10-bit ioprio
hint should be OK. That would give you 7 or 15 lifetime values. Enough, no?
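
A rough sketch of what such a packing could look like, assuming the 16-bit
ioprio layout of 3 class bits, 10 hint bits and 3 level bits. The DEMO_* names,
the helper functions and the choice of which hint bits carry the lifetime are
hypothetical and only serve the illustration; the low 3 hint bits are left
untouched here so that a CDL-style hint could coexist.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ioprio layout: [15:13] class, [12:3] hint, [2:0] level. */
#define DEMO_LEVEL_BITS     3
#define DEMO_HINT_BITS      10
#define DEMO_HINT_SHIFT     DEMO_LEVEL_BITS
#define DEMO_CLASS_SHIFT    (DEMO_HINT_SHIFT + DEMO_HINT_BITS)

/* Hypothetical: 4 lifetime bits placed above the low 3 hint bits. */
#define DEMO_LIFETIME_BITS  4
#define DEMO_LIFETIME_SHIFT (DEMO_HINT_SHIFT + 3)
#define DEMO_LIFETIME_MASK \
	(((1u << DEMO_LIFETIME_BITS) - 1) << DEMO_LIFETIME_SHIFT)

/* Store a lifetime value without disturbing class, level or other hint bits. */
static inline uint16_t demo_set_lifetime(uint16_t ioprio, unsigned int lifetime)
{
	return (uint16_t)((ioprio & ~DEMO_LIFETIME_MASK) |
			  ((lifetime << DEMO_LIFETIME_SHIFT) & DEMO_LIFETIME_MASK));
}

/* Extract the lifetime value; 0 is treated as "not set". */
static inline unsigned int demo_get_lifetime(uint16_t ioprio)
{
	return (ioprio & DEMO_LIFETIME_MASK) >> DEMO_LIFETIME_SHIFT;
}

/* Number of distinct non-zero lifetime values for a given bit count. */
static inline unsigned int demo_lifetime_values(unsigned int bits)
{
	return (1u << bits) - 1;
}
```

With 0 reserved for "not set", 3 bits leave 7 usable values and 4 bits leave 15,
which matches the numbers above.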

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-19 22:40         ` Damien Le Moal
@ 2023-10-19 23:00           ` Damien Le Moal
  0 siblings, 0 replies; 29+ messages in thread
From: Damien Le Moal @ 2023-10-19 23:00 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/20/23 07:40, Damien Le Moal wrote:
> On 10/20/23 01:48, Bart Van Assche wrote:
>> On 10/18/23 17:33, Damien Le Moal wrote:
>>> On 10/19/23 04:34, Bart Van Assche wrote:
>>>> On 10/18/23 12:09, Jens Axboe wrote:
>>>>> I'm also really against growing struct bio just for this. Why is patch 2
>>>>> not just using the ioprio field at least?
>>>>
>>>> Hmm ... shouldn't the bits in the ioprio field in struct bio have the
>>>> same meaning as in the ioprio fields used in interfaces between user
>>>> space and the kernel? Damien Le Moal asked me not to use any of the
>>>> ioprio bits passing data lifetime information from user space to the kernel.
>>>
>>> I said so in the context that if lifetime is a per-inode property, then ioprio
>>> is the wrong interface since the ioprio API is per process or per IO. There is a
>>> mismatch.
>>>
>>> One version of your patch series used fcntl() to set the lifetime per inode,
>>> which is fine, and then used the BIO ioprio to pass the lifetime down to the
>>> device driver. That is in theory a nice trick, but that creates conflicts with
>>> the userspace ioprio API if the user uses that at the same time.
>>>
>>> So maybe we should change bio ioprio from int to u16 and use the freed-up u16
>>> for lifetime. With that, things are cleanly separated without growing struct bio.
>>
>> Hmm ... I think that bi_ioprio has been 16 bits wide since the 
>> introduction of that data structure member in 2016?
> 
> My bad. struct bio->bi_ioprio is an unsigned short. I got confused with the user
> API and kernel functions using an int in many places. We really should change
> the kernel functions to use unsigned short for ioprio everywhere.
> 
>>>> Is it clear that the size of struct bio has not been changed because the
>>>> new bi_lifetime member fills a hole in struct bio?
>>>
>>> When the struct is randomized, holes move or disappear. Don't count on that...
>>
>> We should aim to maximize performance for users who do not use data 
>> structure layout randomization.
>>
>> Additionally, I doubt that anyone is using full structure layout 
>> randomization for SCSI devices. No SCSI driver has any 
>> __no_randomize_layout / __randomize_layout annotations although I'm sure 
>> there are plenty of data structures in SCSI drivers for which the layout 
>> matters.
> 
> Well, if Jens is OK with adding another "unsigned short bi_lifetime" in a hole
> in struct bio, that's fine with me. Otherwise, we are back to discussing how to
> pack bi_ioprio in a sensible manner so that we do not create a mess between the
> use cases and APIs:
> 1) inode based lifetime with FS setting up the bi_ioprio field
> 2) Direct IOs to files of an FS with lifetime set by user per IO (e.g.
> aio/io_uring/ioprio_set()) and/or fcntl()
> 3) Direct IOs to raw block devices with lifetime set by user per IO (e.g.
> aio/io_uring/ioprio_set())
> 
> Any of the above cases should also allow using ioprio class/level and CDL hint.
> 
> I think the most problematic part is (2) when lifetimes are set with both fcntl()
> and per IO: which lifetime is the valid one? The one set with fcntl() or the
> one specified for the IO? I think the former is the one we want here.
> 
> If we can clarify that, then I guess using 3 or 4 bits from the 10-bit ioprio
> hint should be OK. That would give you 7 or 15 lifetime values. Enough, no?

To be clear, we have to deal with these cases:
1) File IOs
  - User uses fcntl() only for lifetime
  - User uses per direct IO ioprio with lifetime (and maybe class/level/cdl)
  - User uses all of the above
2) Raw block device direct IOs
  - Per IO ioprio with lifetime (and maybe class/level/cdl)

(2) is easy. No real change needed besides the UFS driver bits.
But the cases for (1) need clarification about how things should work.

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH v3 00/14] Pass data temperature information to SCSI disk devices
  2023-10-18 19:09 ` [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Jens Axboe
  2023-10-18 19:34   ` Bart Van Assche
@ 2023-10-20 20:45   ` Bart Van Assche
  1 sibling, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-20 20:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park

On 10/18/23 12:09, Jens Axboe wrote:
> I'm also really against growing struct bio just for this. Why is patch 2
> not just using the ioprio field at least?

Hi Jens,

Can you please clarify whether your concern is about the size of struct 
bio only or also about the runtime impact of the comparisons that have 
been added in attempt_merge() and blk_rq_merge_ok()? It may be possible 
to eliminate the overhead of the new comparisons as follows:
* Introduce a union of struct { I/O priority; data lifetime; } and u32.
* Use that union in struct bio instead of bi_ioprio and bi_lifetime.
* Use that union in struct request instead of the ioprio and lifetime
   members.
* In attempt_merge() and blk_rq_merge_ok(), compare the u32 union member
   instead of comparing the I/O priority and data lifetime separately.
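
The steps above can be sketched as follows. This is only an illustration under
stated assumptions: the type and function names are hypothetical, both members
are assumed to be 16 bits wide, and the real merge checks compare more than
these two values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical union overlaying I/O priority and data lifetime with a u32. */
union demo_prio_lifetime {
	struct {
		uint16_t ioprio;   /* I/O priority */
		uint16_t lifetime; /* data lifetime */
	};
	uint32_t combined;         /* both members viewed as one value */
};

static_assert(sizeof(union demo_prio_lifetime) == sizeof(uint32_t),
	      "the two 16-bit members must exactly cover the u32");

/* Hypothetical stand-in for the relevant part of struct bio / struct request. */
struct demo_request {
	union demo_prio_lifetime pl;
};

/* One 32-bit comparison replaces separate ioprio and lifetime comparisons. */
static inline bool demo_merge_ok(const struct demo_request *a,
				 const struct demo_request *b)
{
	return a->pl.combined == b->pl.combined;
}
```

Because the struct has no padding, comparing the combined member is equivalent
to comparing both fields, so the merge path would keep a single comparison.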

Thanks,

Bart.


* Re: [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
  2023-10-17 20:47 ` [PATCH v3 01/14] fs: Move enum rw_hint into a new header file Bart Van Assche
@ 2023-10-30 11:11   ` Kanchan Joshi
  2023-10-30 16:10     ` Bart Van Assche
       [not found]     ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p1>
  0 siblings, 2 replies; 29+ messages in thread
From: Kanchan Joshi @ 2023-10-30 11:11 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Jan Kara, Christian Brauner, Jaegeuk Kim, Chao Yu,
	Alexander Viro, Jeff Layton, Chuck Lever

On 10/18/2023 2:17 AM, Bart Van Assche wrote:
> - * Write life time hint values.
> - * Stored in struct inode as u8.
> - */
> -enum rw_hint {
> -	WRITE_LIFE_NOT_SET	= 0,
> -	WRITE_LIFE_NONE		= RWH_WRITE_LIFE_NONE,
> -	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
> -	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
> -	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
> -	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
> -};
> -
>   /* Match RWF_* bits to IOCB bits */
>   #define IOCB_HIPRI		(__force int) RWF_HIPRI
>   #define IOCB_DSYNC		(__force int) RWF_DSYNC
> @@ -677,7 +665,7 @@ struct inode {
>   	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
>   	unsigned short          i_bytes;
>   	u8			i_blkbits;
> -	u8			i_write_hint;
> +	enum rw_hint		i_write_hint;
>   	blkcnt_t		i_blocks;
>   
>   #ifdef __NEED_I_SIZE_ORDERED
> diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
> new file mode 100644
> index 000000000000..4a7d28945973
> --- /dev/null
> +++ b/include/linux/rw_hint.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_RW_HINT_H
> +#define _LINUX_RW_HINT_H
> +
> +#include <linux/build_bug.h>
> +#include <linux/compiler_attributes.h>
> +
> +/* Block storage write lifetime hint values. */
> +enum rw_hint {
> +	WRITE_LIFE_NOT_SET	= 0, /* RWH_WRITE_LIFE_NOT_SET */
> +	WRITE_LIFE_NONE		= 1, /* RWH_WRITE_LIFE_NONE */
> +	WRITE_LIFE_SHORT	= 2, /* RWH_WRITE_LIFE_SHORT */
> +	WRITE_LIFE_MEDIUM	= 3, /* RWH_WRITE_LIFE_MEDIUM */
> +	WRITE_LIFE_LONG		= 4, /* RWH_WRITE_LIFE_LONG */
> +	WRITE_LIFE_EXTREME	= 5, /* RWH_WRITE_LIFE_EXTREME */
> +} __packed;
> +
> +static_assert(sizeof(enum rw_hint) == 1);

Does it make sense to do away with these, and have temperature-neutral 
names instead e.g., WRITE_LIFE_1, WRITE_LIFE_2?

With the current choice:
- If the count goes up (beyond 5 hints), infra can scale fine but these 
names do not. Imagine ULTRA_EXTREME after EXTREME.
- Applications or in-kernel users can specify LONG hint with data that 
actually has a SHORT lifetime. Nothing really ensures that LONG is 
really LONG.

Temperature-neutral names seem more generic/scalable and do not present 
the unnecessary need to be accurate with relative temperatures.


* Re: [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
  2023-10-30 11:11   ` Kanchan Joshi
@ 2023-10-30 16:10     ` Bart Van Assche
       [not found]     ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p1>
  1 sibling, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-10-30 16:10 UTC (permalink / raw)
  To: Kanchan Joshi, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Jan Kara, Christian Brauner, Jaegeuk Kim, Chao Yu,
	Alexander Viro, Jeff Layton, Chuck Lever

On 10/30/23 04:11, Kanchan Joshi wrote:
> On 10/18/2023 2:17 AM, Bart Van Assche wrote:
>> +/* Block storage write lifetime hint values. */
>> +enum rw_hint {
>> +	WRITE_LIFE_NOT_SET	= 0, /* RWH_WRITE_LIFE_NOT_SET */
>> +	WRITE_LIFE_NONE		= 1, /* RWH_WRITE_LIFE_NONE */
>> +	WRITE_LIFE_SHORT	= 2, /* RWH_WRITE_LIFE_SHORT */
>> +	WRITE_LIFE_MEDIUM	= 3, /* RWH_WRITE_LIFE_MEDIUM */
>> +	WRITE_LIFE_LONG		= 4, /* RWH_WRITE_LIFE_LONG */
>> +	WRITE_LIFE_EXTREME	= 5, /* RWH_WRITE_LIFE_EXTREME */
>> +} __packed;
>> +
>> +static_assert(sizeof(enum rw_hint) == 1);
> 
> Does it make sense to do away with these, and have temperature-neutral
> names instead e.g., WRITE_LIFE_1, WRITE_LIFE_2?
> 
> With the current choice:
> - If the count goes up (beyond 5 hints), infra can scale fine but these
> names do not. Imagine ULTRA_EXTREME after EXTREME.
> - Applications or in-kernel users can specify LONG hint with data that
> actually has a SHORT lifetime. Nothing really ensures that LONG is
> really LONG.
> 
> Temperature-neutral names seem more generic/scalable and do not present
> the unnecessary need to be accurate with relative temperatures.

Thanks for having taken a look at this patch series. Jens asked for data
that shows that this patch series improves performance. Is this
something Samsung can help with?

Thanks,

Bart.


* RE:(2) [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
       [not found]     ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p1>
@ 2023-11-01  6:39       ` Daejun Park
  2023-11-01 16:45         ` (2) " Bart Van Assche
       [not found]         ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p3>
  0 siblings, 2 replies; 29+ messages in thread
From: Daejun Park @ 2023-11-01  6:39 UTC (permalink / raw)
  To: Bart Van Assche, KANCHAN JOSHI, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Daejun Park, Jan Kara, Christian Brauner, Jaegeuk Kim, Chao Yu,
	Alexander Viro, Jeff Layton, Chuck Lever, Seonghun Kim, Jorn Lee,
	Sung-Jun Park, Hyunji Jeon, Dongwoo Kim, Seongcheol Hong,
	Jaeheon Lee, Wonjong Song, JinHwan Park, Yonggil Song,
	Soonyoung Kim, Shinwoo Park, Seokhwan Kim

Hi Bart,

>On 10/30/23 04:11, Kanchan Joshi wrote:
>> On 10/18/2023 2:17 AM, Bart Van Assche wrote:
>>> +/* Block storage write lifetime hint values. */
>>> +enum rw_hint {
>>> +        WRITE_LIFE_NOT_SET        = 0, /* RWH_WRITE_LIFE_NOT_SET */
>>> +        WRITE_LIFE_NONE                = 1, /* RWH_WRITE_LIFE_NONE */
>>> +        WRITE_LIFE_SHORT        = 2, /* RWH_WRITE_LIFE_SHORT */
>>> +        WRITE_LIFE_MEDIUM        = 3, /* RWH_WRITE_LIFE_MEDIUM */
>>> +        WRITE_LIFE_LONG                = 4, /* RWH_WRITE_LIFE_LONG */
>>> +        WRITE_LIFE_EXTREME        = 5, /* RWH_WRITE_LIFE_EXTREME */
>>> +} __packed;
>>> +
>>> +static_assert(sizeof(enum rw_hint) == 1);
>> 
>> Does it make sense to do away with these, and have temperature-neutral
>> names instead e.g., WRITE_LIFE_1, WRITE_LIFE_2?
>> 
>> With the current choice:
>> - If the count goes up (beyond 5 hints), infra can scale fine but these
>> names do not. Imagine ULTRA_EXTREME after EXTREME.
>> - Applications or in-kernel users can specify LONG hint with data that
>> actually has a SHORT lifetime. Nothing really ensures that LONG is
>> really LONG.
>> 
>> Temperature-neutral names seem more generic/scalable and do not present
>> the unnecessary need to be accurate with relative temperatures.
>
>Thanks for having taken a look at this patch series. Jens asked for data
>that shows that this patch series improves performance. Is this
>something Samsung can help with?

We analyzed the NAND block erase counter with and without stream separation
using a long-term workload in F2FS.
The analysis showed that the erase counter is reduced by approximately 40%
with stream separation.
The long-term workload is a scenario in which erase and write are repeated per
stream after performing a precondition fill for each temperature of F2FS.

Thanks,

Daejun.

>
>Thanks,
>
>Bart.
>
> 
> 


* Re: (2) [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
  2023-11-01  6:39       ` Daejun Park
@ 2023-11-01 16:45         ` Bart Van Assche
       [not found]         ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p3>
  1 sibling, 0 replies; 29+ messages in thread
From: Bart Van Assche @ 2023-11-01 16:45 UTC (permalink / raw)
  To: daejun7.park, KANCHAN JOSHI, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Jan Kara, Christian Brauner, Jaegeuk Kim, Chao Yu,
	Alexander Viro, Jeff Layton, Chuck Lever, Seonghun Kim, Jorn Lee,
	Sung-Jun Park, Hyunji Jeon, Dongwoo Kim, Seongcheol Hong,
	Jaeheon Lee, Wonjong Song, JinHwan Park, Yonggil Song,
	Soonyoung Kim, Shinwoo Park, Seokhwan Kim

On 10/31/23 23:39, Daejun Park wrote:
>> On 10/30/23 04:11, Kanchan Joshi wrote:
>>> On 10/18/2023 2:17 AM, Bart Van Assche wrote:
>> Thanks for having taken a look at this patch series. Jens asked for data
>> that shows that this patch series improves performance. Is this
>> something Samsung can help with?
> 
> We analyzed the NAND block erase counter with and without stream separation
> through a long-term workload in F2FS.
> The analysis showed that the erase counter is reduced by approximately 40%
> with stream separation.
> Long-term workload is a scenario where erase and write are repeated by
> stream after performing precondition fill for each temperature of F2FS.

Hi Daejun,

Thank you for having shared this data. This is very helpful. Since I'm
not familiar with the erase counter: does the above data perhaps mean
that write amplification is reduced by 40% in the workload that has been
examined?

Thanks,

Bart.



* RE:(2) (2) [PATCH v3 01/14] fs: Move enum rw_hint into a new header file
       [not found]         ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p3>
@ 2023-11-02  7:31           ` Daejun Park
  0 siblings, 0 replies; 29+ messages in thread
From: Daejun Park @ 2023-11-02  7:31 UTC (permalink / raw)
  To: Bart Van Assche, Daejun Park, KANCHAN JOSHI, Jens Axboe
  Cc: linux-block, linux-scsi, linux-fsdevel, Martin K . Petersen,
	Christoph Hellwig, Niklas Cassel, Avri Altman, Bean Huo,
	Jan Kara, Christian Brauner, Jaegeuk Kim, Chao Yu,
	Alexander Viro, Jeff Layton, Chuck Lever, Seonghun Kim, Jorn Lee,
	Sung-Jun Park, Hyunji Jeon, Dongwoo Kim, Seongcheol Hong,
	Jaeheon Lee, Wonjong Song, JinHwan Park, Yonggil Song,
	Soonyoung Kim, Shinwoo Park, Seokhwan Kim

Hi Bart,

>On 10/31/23 23:39, Daejun Park wrote:
>>> On 10/30/23 04:11, Kanchan Joshi wrote:
>>>> On 10/18/2023 2:17 AM, Bart Van Assche wrote:
>>> Thanks for having taken a look at this patch series. Jens asked for data
>>> that shows that this patch series improves performance. Is this
>>> something Samsung can help with?
>> 
>> We analyzed the NAND block erase counter with and without stream separation
>> through a long-term workload in F2FS.
>> The analysis showed that the erase counter is reduced by approximately 40%
>> with stream separation.
>> Long-term workload is a scenario where erase and write are repeated by
>> stream after performing precondition fill for each temperature of F2FS.
>
>Hi Daejun,
>
>Thank you for having shared this data. This is very helpful. Since I'm
>not familiar with the erase counter: does the above data perhaps mean
>that write amplification is reduced by 40% in the workload that has been
>examined?

WAF is not only caused by GC. It is also caused by other reasons.
During device GC, the valid pages in the victim block are migrated, and a
lower erase counter means that the effective GC is performed by selecting
a victim block with a small number of invalid pages.
Thus, it can be said that the WAF can be decreased by about 40% by selecting
fewer victim blocks during device GC.

Thanks,

Daejun

>
>Thanks,
>
>Bart.



Thread overview: 29+ messages
2023-10-17 20:47 [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 01/14] fs: Move enum rw_hint into a new header file Bart Van Assche
2023-10-30 11:11   ` Kanchan Joshi
2023-10-30 16:10     ` Bart Van Assche
     [not found]     ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p1>
2023-11-01  6:39       ` Daejun Park
2023-11-01 16:45         ` (2) " Bart Van Assche
     [not found]         ` <CGME20231017204823epcas5p2798d17757d381aaf7ad4dd235f3f0da3@epcms2p3>
2023-11-02  7:31           ` Daejun Park
2023-10-17 20:47 ` [PATCH v3 02/14] block: Restore data lifetime support in struct bio and struct request Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 03/14] fs: Restore write hint support Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 04/14] fs/f2fs: Restore data lifetime support Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 05/14] scsi: core: Query the Block Limits Extension VPD page Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 06/14] scsi_proto: Add structures and constants related to I/O groups and streams Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 07/14] sd: Translate data lifetime information Bart Van Assche
2023-10-17 22:43   ` kernel test robot
2023-10-17 20:47 ` [PATCH v3 08/14] scsi_debug: Reduce code duplication Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 09/14] scsi_debug: Support the block limits extension VPD page Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 10/14] scsi_debug: Rework page code error handling Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 11/14] scsi_debug: Rework subpage " Bart Van Assche
2023-10-17 23:05   ` kernel test robot
2023-10-17 20:47 ` [PATCH v3 12/14] scsi_debug: Implement the IO Advice Hints Grouping mode page Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 13/14] scsi_debug: Implement GET STREAM STATUS Bart Van Assche
2023-10-17 20:47 ` [PATCH v3 14/14] scsi_debug: Maintain write statistics per group number Bart Van Assche
2023-10-18 19:09 ` [PATCH v3 00/14] Pass data temperature information to SCSI disk devices Jens Axboe
2023-10-18 19:34   ` Bart Van Assche
2023-10-19  0:33     ` Damien Le Moal
2023-10-19 16:48       ` Bart Van Assche
2023-10-19 22:40         ` Damien Le Moal
2023-10-19 23:00           ` Damien Le Moal
2023-10-20 20:45   ` Bart Van Assche
