linux-block.vger.kernel.org archive mirror
* [RFC PATCH 00/18] blktrace: add blktrace extension support
@ 2019-05-01  4:28 Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 01/18] blktrace: increase the size of action mask Chaitanya Kulkarni
                   ` (17 more replies)
  0 siblings, 18 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Hi,

This patch series adds support to the blktrace infrastructure for
tracking more request-based flags and additional request fields.

In this series, we increase the size of the action and action mask
fields, and add priority and priority mask fields to the existing
infrastructure.

The userland tools part of the patch series follows this one; here is
the reference:
Chaitanya Kulkarni (10):
        blktrace.h: add blktrace extension to the header
        blktrace_api.h: update blktrace API header
        act-mask: add blktrace extension to act_mask
        blktrace.c: add support for extensions
        blkparse.c: add support for extensions
        blkparse-fmt.c: add extension support
        iowatcher/blkparse: add extension definitions
        blkiomon: add extension support
        blkrawverify: add extension support
        blktrace-tools: add extension support


The following is a detailed overview of how this patch series is
organized:

1. The first few patches focus on adding the block trace extension:

  blktrace: increase the size of action mask
  blktrace: add more definitions for BLK_TC_ACT
  blktrace: update trace to track more actions
  kernel/trace: add KConfig to enable blktrace_ext

2. The next set of patches adds support for tracking request-based
priority and allows the user to configure a request priority mask, just
like the action mask:

  blktrace: add iopriority mask
  blktrace: add iopriority mask
  blktrace: allow user to track iopriority
  blktrace: add sysfs ioprio mask
  blktrace: add debug support for extension

3. The following patches just set the bio priority so that blktrace will
not report a wrong priority while tracing bios:

  block: set ioprio for write-zeroes, discard etc 
  block: set ioprio for zone-reset
  block: set ioprio for flush bio 
  drivers: set bio iopriority field
  fs: set bio iopriority field
  power/swap: set bio iopriority field
  mm: set bio iopriority field

  Ideally, the above patches for the drivers and fs categories should be
  sent separately to the respective subsystems; for the purposes of RFC
  review I kept them all together.

4. The last two patches add module parameters to the null_blk driver
for discard and write-zeroes operations, which makes testing easier:

  null_blk: add write-zeroes flag to nullb_device
  null_blk: add module param discard/write-zeroes

P.S. I've not added the linux-btrace mailing list as I'm having trouble
subscribing to it.

This RFC is a little light on detail, but I would like to start the
discussion about how we should add extensions to the block trace
infrastructure to track more request operations and priorities.

Regards,
Chaitanya


Chaitanya Kulkarni (18):
  blktrace: increase the size of action mask
  blktrace: add more definitions for BLK_TC_ACT
  blktrace: update trace to track more actions
  kernel/trace: add KConfig to enable blktrace_ext
  blktrace: add iopriority mask
  blktrace: add iopriority mask
  blktrace: allow user to track iopriority
  blktrace: add sysfs ioprio mask
  blktrace: add debug support for extension
  block: set ioprio for write-zeroes, discard etc
  block: set ioprio for zone-reset
  block: set ioprio for flush bio
  drivers: set bio iopriority field
  fs: set bio iopriority field
  power/swap: set bio iopriority field
  mm: set bio iopriority field
  null_blk: add write-zeroes flag to nullb_device
  null_blk: add module param discard/write-zeroes

 block/blk-flush.c                   |   2 +
 block/blk-lib.c                     |   6 +
 block/blk-zoned.c                   |   2 +
 drivers/block/drbd/drbd_actlog.c    |   2 +
 drivers/block/drbd/drbd_bitmap.c    |   3 +
 drivers/block/null_blk.h            |   1 +
 drivers/block/null_blk_main.c       |  37 +++-
 drivers/block/xen-blkback/blkback.c |   3 +
 drivers/block/zram/zram_drv.c       |   2 +
 drivers/lightnvm/pblk-read.c        |   2 +
 drivers/lightnvm/pblk-write.c       |   1 +
 drivers/md/bcache/journal.c         |   2 +
 drivers/md/bcache/super.c           |   2 +
 drivers/md/dm-bufio.c               |   2 +
 drivers/md/dm-cache-target.c        |   1 +
 drivers/md/dm-io.c                  |   2 +
 drivers/md/dm-log-writes.c          |   5 +
 drivers/md/dm-thin.c                |   1 +
 drivers/md/dm-writecache.c          |   2 +
 drivers/md/dm-zoned-metadata.c      |   4 +
 drivers/md/md.c                     |   4 +
 drivers/md/raid5-cache.c            |   4 +
 drivers/md/raid5-ppl.c              |   3 +
 drivers/nvme/target/io-cmd-bdev.c   |   7 +
 drivers/staging/erofs/internal.h    |   3 +
 drivers/target/target_core_iblock.c |   3 +
 fs/btrfs/disk-io.c                  |   2 +
 fs/btrfs/extent_io.c                |   3 +
 fs/btrfs/raid56.c                   |   6 +
 fs/btrfs/scrub.c                    |   2 +
 fs/btrfs/volumes.c                  |   3 +
 fs/buffer.c                         |   2 +
 fs/crypto/bio.c                     |   3 +
 fs/direct-io.c                      |   2 +
 fs/ext4/page-io.c                   |   2 +
 fs/ext4/readpage.c                  |   1 +
 fs/f2fs/data.c                      |   3 +
 fs/f2fs/segment.c                   |   1 +
 fs/gfs2/lops.c                      |   2 +
 fs/gfs2/meta_io.c                   |   2 +
 fs/gfs2/ops_fstype.c                |   2 +
 fs/hfsplus/wrapper.c                |   2 +
 fs/iomap.c                          |   2 +
 fs/jfs/jfs_logmgr.c                 |   3 +
 fs/jfs/jfs_metapage.c               |   3 +
 fs/mpage.c                          |   1 +
 fs/nfs/blocklayout/blocklayout.c    |   2 +
 fs/nilfs2/segbuf.c                  |   2 +
 fs/ocfs2/cluster/heartbeat.c        |   2 +
 fs/xfs/xfs_aops.c                   |   3 +
 fs/xfs/xfs_buf.c                    |   2 +
 include/linux/blktrace_api.h        |  13 +-
 include/uapi/linux/blktrace_api.h   |  65 ++++--
 kernel/power/swap.c                 |   2 +
 kernel/trace/Kconfig                |  36 ++++
 kernel/trace/blktrace.c             | 323 +++++++++++++++++++++++++++-
 mm/page_io.c                        |   2 +
 57 files changed, 579 insertions(+), 26 deletions(-)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [RFC PATCH 01/18] blktrace: increase the size of action mask
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01 15:48   ` Bart Van Assche
  2019-05-01  4:28 ` [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT Chaitanya Kulkarni
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This patch starts adding blktrace extension support by increasing the
size of the action mask so that it can store more actions.
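
To illustrate why widening act_mask matters for the ioctl ABI, here is
a minimal userspace sketch, not kernel code. The struct names
setup_narrow and setup_wide are hypothetical stand-ins for the two
#ifdef variants of the setup structure, built from <stdint.h> types and
assuming BLKTRACE_BDEV_SIZE is 32. It only shows that the members after
act_mask move when the field grows, so binaries built against the two
variants cannot share one layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BDEV_NAME_SIZE 32	/* assumed value of BLKTRACE_BDEV_SIZE */

/* Hypothetical mirror of the setup struct without the extension. */
struct setup_narrow {
	char name[BDEV_NAME_SIZE];
	uint16_t act_mask;
	uint32_t buf_size;
	uint32_t buf_nr;
	uint64_t start_lba;
	uint64_t end_lba;
	uint32_t pid;
};

/* Hypothetical mirror of the setup struct with a 64-bit action mask. */
struct setup_wide {
	char name[BDEV_NAME_SIZE];
	uint64_t act_mask;
	uint32_t buf_size;
	uint32_t buf_nr;
	uint64_t start_lba;
	uint64_t end_lba;
	uint32_t pid;
};

/* act_mask itself starts right after the name in both variants. */
static_assert(offsetof(struct setup_narrow, act_mask) ==
	      offsetof(struct setup_wide, act_mask),
	      "act_mask starts at the same offset in both layouts");

/* Byte offset of buf_size in the layout without the extension. */
static size_t narrow_buf_size_offset(void)
{
	return offsetof(struct setup_narrow, buf_size);
}

/* Byte offset of buf_size in the layout with the extension. */
static size_t wide_buf_size_offset(void)
{
	return offsetof(struct setup_wide, buf_size);
}

/* Returns 1 when buf_size sits at different offsets in the two layouts. */
static int layouts_differ(void)
{
	return narrow_buf_size_offset() != wide_buf_size_offset();
}
```

Since the fields following act_mask shift, a tool built against one
variant would read buf_size and buf_nr from the wrong bytes of the
other, which is one reason the userland tools need a matching update.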

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 include/linux/blktrace_api.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 7bb2d8de9f30..403d4cfc6a52 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -17,7 +17,11 @@ struct blk_trace {
 	struct rchan *rchan;
 	unsigned long __percpu *sequence;
 	unsigned char __percpu *msg_data;
+#ifdef CONFIG_BLKTRACE_EXT
+	u64 act_mask;
+#else
 	u16 act_mask;
+#endif /* CONFIG_BLKTRACE_EXT */
 	u64 start_lba;
 	u64 end_lba;
 	u32 pid;
@@ -101,14 +105,20 @@ static inline int blk_trace_init_sysfs(struct device *dev)
 
 struct compat_blk_user_trace_setup {
 	char name[BLKTRACE_BDEV_SIZE];
+#ifdef CONFIG_BLKTRACE_EXT
+	u64 act_mask;
+#else
 	u16 act_mask;
+#endif /* CONFIG_BLKTRACE_EXT */
 	u32 buf_size;
 	u32 buf_nr;
 	compat_u64 start_lba;
 	compat_u64 end_lba;
 	u32 pid;
 };
-#define BLKTRACESETUP32 _IOWR(0x12, 115, struct compat_blk_user_trace_setup)
+
+/* XXX: temp work around for RFC */
+#define BLKTRACESETUP32 _IOWR(0x13, 115, struct compat_blk_user_trace_setup)
 
 #endif
 
-- 
2.19.1



* [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 01/18] blktrace: increase the size of action mask Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01 12:31   ` Christoph Hellwig
  2019-05-01  4:28 ` [RFC PATCH 03/18] blktrace: update trace to track more actions Chaitanya Kulkarni
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Now that we have increased the size of the action mask, we can safely
add new blktrace actions and extend the code to track more block layer
request flags.
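
As a rough illustration of the wider action word, the sketch below
mirrors the BLK_TC_ACT() idea in plain userspace C: the category bits
are widened to 64 bits and shifted into the upper half, while the
action id stays in the lower half. All X_-prefixed names and the helper
functions are hypothetical and exist only for this example; the bit
positions are copied from the enum in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of a few category bits from the patch, not the kernel header. */
#define X_TC_WRITE		(1u << 1)
#define X_TC_DISCARD		(1u << 13)
#define X_TC_WRITE_ZEROES	(1u << 16)
#define X_TC_ZONE_RESET		(1u << 17)

#define X_TC_SHIFT		32
/* Widen before shifting so the category lands in bits 32..63. */
#define X_TC_ACT(cat)		(((uint64_t)(cat)) << X_TC_SHIFT)

static_assert(sizeof(uint64_t) * 8 >= 2 * X_TC_SHIFT,
	      "action word must hold both the id and the category half");

/* Build an action word: low half is the action id, high half the categories. */
static uint64_t make_action(uint32_t action_id, uint32_t categories)
{
	return (uint64_t)action_id | X_TC_ACT(categories);
}

/* Recover the category bits from an action word. */
static uint32_t action_categories(uint64_t action)
{
	return (uint32_t)(action >> X_TC_SHIFT);
}

/*
 * Returns 1 when at least one category in the action is enabled in the
 * mask. Note the kernel helper in this series returns non-zero for the
 * opposite case (trace filtered out); this sketch inverts that for clarity.
 */
static int action_passes_mask(uint64_t act_mask, uint64_t action)
{
	return ((act_mask << X_TC_SHIFT) & action) != 0;
}
```

Without the cast inside X_TC_ACT(), shifting a 32-bit value by 32 would
be undefined in C, which is why the extended macro in the patch casts
to u64 first.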

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 include/uapi/linux/blktrace_api.h | 63 +++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 19 deletions(-)

diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
index 690621b610e5..c34cf752a9a1 100644
--- a/include/uapi/linux/blktrace_api.h
+++ b/include/uapi/linux/blktrace_api.h
@@ -8,30 +8,42 @@
  * Trace categories
  */
 enum blktrace_cat {
-	BLK_TC_READ	= 1 << 0,	/* reads */
-	BLK_TC_WRITE	= 1 << 1,	/* writes */
-	BLK_TC_FLUSH	= 1 << 2,	/* flush */
-	BLK_TC_SYNC	= 1 << 3,	/* sync IO */
-	BLK_TC_SYNCIO	= BLK_TC_SYNC,
-	BLK_TC_QUEUE	= 1 << 4,	/* queueing/merging */
-	BLK_TC_REQUEUE	= 1 << 5,	/* requeueing */
-	BLK_TC_ISSUE	= 1 << 6,	/* issue */
-	BLK_TC_COMPLETE	= 1 << 7,	/* completions */
-	BLK_TC_FS	= 1 << 8,	/* fs requests */
-	BLK_TC_PC	= 1 << 9,	/* pc requests */
-	BLK_TC_NOTIFY	= 1 << 10,	/* special message */
-	BLK_TC_AHEAD	= 1 << 11,	/* readahead */
-	BLK_TC_META	= 1 << 12,	/* metadata */
-	BLK_TC_DISCARD	= 1 << 13,	/* discard requests */
-	BLK_TC_DRV_DATA	= 1 << 14,	/* binary per-driver data */
-	BLK_TC_FUA	= 1 << 15,	/* fua requests */
-
-	BLK_TC_END	= 1 << 15,	/* we've run out of bits! */
+	BLK_TC_READ		= 1 << 0,	/* reads */
+	BLK_TC_WRITE		= 1 << 1,	/* writes */
+	BLK_TC_FLUSH		= 1 << 2,	/* flush */
+	BLK_TC_SYNC		= 1 << 3,	/* sync IO */
+	BLK_TC_SYNCIO		= BLK_TC_SYNC,
+	BLK_TC_QUEUE		= 1 << 4,	/* queueing/merging */
+	BLK_TC_REQUEUE		= 1 << 5,	/* requeueing */
+	BLK_TC_ISSUE		= 1 << 6,	/* issue */
+	BLK_TC_COMPLETE		= 1 << 7,	/* completions */
+	BLK_TC_FS		= 1 << 8,	/* fs requests */
+	BLK_TC_PC		= 1 << 9,	/* pc requests */
+	BLK_TC_NOTIFY		= 1 << 10,	/* special message */
+	BLK_TC_AHEAD		= 1 << 11,	/* readahead */
+	BLK_TC_META		= 1 << 12,	/* metadata */
+	BLK_TC_DISCARD		= 1 << 13,	/* discard requests */
+	BLK_TC_DRV_DATA		= 1 << 14,	/* binary per-driver data */
+	BLK_TC_FUA		= 1 << 15,	/* fua requests */
+
+#ifdef CONFIG_BLKTRACE_EXT
+	BLK_TC_WRITE_ZEROES	= 1 << 16,	/* write-zeroes */
+	BLK_TC_ZONE_RESET	= 1 << 17,	/* zone-reset */
+
+	BLK_TC_END		= 1 << 31,	/* we've run out of bits! */
+#else
+	BLK_TC_END		= 1 << 16,	/* we've run out of bits! */
+#endif /* CONFIG_BLKTRACE_EXT */
 };
 
+#ifdef CONFIG_BLKTRACE_EXT
+#define BLK_TC_SHIFT		(32)
+#define BLK_TC_ACT(act)		(((u64)act) << BLK_TC_SHIFT)
+#else
 #define BLK_TC_SHIFT		(16)
 #define BLK_TC_ACT(act)		((act) << BLK_TC_SHIFT)
 
+#endif /* CONFIG_BLKTRACE_EXT */
 /*
  * Basic trace actions
  */
@@ -93,7 +105,11 @@ enum blktrace_notify {
 #define BLK_TN_MESSAGE		(__BLK_TN_MESSAGE | BLK_TC_ACT(BLK_TC_NOTIFY))
 
 #define BLK_IO_TRACE_MAGIC	0x65617400
+#ifdef CONFIG_BLKTRACE_EXT
+#define BLK_IO_TRACE_VERSION	0x08
+#else
 #define BLK_IO_TRACE_VERSION	0x07
+#endif /* CONFIG_BLKTRACE_EXT */
 
 /*
  * The trace itself
@@ -104,7 +120,12 @@ struct blk_io_trace {
 	__u64 time;		/* in nanoseconds */
 	__u64 sector;		/* disk offset */
 	__u32 bytes;		/* transfer length */
+
+#ifdef CONFIG_BLKTRACE_EXT
+	__u64 action;		/* what happened */
+#else
 	__u32 action;		/* what happened */
+#endif /* CONFIG_BLKTRACE_EXT */
 	__u32 pid;		/* who did it */
 	__u32 device;		/* device number */
 	__u32 cpu;		/* on what cpu did it happen */
@@ -135,7 +156,11 @@ enum {
  */
 struct blk_user_trace_setup {
 	char name[BLKTRACE_BDEV_SIZE];	/* output */
+#ifdef CONFIG_BLKTRACE_EXT
+	__u64 act_mask;			/* input */
+#else
 	__u16 act_mask;			/* input */
+#endif /* CONFIG_BLKTRACE_EXT */
 	__u32 buf_size;			/* input */
 	__u32 buf_nr;			/* input */
 	__u64 start_lba;
-- 
2.19.1



* [RFC PATCH 03/18] blktrace: update trace to track more actions
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 01/18] blktrace: increase the size of action mask Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 04/18] kernel/trace: add KConfig to enable blktrace_ext Chaitanya Kulkarni
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

The existing blktrace API uses different data types when it comes to
tracking the action. Now that we have increased the size of the action
mask and of the action itself, update all the API prototypes to handle
an action of size u64. With the extension we can also track priorities,
so update the API to carry the priority value for logging.

Also, the previous patch added two new block trace categories for
REQ_OP_WRITE_ZEROES and REQ_OP_ZONE_RESET. Add them to
__blk_add_trace() so that the new operations can be tracked.
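
The effect of the new categories on the printed operation letters can
be sketched in userspace as follows. This is a simplified stand-in for
the branch order that the fill_rwbs() hunk below introduces, not the
kernel function itself: op_letters() and the X_-prefixed constants are
hypothetical, and flush, FUA, sync and metadata letters are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the category bits used here, not the kernel header. */
#define X_TC_WRITE		(1u << 1)
#define X_TC_DISCARD		(1u << 13)
#define X_TC_WRITE_ZEROES	(1u << 16)
#define X_TC_ZONE_RESET		(1u << 17)

/*
 * Write the operation letters for a category set into out (at least 3
 * bytes) and return out. The checks for the new categories come before
 * the plain write check, because a write-zeroes request is assumed to
 * carry the write category as well.
 */
static const char *op_letters(uint64_t tc, uint32_t bytes, char out[3])
{
	int i = 0;

	assert(out != NULL);

	if (tc & X_TC_DISCARD) {
		out[i++] = 'D';
	} else if (tc & X_TC_WRITE_ZEROES) {
		out[i++] = 'W';
		out[i++] = 'Z';
	} else if (tc & X_TC_ZONE_RESET) {
		out[i++] = 'Z';
		out[i++] = 'R';
	} else if (tc & X_TC_WRITE) {
		out[i++] = 'W';
	} else if (bytes) {
		out[i++] = 'R';
	} else {
		out[i++] = 'N';
	}
	out[i] = '\0';
	return out;
}
```

If the plain write check came first, a write-zeroes request would be
reported as an ordinary "W" and the new letters would never appear.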

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 kernel/trace/blktrace.c | 165 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 164 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index e1c6d79fb4cc..6d2b4adae76e 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -63,9 +63,16 @@ static void blk_unregister_tracepoints(void);
 /*
  * Send out a notify message.
  */
+#ifdef CONFIG_BLKTRACE_EXT
+static void trace_note(struct blk_trace *bt, pid_t pid, u64 action,
+		       const void *data, size_t len,
+		       union kernfs_node_id *cgid)
+
+#else
 static void trace_note(struct blk_trace *bt, pid_t pid, int action,
 		       const void *data, size_t len,
 		       union kernfs_node_id *cgid)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	struct blk_io_trace *t;
 	struct ring_buffer_event *event = NULL;
@@ -180,8 +187,14 @@ void __trace_note_message(struct blk_trace *bt, struct blkcg *blkcg,
 }
 EXPORT_SYMBOL_GPL(__trace_note_message);
 
+
+#ifdef CONFIG_BLKTRACE_EXT
+static int act_log_check(struct blk_trace *bt, u64 what, sector_t sector,
+			 pid_t pid)
+#else
 static int act_log_check(struct blk_trace *bt, u32 what, sector_t sector,
 			 pid_t pid)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	if (((bt->act_mask << BLK_TC_SHIFT) & what) == 0)
 		return 1;
@@ -196,8 +209,15 @@ static int act_log_check(struct blk_trace *bt, u32 what, sector_t sector,
 /*
  * Data direction bit lookup
  */
+
+#ifdef CONFIG_BLKTRACE_EXT
+static const u64 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
+                                BLK_TC_ACT(BLK_TC_WRITE) };
+
+#else
 static const u32 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
-				 BLK_TC_ACT(BLK_TC_WRITE) };
+                                 BLK_TC_ACT(BLK_TC_WRITE) };
+#endif /* CONFIG_BLKTRACE_EXT */
 
 #define BLK_TC_RAHEAD		BLK_TC_AHEAD
 #define BLK_TC_PREFLUSH		BLK_TC_FLUSH
@@ -210,9 +230,16 @@ static const u32 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
  * The worker for the various blk_add_trace*() types. Fills out a
  * blk_io_trace structure and places it in a per-cpu subbuffer.
  */
+
+#ifdef CONFIG_BLKTRACE_EXT
+static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
+		     int op, int op_flags, u64 what, int error, int pdu_len,
+		     void *pdu_data, union kernfs_node_id *cgid, u32 ioprio)
+#else
 static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
 		     int op, int op_flags, u32 what, int error, int pdu_len,
 		     void *pdu_data, union kernfs_node_id *cgid)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	struct task_struct *tsk = current;
 	struct ring_buffer_event *event = NULL;
@@ -238,6 +265,14 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
 		what |= BLK_TC_ACT(BLK_TC_DISCARD);
 	if (op == REQ_OP_FLUSH)
 		what |= BLK_TC_ACT(BLK_TC_FLUSH);
+
+#ifdef CONFIG_BLKTRACE_EXT
+	if (unlikely(op == REQ_OP_WRITE_ZEROES))
+		what |= BLK_TC_ACT(BLK_TC_WRITE_ZEROES);
+	if (unlikely(op == REQ_OP_ZONE_RESET))
+		what |= BLK_TC_ACT(BLK_TC_ZONE_RESET);
+#endif /* CONFIG_BLKTRACE_EXT */
+
 	if (cgid)
 		what |= __BLK_TA_CGROUP;
 
@@ -535,8 +570,14 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 		goto err;
 
 	bt->act_mask = buts->act_mask;
+
+#ifdef CONFIG_BLKTRACE_EXT
+	if (!bt->act_mask)
+		bt->act_mask = (u64) -1ULL;
+#else
 	if (!bt->act_mask)
 		bt->act_mask = (u16) -1;
+#endif /* CONFIG_BLKTRACE_EXT */
 
 	blk_trace_setup_lba(bt, bdev);
 
@@ -802,9 +843,16 @@ blk_trace_request_get_cgid(struct request_queue *q, struct request *rq)
  *     Records an action against a request. Will log the bio offset + size.
  *
  **/
+#ifdef CONFIG_BLKTRACE_EXT
+static void blk_add_trace_rq(struct request *rq, int error,
+			     unsigned int nr_bytes, u64 what,
+			     union kernfs_node_id *cgid)
+
+#else
 static void blk_add_trace_rq(struct request *rq, int error,
 			     unsigned int nr_bytes, u32 what,
 			     union kernfs_node_id *cgid)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	struct blk_trace *bt = rq->q->blk_trace;
 
@@ -816,8 +864,14 @@ static void blk_add_trace_rq(struct request *rq, int error,
 	else
 		what |= BLK_TC_ACT(BLK_TC_FS);
 
+#ifdef CONFIG_BLKTRACE_EXT
+	__blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
+		rq->cmd_flags, what, error, 0, NULL, cgid, req_get_ioprio(rq));
+
+#else
 	__blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
 			rq->cmd_flags, what, error, 0, NULL, cgid);
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 
 static void blk_add_trace_rq_insert(void *ignore,
@@ -860,17 +914,31 @@ static void blk_add_trace_rq_complete(void *ignore, struct request *rq,
  *     Records an action against a bio. Will log the bio offset + size.
  *
  **/
+
+#ifdef CONFIG_BLKTRACE_EXT
+static void blk_add_trace_bio(struct request_queue *q, struct bio *bio,
+			      u64 what, int error)
+
+#else
 static void blk_add_trace_bio(struct request_queue *q, struct bio *bio,
 			      u32 what, int error)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	struct blk_trace *bt = q->blk_trace;
 
 	if (likely(!bt))
 		return;
 
+#ifdef CONFIG_BLKTRACE_EXT
+	__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
+			bio_op(bio), bio->bi_opf, what, error, 0, NULL,
+			blk_trace_bio_get_cgid(q, bio), bio_prio(bio));
+
+#else
 	__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
 			bio_op(bio), bio->bi_opf, what, error, 0, NULL,
 			blk_trace_bio_get_cgid(q, bio));
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 
 static void blk_add_trace_bio_bounce(void *ignore,
@@ -917,9 +985,16 @@ static void blk_add_trace_getrq(void *ignore,
 	else {
 		struct blk_trace *bt = q->blk_trace;
 
+#ifdef CONFIG_BLKTRACE_EXT
+		if (bt)
+			__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0,
+					NULL, NULL, 0);
+
+#else
 		if (bt)
 			__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0,
 					NULL, NULL);
+#endif /* CONFIG_BLKTRACE_EXT */
 	}
 }
 
@@ -933,9 +1008,17 @@ static void blk_add_trace_sleeprq(void *ignore,
 	else {
 		struct blk_trace *bt = q->blk_trace;
 
+
+#ifdef CONFIG_BLKTRACE_EXT
+		if (bt)
+			__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ,
+					0, 0, NULL, NULL, 0);
+
+#else
 		if (bt)
 			__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ,
 					0, 0, NULL, NULL);
+#endif /* CONFIG_BLKTRACE_EXT */
 	}
 }
 
@@ -943,8 +1026,14 @@ static void blk_add_trace_plug(void *ignore, struct request_queue *q)
 {
 	struct blk_trace *bt = q->blk_trace;
 
+#ifdef CONFIG_BLKTRACE_EXT
+	if (bt)
+		__blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL, NULL,
+				0);
+#else
 	if (bt)
 		__blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL, NULL);
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 
 static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
@@ -954,14 +1043,24 @@ static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
 
 	if (bt) {
 		__be64 rpdu = cpu_to_be64(depth);
+
+#ifdef CONFIG_BLKTRACE_EXT
+		u64 what;
+#else
 		u32 what;
+#endif /* CONFIG_BLKTRACE_EXT */
 
 		if (explicit)
 			what = BLK_TA_UNPLUG_IO;
 		else
 			what = BLK_TA_UNPLUG_TIMER;
 
+#ifdef CONFIG_BLKTRACE_EXT
+		__blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu,
+				NULL, 0);
+#else
 		__blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu, NULL);
+#endif /* CONFIG_BLKTRACE_EXT */
 	}
 }
 
@@ -974,10 +1073,18 @@ static void blk_add_trace_split(void *ignore,
 	if (bt) {
 		__be64 rpdu = cpu_to_be64(pdu);
 
+#ifdef CONFIG_BLKTRACE_EXT
+		__blk_add_trace(bt, bio->bi_iter.bi_sector,
+				bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf,
+				BLK_TA_SPLIT, bio->bi_status, sizeof(rpdu),
+				&rpdu, blk_trace_bio_get_cgid(q, bio),
+				bio_prio(bio));
+#else
 		__blk_add_trace(bt, bio->bi_iter.bi_sector,
 				bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf,
 				BLK_TA_SPLIT, bio->bi_status, sizeof(rpdu),
 				&rpdu, blk_trace_bio_get_cgid(q, bio));
+#endif /* CONFIG_BLKTRACE_EXT */
 	}
 }
 
@@ -1008,9 +1115,16 @@ static void blk_add_trace_bio_remap(void *ignore,
 	r.device_to   = cpu_to_be32(bio_dev(bio));
 	r.sector_from = cpu_to_be64(from);
 
+#ifdef CONFIG_BLKTRACE_EXT
+	__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
+			bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_status,
+			sizeof(r), &r, blk_trace_bio_get_cgid(q, bio),
+			bio_prio(bio));
+#else
 	__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
 			bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_status,
 			sizeof(r), &r, blk_trace_bio_get_cgid(q, bio));
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 
 /**
@@ -1041,9 +1155,17 @@ static void blk_add_trace_rq_remap(void *ignore,
 	r.device_to   = cpu_to_be32(disk_devt(rq->rq_disk));
 	r.sector_from = cpu_to_be64(from);
 
+
+#ifdef CONFIG_BLKTRACE_EXT
+	__blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
+			rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
+			sizeof(r), &r, blk_trace_request_get_cgid(q, rq),
+			req_get_ioprio(rq));
+#else
 	__blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
 			rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
 			sizeof(r), &r, blk_trace_request_get_cgid(q, rq));
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 
 /**
@@ -1066,9 +1188,16 @@ void blk_add_driver_data(struct request_queue *q,
 	if (likely(!bt))
 		return;
 
+#ifdef CONFIG_BLKTRACE_EXT
+	__blk_add_trace(bt, blk_rq_trace_sector(rq), blk_rq_bytes(rq), 0, 0,
+				BLK_TA_DRV_DATA, 0, len, data,
+				blk_trace_request_get_cgid(q, rq),
+				req_get_ioprio(rq));
+#else
 	__blk_add_trace(bt, blk_rq_trace_sector(rq), blk_rq_bytes(rq), 0, 0,
 				BLK_TA_DRV_DATA, 0, len, data,
 				blk_trace_request_get_cgid(q, rq));
+#endif /* CONFIG_BLKTRACE_EXT */
 }
 EXPORT_SYMBOL_GPL(blk_add_driver_data);
 
@@ -1139,7 +1268,11 @@ static void blk_unregister_tracepoints(void)
 static void fill_rwbs(char *rwbs, const struct blk_io_trace *t)
 {
 	int i = 0;
+#ifdef CONFIG_BLKTRACE_EXT
+	u64 tc = t->action >> BLK_TC_SHIFT;
+#else
 	int tc = t->action >> BLK_TC_SHIFT;
+#endif /* CONFIG_BLKTRACE_EXT */
 
 	if ((t->action & ~__BLK_TN_CGROUP) == BLK_TN_MESSAGE) {
 		rwbs[i++] = 'N';
@@ -1151,6 +1284,17 @@ static void fill_rwbs(char *rwbs, const struct blk_io_trace *t)
 
 	if (tc & BLK_TC_DISCARD)
 		rwbs[i++] = 'D';
+
+#ifdef CONFIG_BLKTRACE_EXT
+	else if ((tc & BLK_TC_WRITE_ZEROES)) {
+	/* instead of adding nested if in BLK_TC_WRITE keep the code clean */
+		rwbs[i++] = 'W';
+		rwbs[i++] = 'Z';
+	} else if ((tc & BLK_TC_ZONE_RESET)) {
+		rwbs[i++] = 'Z';
+		rwbs[i++] = 'R';
+	}
+#endif /* CONFIG_BLKTRACE_EXT */
 	else if (tc & BLK_TC_WRITE)
 		rwbs[i++] = 'W';
 	else if (t->bytes)
@@ -1193,7 +1337,11 @@ static inline int pdu_real_len(const struct trace_entry *ent, bool has_cg)
 			(has_cg ? sizeof(union kernfs_node_id) : 0);
 }
 
+#ifdef CONFIG_BLKTRACE_EXT
+static inline u64 t_action(const struct trace_entry *ent)
+#else
 static inline u32 t_action(const struct trace_entry *ent)
+#endif /* CONFIG_BLKTRACE_EXT */
 {
 	return te_blk_io_trace(ent)->action;
 }
@@ -1474,7 +1622,12 @@ static enum print_line_t print_one_line(struct trace_iterator *iter,
 	bool has_cg;
 
 	t	   = te_blk_io_trace(iter->ent);
+
+#ifdef CONFIG_BLKTRACE_EXT
+	what = (t->action & ((1ULL << BLK_TC_SHIFT) - 1)) & ~__BLK_TA_CGROUP;
+#else
 	what	   = (t->action & ((1 << BLK_TC_SHIFT) - 1)) & ~__BLK_TA_CGROUP;
+#endif /* CONFIG_BLKTRACE_EXT */
 	long_act   = !!(tr->trace_flags & TRACE_ITER_VERBOSE);
 	log_action = classic ? &blk_log_action_classic : &blk_log_action;
 	has_cg	   = t->action & __BLK_TA_CGROUP;
@@ -1617,7 +1770,13 @@ static int blk_trace_setup_queue(struct request_queue *q,
 		goto free_bt;
 
 	bt->dev = bdev->bd_dev;
+
+#ifdef CONFIG_BLKTRACE_EXT
+	bt->act_mask = (u64)-1ULL;
+
+#else
 	bt->act_mask = (u16)-1;
+#endif /* CONFIG_BLKTRACE_EXT */
 
 	blk_trace_setup_lba(bt, bdev);
 
@@ -1688,6 +1847,10 @@ static const struct {
 	{ BLK_TC_DISCARD,	"discard"	},
 	{ BLK_TC_DRV_DATA,	"drv_data"	},
 	{ BLK_TC_FUA,		"fua"		},
+#ifdef CONFIG_BLKTRACE_EXT
+	{ BLK_TC_WRITE_ZEROES,	"write-zeroes"	},
+	{ BLK_TC_ZONE_RESET,	"zone-reset"	},
+#endif /* CONFIG_BLKTRACE_EXT */
 };
 
 static int blk_trace_str2mask(const char *str)
-- 
2.19.1



* [RFC PATCH 04/18] kernel/trace: add KConfig to enable blktrace_ext
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (2 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 03/18] blktrace: update trace to track more actions Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 05/18] blktrace: add iopriority mask Chaitanya Kulkarni
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Add a kernel Kconfig option for the blktrace extension.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 kernel/trace/Kconfig | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8bd1d6d001d7..5f8c938e495f 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -456,6 +456,31 @@ config BLK_DEV_IO_TRACE
 
 	  If unsure, say N.
 
+config BLKTRACE_EXT
+	bool "Support for tracing block IO actions extensions like priority"
+	depends on BLK_DEV_IO_TRACE
+	depends on BLOCK
+	select TRACEPOINTS
+	select GENERIC_TRACER
+	select STACKTRACE
+	help
+	  Say Y here if you want to be able to trace the extended block layer
+	  actions on a given queue. Tracing allows you to see any traffic
+	  happening on a block device queue; with this extension one can also
+	  see requests like write-zeroes and zone reset along with the request
+	  priority. For more information (and the userspace support tools
+	  needed), fetch the blktrace tools from:
+
+	  git://git.kernel.dk/blktrace.git
+
+	  Tracing also is possible using the ftrace interface, e.g.:
+
+	    echo 1 > /sys/block/sda/sda1/trace/enable
+	    echo blk > /sys/kernel/debug/tracing/current_tracer
+	    cat /sys/kernel/debug/tracing/trace_pipe
+
+	  If unsure, say N.
+
 config KPROBE_EVENTS
 	depends on KPROBES
 	depends on HAVE_REGS_AND_STACK_ACCESS_API
-- 
2.19.1



* [RFC PATCH 05/18] blktrace: add iopriority mask
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (3 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 04/18] kernel/trace: add KConfig to enable blktrace_ext Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 06/18] " Chaitanya Kulkarni
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Update the blktrace API header to handle the priority mask.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 include/linux/blktrace_api.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 403d4cfc6a52..1b9a8b30cf9c 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -19,6 +19,7 @@ struct blk_trace {
 	unsigned char __percpu *msg_data;
 #ifdef CONFIG_BLKTRACE_EXT
 	u64 act_mask;
+	u32 prio_mask;
 #else
 	u16 act_mask;
 #endif /* CONFIG_BLKTRACE_EXT */
-- 
2.19.1



* [RFC PATCH 06/18] blktrace: add iopriority mask
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (4 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 05/18] blktrace: add iopriority mask Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 07/18] blktrace: allow user to track iopriority Chaitanya Kulkarni
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Update the trace structure to store the priority, and the ioctl
structure to hold the priority mask passed in from userspace, just like
the action mask.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 include/uapi/linux/blktrace_api.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
index c34cf752a9a1..143bf81c088d 100644
--- a/include/uapi/linux/blktrace_api.h
+++ b/include/uapi/linux/blktrace_api.h
@@ -123,6 +123,7 @@ struct blk_io_trace {
 
 #ifdef CONFIG_BLKTRACE_EXT
 	__u64 action;		/* what happened */
+	__u32 ioprio;		/* ioprio */
 #else
 	__u32 action;		/* what happened */
 #endif /* CONFIG_BLKTRACE_EXT */
@@ -158,6 +159,7 @@ struct blk_user_trace_setup {
 	char name[BLKTRACE_BDEV_SIZE];	/* output */
 #ifdef CONFIG_BLKTRACE_EXT
 	__u64 act_mask;			/* input */
+	__u32 prio_mask;		/* input */
 #else
 	__u16 act_mask;			/* input */
 #endif /* CONFIG_BLKTRACE_EXT */
-- 
2.19.1



* [RFC PATCH 07/18] blktrace: allow user to track iopriority
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (5 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 06/18] " Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 08/18] blktrace: add sysfs ioprio mask Chaitanya Kulkarni
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Now that the trace structures can carry the I/O priority, update the
blktrace extension code to record the priority of each traced event
and to use the priority mask to filter the log.

The priority mask works the same way as the action mask: traces whose
priority class does not match the mask are discarded. A priority mask
of 0, which is the default, disables priority filtering.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
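A minimal userspace sketch of the filtering rule, assuming the class
encoding from linux/ioprio.h (class in the bits above bit 13, classes
numbered 0 to 3). The function names prio_class_selected() and
should_log() are hypothetical; the patch itself spells the check out as
a switch with one hard-coded bit per class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IOPRIO_CLASS_SHIFT	13
#define IOPRIO_PRIO_CLASS(ioprio)	((ioprio) >> IOPRIO_CLASS_SHIFT)

#define IOPRIO_CLASS_NONE	0u
#define IOPRIO_CLASS_RT		1u
#define IOPRIO_CLASS_BE		2u
#define IOPRIO_CLASS_IDLE	3u

/* Bit n of the mask selects class n: 0x01, 0x02, 0x04, 0x08. */
static_assert((1u << IOPRIO_CLASS_IDLE) == 0x08u,
	      "IDLE maps to bit 3, as in the patch");

/* True when the priority class of ioprio is enabled in prio_mask. */
static bool prio_class_selected(uint32_t prio_mask, uint32_t ioprio)
{
	uint32_t cls = IOPRIO_PRIO_CLASS(ioprio);

	if (cls > IOPRIO_CLASS_IDLE)
		return false;	/* unknown class: never logged */
	return (prio_mask & (1u << cls)) != 0;
}

/* A zero mask means "no priority filtering", so everything is logged. */
static bool should_log(uint32_t prio_mask, uint32_t ioprio)
{
	return prio_mask == 0 || prio_class_selected(prio_mask, ioprio);
}
```

With prio_mask set to 0x04 only best-effort requests pass the check;
real-time, idle and unclassified requests are dropped from the trace.
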
 kernel/trace/blktrace.c | 60 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 6d2b4adae76e..1b113ba284fe 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -16,6 +16,7 @@
 #include <linux/uaccess.h>
 #include <linux/list.h>
 #include <linux/blk-cgroup.h>
+#include <linux/ioprio.h>
 
 #include "../../block/blk.h"
 
@@ -189,6 +190,54 @@ EXPORT_SYMBOL_GPL(__trace_note_message);
 
 
 #ifdef CONFIG_BLKTRACE_EXT
+static bool prio_log_check(struct blk_trace *bt, u32 ioprio)
+{
+	bool ret;
+
+	switch (IOPRIO_PRIO_CLASS(ioprio)) {
+	case IOPRIO_CLASS_NONE:
+	case IOPRIO_CLASS_RT:
+	case IOPRIO_CLASS_BE:
+	case IOPRIO_CLASS_IDLE:
+		break;
+	default:
+		/*XXX: print rate limit warn here */
+		ret = false;
+		goto out;
+	}
+
+	switch (IOPRIO_PRIO_CLASS(ioprio)) {
+	case IOPRIO_CLASS_NONE:
+		if (bt->prio_mask & 0x01)
+			ret = true;
+		else
+			ret = false;
+		break;
+	case IOPRIO_CLASS_RT:
+		if (bt->prio_mask & 0x02)
+			ret = true;
+		else
+			ret = false;
+		break;
+	case IOPRIO_CLASS_BE:
+		if (bt->prio_mask & 0x04)
+			ret = true;
+		else
+			ret = false;
+		break;
+	case IOPRIO_CLASS_IDLE:
+		if (bt->prio_mask & 0x08)
+			ret = true;
+		else
+			ret = false;
+		break;
+	default:
+		ret = false;
+	}
+out:
+	return ret;
+}
+
 static int act_log_check(struct blk_trace *bt, u64 what, sector_t sector,
 			 pid_t pid)
 #else
@@ -279,6 +328,10 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
 	pid = tsk->pid;
 	if (act_log_check(bt, what, sector, pid))
 		return;
+#ifdef CONFIG_BLKTRACE_EXT
+	if (bt->prio_mask && !prio_log_check(bt, ioprio))
+		return;
+#endif /* CONFIG_BLKTRACE_EXT */
 	cpu = raw_smp_processor_id();
 
 	if (blk_tracer) {
@@ -324,6 +377,9 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
 		t->sector = sector;
 		t->bytes = bytes;
 		t->action = what;
+#ifdef CONFIG_BLKTRACE_EXT
+		t->ioprio = ioprio;
+#endif /* CONFIG_BLKTRACE_EXT */
 		t->device = bt->dev;
 		t->error = error;
 		t->pdu_len = pdu_len + cgid_len;
@@ -574,6 +630,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 #ifdef CONFIG_BLKTRACE_EXT
 	if (!bt->act_mask)
 		bt->act_mask = (u64) -1ULL;
+	bt->prio_mask = buts->prio_mask;
 #else
 	if (!bt->act_mask)
 		bt->act_mask = (u16) -1;
@@ -1773,7 +1830,8 @@ static int blk_trace_setup_queue(struct request_queue *q,
 
 #ifdef CONFIG_BLKTRACE_EXT
 	bt->act_mask = (u64)-1ULL;
-
+	/* do not track priorities by default */
+	bt->prio_mask = 0;
 #else
 	bt->act_mask = (u16)-1;
 #endif /* CONFIG_BLKTRACE_EXT */
-- 
2.19.1



* [RFC PATCH 08/18] blktrace: add sysfs ioprio mask
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (6 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 07/18] blktrace: allow user to track iopriority Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 09/18] blktrace: add debug support for extension Chaitanya Kulkarni
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

With the priority mask and tracking support added, expose the priority
mask through sysfs. The parsing helpers are placeholders for now, but
they are required to complete the RFC.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 kernel/trace/blktrace.c | 90 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 89 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 1b113ba284fe..84163fa6a61f 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1867,6 +1867,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 
 static BLK_TRACE_DEVICE_ATTR(enable);
 static BLK_TRACE_DEVICE_ATTR(act_mask);
+static BLK_TRACE_DEVICE_ATTR(prio_mask);
 static BLK_TRACE_DEVICE_ATTR(pid);
 static BLK_TRACE_DEVICE_ATTR(start_lba);
 static BLK_TRACE_DEVICE_ATTR(end_lba);
@@ -1874,6 +1875,7 @@ static BLK_TRACE_DEVICE_ATTR(end_lba);
 static struct attribute *blk_trace_attrs[] = {
 	&dev_attr_enable.attr,
 	&dev_attr_act_mask.attr,
+	&dev_attr_prio_mask.attr,
 	&dev_attr_pid.attr,
 	&dev_attr_start_lba.attr,
 	&dev_attr_end_lba.attr,
@@ -1911,6 +1913,16 @@ static const struct {
 #endif /* CONFIG_BLKTRACE_EXT */
 };
 
+static const struct {
+	int prio_mask;
+	const char *str;
+} prio_mask_maps[] = {
+	{ IOPRIO_CLASS_NONE,	"none" },
+	{ IOPRIO_CLASS_RT,	"real" },
+	{ IOPRIO_CLASS_BE,	"best" },
+	{ IOPRIO_CLASS_IDLE,	"idle" },
+};
+
 static int blk_trace_str2mask(const char *str)
 {
 	int i;
@@ -1962,6 +1974,62 @@ static ssize_t blk_trace_mask2str(char *buf, int mask)
 	return p - buf;
 }
 
+#ifdef CONFIG_BLKTRACE_EXT
+static int blk_trace_str2priomask(const char *str)
+{
+	int i;
+	int mask = 0;
+	char *buf, *s, *token;
+
+	/* XXX: revisit this placeholder for now */
+	buf = kstrdup(str, GFP_KERNEL);
+	if (buf == NULL)
+		return -ENOMEM;
+	s = strstrip(buf);
+
+	while (1) {
+		token = strsep(&s, ",");
+		if (token == NULL)
+			break;
+
+		if (*token == '\0')
+			continue;
+
+		for (i = 0; i < ARRAY_SIZE(prio_mask_maps); i++) {
+			if (strcasecmp(token, prio_mask_maps[i].str) == 0) {
+				mask |= (1 << prio_mask_maps[i].prio_mask);
+				break;
+			}
+		}
+		if (i == ARRAY_SIZE(prio_mask_maps)) {
+			mask = -EINVAL;
+			break;
+		}
+	}
+	kfree(buf);
+
+	return mask;
+}
+
+static ssize_t blk_trace_prio_mask2str(char *buf, int mask)
+{
+	int i;
+	char *p = buf;
+
+	for (i = 0; i < ARRAY_SIZE(prio_mask_maps); i++) {
+		/* XXX: revisit this placeholder for now */
+		if (mask & (0xF & (1 << prio_mask_maps[i].prio_mask))) {
+			p += sprintf(p, "%s%s",
+				    (p == buf) ? "" : ",",
+				    prio_mask_maps[i].str);
+		}
+	}
+	*p++ = '\n';
+
+	return p - buf;
+}
+#endif /* CONFIG_BLKTRACE_EXT */
+
 static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
 {
 	if (bdev->bd_disk == NULL)
@@ -1998,6 +2066,10 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
 		ret = sprintf(buf, "disabled\n");
 	else if (attr == &dev_attr_act_mask)
 		ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
+#ifdef CONFIG_BLKTRACE_EXT
+	else if (attr == &dev_attr_prio_mask)
+		ret = blk_trace_prio_mask2str(buf, q->blk_trace->prio_mask);
+#endif /* CONFIG_BLKTRACE_EXT */
 	else if (attr == &dev_attr_pid)
 		ret = sprintf(buf, "%u\n", q->blk_trace->pid);
 	else if (attr == &dev_attr_start_lba)
@@ -2034,7 +2106,19 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 				goto out;
 			value = ret;
 		}
-	} else if (kstrtoull(buf, 0, &value))
+	}
+#ifdef CONFIG_BLKTRACE_EXT
+	else if (attr == &dev_attr_prio_mask) {
+		if (kstrtoull(buf, 0, &value)) {
+			/* Assume it is a list of trace category names */
+			ret = blk_trace_str2priomask(buf);
+			if (ret < 0)
+				goto out;
+			value = ret;
+		}
+	}
+#endif /* CONFIG_BLKTRACE_EXT */
+	else if (kstrtoull(buf, 0, &value))
 		goto out;
 
 	ret = -ENXIO;
@@ -2069,6 +2153,10 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
 	if (ret == 0) {
 		if (attr == &dev_attr_act_mask)
 			q->blk_trace->act_mask = value;
+#ifdef CONFIG_BLKTRACE_EXT
+		else if (attr == &dev_attr_prio_mask)
+			q->blk_trace->prio_mask = value;
+#endif /* CONFIG_BLKTRACE_EXT */
 		else if (attr == &dev_attr_pid)
 			q->blk_trace->pid = value;
 		else if (attr == &dev_attr_start_lba)
-- 
2.19.1



* [RFC PATCH 09/18] blktrace: add debug support for extension
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (7 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 08/18] blktrace: add sysfs ioprio mask Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 10/18] block: set ioprio for write-zeroes, discard etc Chaitanya Kulkarni
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This patch adds a new Kconfig option to enable debug messages for the
blktrace extension.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
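A minimal userspace sketch of one way to keep such debug output behind
a single switch, assuming fprintf() as a stand-in for trace_printk().
The macro blktrace_ext_dbg(), the symbol DEMO_DEBUG_BLKTRACE_EXT and
the helper prio_bit_set() are hypothetical and not part of this patch,
which open-codes an #ifdef block around every message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_DEBUG_BLKTRACE_EXT; define to enable. */
#ifdef DEMO_DEBUG_BLKTRACE_EXT
#define blktrace_ext_dbg(...)	fprintf(stderr, __VA_ARGS__)
#else
#define blktrace_ext_dbg(...)	do { } while (0)
#endif

/*
 * Test one mask bit and report the outcome. The result is computed once
 * and shared by the message and the return value.
 */
static bool prio_bit_set(unsigned int prio_mask, unsigned int bit,
			 const char *name)
{
	bool ret = (prio_mask & bit) != 0;

	assert(name != NULL);
	blktrace_ext_dbg("%s %d %s %s\n", __func__, __LINE__, name,
			 ret ? "TRUE" : "FALSE");
	return ret;
}
```

Because the message is derived from the same value that is returned,
the logged result cannot disagree with the bit that was tested.
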
 kernel/trace/Kconfig    | 11 +++++++++++
 kernel/trace/blktrace.c | 36 +++++++++++++++++++++++-------------
 2 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 5f8c938e495f..d01bd7972638 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -481,6 +481,17 @@ config BLKTRACE_EXT
 
 	  If unsure, say N.
 
+config DEBUG_BLKTRACE_EXT
+	bool "Debug blktrace extension"
+	depends on BLK_DEV_IO_TRACE
+	depends on BLOCK
+	depends on BLKTRACE_EXT
+	select TRACEPOINTS
+	select GENERIC_TRACER
+	select STACKTRACE
+	help
+	  This enables debug messages for the blktrace extension.
+
 config KPROBE_EVENTS
 	depends on KPROBES
 	depends on HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 84163fa6a61f..d03473614b3c 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -196,45 +196,51 @@ static bool prio_log_check(struct blk_trace *bt, u32 ioprio)
 
 	switch (IOPRIO_PRIO_CLASS(ioprio)) {
 	case IOPRIO_CLASS_NONE:
-	case IOPRIO_CLASS_RT:
-	case IOPRIO_CLASS_BE:
-	case IOPRIO_CLASS_IDLE:
-		break;
-	default:
-		/*XXX: print rate limit warn here */
-		ret = false;
-		goto out;
-	}
-
-	switch (IOPRIO_PRIO_CLASS(ioprio)) {
-	case IOPRIO_CLASS_NONE:
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+		trace_printk("%s %d NONE %s\n", __func__, __LINE__,
+				bt->prio_mask & 0x01 ? "TRUE" : "FALSE");
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 		if (bt->prio_mask & 0x01)
 			ret = true;
 		else
 			ret = false;
 		break;
 	case IOPRIO_CLASS_RT:
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+		trace_printk("%s %d REAL %s\n", __func__, __LINE__,
+				bt->prio_mask & 0x02 ? "TRUE" : "FALSE");
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 		if (bt->prio_mask & 0x02)
 			ret = true;
 		else
 			ret = false;
 		break;
 	case IOPRIO_CLASS_BE:
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+		trace_printk("%s %d BEST %s\n", __func__, __LINE__,
+				bt->prio_mask & 0x04 ? "TRUE" : "FALSE");
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 		if (bt->prio_mask & 0x04)
 			ret = true;
 		else
 			ret = false;
 		break;
 	case IOPRIO_CLASS_IDLE:
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+		trace_printk("%s %d IDLE %s\n", __func__, __LINE__,
+				bt->prio_mask & 0x08 ? "TRUE" : "FALSE");
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 		if (bt->prio_mask & 0x08)
 			ret = true;
 		else
 			ret = false;
 		break;
 	default:
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+		trace_printk("%s %d ERROR\n", __func__, __LINE__);
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 		ret = false;
 	}
-out:
 	return ret;
 }
 
@@ -630,6 +636,10 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 #ifdef CONFIG_BLKTRACE_EXT
 	if (!bt->act_mask)
 		bt->act_mask = (u64) -1ULL;
+
+#ifdef CONFIG_DEBUG_BLKTRACE_EXT
+	trace_printk("blktrace: prio mask 0x%x\n", buts->prio_mask);
+#endif /* CONFIG_DEBUG_BLKTRACE_EXT */
 	bt->prio_mask = buts->prio_mask;
 #else
 	if (!bt->act_mask)
-- 
2.19.1



* [RFC PATCH 10/18] block: set ioprio for write-zeroes, discard etc
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (8 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 09/18] blktrace: add debug support for extension Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 11/18] block: set ioprio for zone-reset Chaitanya Kulkarni
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This change is required so that the newly added blktrace extension can
track the priority of requests issued by tools like blkdiscard, along
with newer operations such as write-zeroes.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
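A minimal Linux-only sketch of how a process could set its own I/O
priority before issuing such a request, so that get_current_ioprio()
has something other than the default to report. The macro values
mirror linux/ioprio.h; set_own_ioprio() is a hypothetical helper, and
the raw syscall is used because glibc provides no ioprio_set() wrapper.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

#define IOPRIO_CLASS_SHIFT	13
#define IOPRIO_PRIO_VALUE(cls, data)	(((cls) << IOPRIO_CLASS_SHIFT) | (data))
#define IOPRIO_PRIO_CLASS(val)		((val) >> IOPRIO_CLASS_SHIFT)

#define IOPRIO_CLASS_BE		2
#define IOPRIO_WHO_PROCESS	1

/* Best-effort class, level 4, packs into a single integer. */
static_assert(IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 4) == 0x4004,
	      "class lives above bit 13, level in the low bits");

/* Set the calling process's I/O priority; returns 0 on success, -1 on error. */
static int set_own_ioprio(int cls, int level)
{
	return (int)syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
			    IOPRIO_PRIO_VALUE(cls, level));
}
```

A tool that calls set_own_ioprio(IOPRIO_CLASS_BE, 4) before issuing a
discard would, with this patch applied, have that value copied into
each discard bio it causes.
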
 block/blk-lib.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 5f2c429d4378..efc9a1bf5262 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -7,6 +7,7 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/scatterlist.h>
+#include <linux/ioprio.h>
 
 #include "blk.h"
 
@@ -64,6 +65,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
 		bio_set_op_attrs(bio, op, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		bio->bi_iter.bi_size = req_sects << 9;
 		sector += req_sects;
@@ -162,6 +164,7 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 		bio->bi_io_vec->bv_offset = 0;
 		bio->bi_io_vec->bv_len = bdev_logical_block_size(bdev);
 		bio_set_op_attrs(bio, REQ_OP_WRITE_SAME, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		if (nr_sects > max_write_same_sectors) {
 			bio->bi_iter.bi_size = max_write_same_sectors << 9;
@@ -234,6 +237,8 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
 		bio->bi_opf = REQ_OP_WRITE_ZEROES;
+		bio_set_prio(bio, get_current_ioprio());
+
 		if (flags & BLKDEV_ZERO_NOUNMAP)
 			bio->bi_opf |= REQ_NOUNMAP;
 
@@ -286,6 +291,7 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
 		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		while (nr_sects != 0) {
 			sz = min((sector_t) PAGE_SIZE, nr_sects << 9);
-- 
2.19.1



* [RFC PATCH 11/18] block: set ioprio for zone-reset
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (9 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 10/18] block: set ioprio for write-zeroes, discard etc Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 12/18] block: set ioprio for flush bio Chaitanya Kulkarni
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This change is required so that we can track the priority of
REQ_OP_ZONE_RESET requests issued with the "blkzone reset <device_name>"
command.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 block/blk-zoned.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 2d98803faec2..5d8517654b16 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -13,6 +13,7 @@
 #include <linux/rbtree.h>
 #include <linux/blkdev.h>
 #include <linux/blk-mq.h>
+#include <linux/ioprio.h>
 
 #include "blk.h"
 
@@ -248,6 +249,7 @@ int blkdev_reset_zones(struct block_device *bdev,
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
 		bio_set_op_attrs(bio, REQ_OP_ZONE_RESET, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		sector += zone_sectors;
 
-- 
2.19.1



* [RFC PATCH 12/18] block: set ioprio for flush bio
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (10 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 11/18] block: set ioprio for zone-reset Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 13/18] drivers: set bio iopriority field Chaitanya Kulkarni
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This change is required to track the priority of flush operations
with the blktrace extension.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 block/blk-flush.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index d95f94892015..af84ed4cafc9 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -70,6 +70,7 @@
 #include <linux/blkdev.h>
 #include <linux/gfp.h>
 #include <linux/blk-mq.h>
+#include <linux/ioprio.h>
 
 #include "blk.h"
 #include "blk-mq.h"
@@ -446,6 +447,7 @@ int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
 	bio = bio_alloc(gfp_mask, 0);
 	bio_set_dev(bio, bdev);
 	bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
+	bio_set_prio(bio, get_current_ioprio());
 
 	ret = submit_bio_wait(bio);
 
-- 
2.19.1



* [RFC PATCH 13/18] drivers: set bio iopriority field
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (11 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 12/18] block: set ioprio for flush bio Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  6:23   ` Javier González
  2019-05-01  4:28 ` [RFC PATCH 14/18] fs: " Chaitanya Kulkarni
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/drbd/drbd_actlog.c    | 2 ++
 drivers/block/drbd/drbd_bitmap.c    | 3 +++
 drivers/block/xen-blkback/blkback.c | 3 +++
 drivers/block/zram/zram_drv.c       | 2 ++
 drivers/lightnvm/pblk-read.c        | 2 ++
 drivers/lightnvm/pblk-write.c       | 1 +
 drivers/md/bcache/journal.c         | 2 ++
 drivers/md/bcache/super.c           | 2 ++
 drivers/md/dm-bufio.c               | 2 ++
 drivers/md/dm-cache-target.c        | 1 +
 drivers/md/dm-io.c                  | 2 ++
 drivers/md/dm-log-writes.c          | 5 +++++
 drivers/md/dm-thin.c                | 1 +
 drivers/md/dm-writecache.c          | 2 ++
 drivers/md/dm-zoned-metadata.c      | 4 ++++
 drivers/md/md.c                     | 4 ++++
 drivers/md/raid5-cache.c            | 4 ++++
 drivers/md/raid5-ppl.c              | 3 +++
 drivers/nvme/target/io-cmd-bdev.c   | 7 +++++++
 drivers/staging/erofs/internal.h    | 3 +++
 drivers/target/target_core_iblock.c | 3 +++
 21 files changed, 58 insertions(+)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 5f0eaee8c8a7..67235633c172 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -27,6 +27,7 @@
 #include <linux/crc32c.h>
 #include <linux/drbd.h>
 #include <linux/drbd_limits.h>
+#include <linux/ioprio.h>
 #include "drbd_int.h"
 
 
@@ -159,6 +160,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
 	bio->bi_private = device;
 	bio->bi_end_io = drbd_md_endio;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (op != REQ_OP_WRITE && device->state.disk == D_DISKLESS && device->ldev == NULL)
 		/* special case, drbd_md_read() during drbd_adm_attach(): no get_ldev */
diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 11a85b740327..e7cb027488c7 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -30,6 +30,7 @@
 #include <linux/drbd.h>
 #include <linux/slab.h>
 #include <linux/highmem.h>
+#include <linux/ioprio.h>
 
 #include "drbd_int.h"
 
@@ -1028,6 +1029,8 @@ static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_ho
 	bio->bi_private = ctx;
 	bio->bi_end_io = drbd_bm_endio;
 	bio_set_op_attrs(bio, op, 0);
+	bio_set_prio(bio, get_current_ioprio());
+
 
 	if (drbd_insert_fault(device, (op == REQ_OP_WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD)) {
 		bio_io_error(bio);
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index fd1e19f1a49f..41294944267d 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -42,6 +42,7 @@
 #include <linux/delay.h>
 #include <linux/freezer.h>
 #include <linux/bitmap.h>
+#include <linux/ioprio.h>
 
 #include <xen/events.h>
 #include <xen/page.h>
@@ -1375,6 +1376,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
 			bio->bi_end_io  = end_block_io_op;
 			bio->bi_iter.bi_sector  = preq.sector_number;
 			bio_set_op_attrs(bio, operation, operation_flags);
+			bio_set_prio(bio, get_current_ioprio());
 		}
 
 		preq.sector_number += seg[i].nsec;
@@ -1393,6 +1395,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
 		bio->bi_private = pending_req;
 		bio->bi_end_io  = end_block_io_op;
 		bio_set_op_attrs(bio, operation, operation_flags);
+		bio_set_prio(bio, get_current_ioprio());
 	}
 
 	atomic_set(&pending_req->pendcnt, nbio);
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 399cad7daae7..1a4e3b0e98ad 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -33,6 +33,7 @@
 #include <linux/sysfs.h>
 #include <linux/debugfs.h>
 #include <linux/cpuhotplug.h>
+#include <linux/ioprio.h>
 
 #include "zram_drv.h"
 
@@ -596,6 +597,7 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
 
 	bio->bi_iter.bi_sector = entry * (PAGE_SIZE >> 9);
 	bio_set_dev(bio, zram->bdev);
+	bio_set_prio(bio, get_current_ioprio());
 	if (!bio_add_page(bio, bvec->bv_page, bvec->bv_len, bvec->bv_offset)) {
 		bio_put(bio);
 		return -EIO;
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 0b7d5fb4548d..2b866744545e 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -16,6 +16,7 @@
  * pblk-read.c - pblk's read path
  */
 
+#include <linux/ioprio.h>
 #include "pblk.h"
 
 /*
@@ -336,6 +337,7 @@ static int pblk_setup_partial_read(struct pblk *pblk, struct nvm_rq *rqd,
 
 	new_bio->bi_iter.bi_sector = 0; /* internal bio */
 	bio_set_op_attrs(new_bio, REQ_OP_READ, 0);
+	bio_set_prio(new_bio, get_current_ioprio());
 
 	rqd->bio = new_bio;
 	rqd->nr_ppas = nr_holes;
diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 6593deab52da..3fdbbff40fde 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -628,6 +628,7 @@ static int pblk_submit_write(struct pblk *pblk, int *secs_left)
 
 	bio->bi_iter.bi_sector = 0; /* internal bio */
 	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 
 	rqd = pblk_alloc_rqd(pblk, PBLK_WRITE);
 	rqd->bio = bio;
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index b2fd412715b1..8fda51134919 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -10,6 +10,7 @@
 #include "debug.h"
 #include "extents.h"
 
+#include <linux/ioprio.h>
 #include <trace/events/bcache.h>
 
 /*
@@ -445,6 +446,7 @@ static void journal_discard_work(struct work_struct *work)
 	struct journal_device *ja =
 		container_of(work, struct journal_device, discard_work);
 
+	bio_set_prio(&ja->discard_bio, get_current_ioprio());
 	submit_bio(&ja->discard_bio);
 }
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index a697a3a923cd..336208a8d05e 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -24,6 +24,7 @@
 #include <linux/random.h>
 #include <linux/reboot.h>
 #include <linux/sysfs.h>
+#include <linux/ioprio.h>
 
 unsigned int bch_cutoff_writeback;
 unsigned int bch_cutoff_writeback_sync;
@@ -210,6 +211,7 @@ static void __write_super(struct cache_sb *sb, struct bio *bio)
 	bio->bi_iter.bi_sector	= SB_SECTOR;
 	bio->bi_iter.bi_size	= SB_SIZE;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC|REQ_META);
+	bio_set_prio(bio, get_current_ioprio());
 	bch_bio_map(bio, NULL);
 
 	out->offset		= cpu_to_le64(sb->offset);
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 1ecef76225a1..2e9b4fb3b2c9 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -18,6 +18,7 @@
 #include <linux/module.h>
 #include <linux/rbtree.h>
 #include <linux/stacktrace.h>
+#include <linux/ioprio.h>
 
 #define DM_MSG_PREFIX "bufio"
 
@@ -591,6 +592,7 @@ static void use_bio(struct dm_buffer *b, int rw, sector_t sector,
 	bio_set_op_attrs(bio, rw, 0);
 	bio->bi_end_io = bio_complete;
 	bio->bi_private = b;
+	bio_set_prio(bio, get_current_ioprio());
 
 	ptr = (char *)b->data + offset;
 	len = n_sectors << SECTOR_SHIFT;
diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index d249cf8ac277..d09bd8e2db36 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -938,6 +938,7 @@ static void remap_to_origin_and_cache(struct cache *cache, struct bio *bio,
 	 * all code that might use per_bio_data (since clone doesn't have it)
 	 */
 	__remap_to_origin_clear_discard(cache, origin_bio, oblock, false);
+	bio_set_prio(bio, get_current_ioprio());
 	submit_bio(origin_bio);
 
 	remap_to_cache(cache, bio, cblock);
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 81ffc59d05c9..5964be4a4a2a 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -16,6 +16,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/dm-io.h>
+#include <linux/ioprio.h>
 
 #define DM_MSG_PREFIX "io"
 
@@ -350,6 +351,7 @@ static void do_region(int op, int op_flags, unsigned region,
 		bio_set_dev(bio, where->bdev);
 		bio->bi_end_io = endio;
 		bio_set_op_attrs(bio, op, op_flags);
+		bio_set_prio(bio, get_current_ioprio());
 		store_io_and_region_in_bio(bio, io, region);
 
 		if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 9ea2b0291f20..d06e70f4dbf6 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -15,6 +15,7 @@
 #include <linux/kthread.h>
 #include <linux/freezer.h>
 #include <linux/uio.h>
+#include <linux/ioprio.h>
 
 #define DM_MSG_PREFIX "log-writes"
 
@@ -218,6 +219,7 @@ static int write_metadata(struct log_writes_c *lc, void *entry,
 	bio->bi_end_io = log_end_io;
 	bio->bi_private = lc;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 
 	page = alloc_page(GFP_KERNEL);
 	if (!page) {
@@ -277,6 +279,7 @@ static int write_inline_data(struct log_writes_c *lc, void *entry,
 		bio->bi_end_io = log_end_io;
 		bio->bi_private = lc;
 		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		for (i = 0; i < bio_pages; i++) {
 			pg_datalen = min_t(int, datalen, PAGE_SIZE);
@@ -364,6 +367,7 @@ static int log_one_block(struct log_writes_c *lc,
 	bio->bi_end_io = log_end_io;
 	bio->bi_private = lc;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 
 	for (i = 0; i < block->vec_cnt; i++) {
 		/*
@@ -386,6 +390,7 @@ static int log_one_block(struct log_writes_c *lc,
 			bio->bi_end_io = log_end_io;
 			bio->bi_private = lc;
 			bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+			bio_set_prio(bio, get_current_ioprio());
 
 			ret = bio_add_page(bio, block->vecs[i].bv_page,
 					   block->vecs[i].bv_len, 0);
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index fcd887703f95..fab5a7b20ffd 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1180,6 +1180,7 @@ static void process_prepared_discard_passdown_pt1(struct dm_thin_new_mapping *m)
 	}
 
 	discard_parent = bio_alloc(GFP_NOIO, 1);
+	bio_set_prio(discard_parent, get_current_ioprio());
 	if (!discard_parent) {
 		DMWARN("%s: unable to allocate top level discard bio for passdown. Skipping passdown.",
 		       dm_device_name(tc->pool->pool_md));
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index f7822875589e..bc7fad54cbec 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -15,6 +15,7 @@
 #include <linux/dax.h>
 #include <linux/pfn_t.h>
 #include <linux/libnvdimm.h>
+#include <linux/ioprio.h>
 
 #define DM_MSG_PREFIX "writecache"
 
@@ -1480,6 +1481,7 @@ static void __writecache_writeback_pmem(struct dm_writecache *wc, struct writeba
 		wb->wc = wc;
 		wb->bio.bi_end_io = writecache_writeback_endio;
 		bio_set_dev(&wb->bio, wc->dev->bdev);
+		bio_set_prio(bio, get_current_ioprio());
 		wb->bio.bi_iter.bi_sector = read_original_sector(wc, e);
 		wb->page_offset = PAGE_SIZE;
 		if (max_pages <= WB_LIST_INLINE ||
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index fa68336560c3..a57556e58808 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -7,6 +7,7 @@
 #include "dm-zoned.h"
 
 #include <linux/module.h>
+#include <linux/ioprio.h>
 #include <linux/crc32.h>
 
 #define	DM_MSG_PREFIX		"zoned metadata"
@@ -439,6 +440,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_READ, REQ_META | REQ_PRIO);
 	bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 	submit_bio(bio);
 
 	return mblk;
@@ -589,6 +591,7 @@ static void dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_META | REQ_PRIO);
 	bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 	submit_bio(bio);
 }
 
@@ -609,6 +612,7 @@ static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
 	bio_set_dev(bio, zmd->dev->bdev);
 	bio_set_op_attrs(bio, op, REQ_SYNC | REQ_META | REQ_PRIO);
 	bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
+	bio_set_prio(bio, get_current_ioprio());
 	ret = submit_bio_wait(bio);
 	bio_put(bio);
 
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 45ffa23fa85d..def94bdc2a48 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -446,6 +446,7 @@ static void submit_flushes(struct work_struct *ws)
 			bi->bi_private = rdev;
 			bio_set_dev(bi, rdev->bdev);
 			bi->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
+			bio_set_prio(bi, get_current_ioprio());
 			atomic_inc(&mddev->flush_pending);
 			submit_bio(bi);
 			rcu_read_lock();
@@ -822,6 +823,8 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
 	    !test_bit(LastDev, &rdev->flags))
 		ff = MD_FAILFAST;
 	bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH | REQ_FUA | ff;
+	bio_set_prio(bio, get_current_ioprio());
+
 
 	atomic_inc(&mddev->pending_writes);
 	submit_bio(bio);
@@ -856,6 +859,7 @@ int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
 	else
 		bio->bi_iter.bi_sector = sector + rdev->data_offset;
 	bio_add_page(bio, page, size, 0);
+	bio_set_prio(bio, get_current_ioprio());
 
 	submit_bio_wait(bio);
 
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index cbbe6b6535be..7efbb910b133 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -656,6 +656,8 @@ static void r5l_do_submit_io(struct r5l_log *log, struct r5l_io_unit *io)
 			io->split_bio->bi_opf |= REQ_PREFLUSH;
 		if (io->has_fua)
 			io->split_bio->bi_opf |= REQ_FUA;
+
+		bio_set_prio(io->split_bio, get_current_ioprio());
 		submit_bio(io->split_bio);
 	}
 
@@ -663,6 +665,7 @@ static void r5l_do_submit_io(struct r5l_log *log, struct r5l_io_unit *io)
 		io->current_bio->bi_opf |= REQ_PREFLUSH;
 	if (io->has_fua)
 		io->current_bio->bi_opf |= REQ_FUA;
+	bio_set_prio(io->current_bio, get_current_ioprio());
 	submit_bio(io->current_bio);
 }
 
@@ -1315,6 +1318,7 @@ void r5l_flush_stripe_to_raid(struct r5l_log *log)
 		return;
 	bio_reset(&log->flush_bio);
 	bio_set_dev(&log->flush_bio, log->rdev->bdev);
+	bio_set_prio(&log->flush_bio, get_current_ioprio());
 	log->flush_bio.bi_end_io = r5l_log_flush_endio;
 	log->flush_bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
 	submit_bio(&log->flush_bio);
diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
index 17e9e7d51097..badfdad742db 100644
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@ -18,6 +18,7 @@
 #include <linux/crc32c.h>
 #include <linux/async_tx.h>
 #include <linux/raid/md_p.h>
+#include <linux/ioprio.h>
 #include "md.h"
 #include "raid5.h"
 #include "raid5-log.h"
@@ -511,6 +512,7 @@ static void ppl_submit_iounit(struct ppl_io_unit *io)
 			bio_copy_dev(bio, prev);
 			bio->bi_iter.bi_sector = bio_end_sector(prev);
 			bio_add_page(bio, sh->ppl_page, PAGE_SIZE, 0);
+			bio_set_prio(bio, get_current_ioprio());
 
 			bio_chain(bio, prev);
 			ppl_submit_iounit_bio(io, prev);
@@ -647,6 +649,7 @@ static void ppl_do_flush(struct ppl_io_unit *io)
 
 			bio = bio_alloc_bioset(GFP_NOIO, 0, &ppl_conf->flush_bs);
 			bio_set_dev(bio, bdev);
+			bio_set_prio(bio, get_current_ioprio());
 			bio->bi_private = io;
 			bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
 			bio->bi_end_io = ppl_flush_endio;
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 3efc52f9c309..c4feb8e12d26 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -6,6 +6,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 #include <linux/blkdev.h>
 #include <linux/module.h>
+#include <linux/ioprio.h>
 #include "nvmet.h"
 
 int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
@@ -142,6 +143,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	bio->bi_private = req;
 	bio->bi_end_io = nvmet_bio_done;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	for_each_sg(req->sg, sg, req->sg_cnt, i) {
 		while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset)
@@ -149,9 +151,11 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 			struct bio *prev = bio;
 
 			bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
+			bio_set_prio(bio, get_current_ioprio());
 			bio_set_dev(bio, req->ns->bdev);
 			bio->bi_iter.bi_sector = sector;
 			bio_set_op_attrs(bio, op, op_flags);
+			bio_set_prio(bio, get_current_ioprio());
 
 			bio_chain(bio, prev);
 			submit_bio(prev);
@@ -170,6 +174,7 @@ static void nvmet_bdev_execute_flush(struct nvmet_req *req)
 
 	bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec));
 	bio_set_dev(bio, req->ns->bdev);
+	bio_set_prio(bio, get_current_ioprio());
 	bio->bi_private = req;
 	bio->bi_end_io = nvmet_bio_done;
 	bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
@@ -226,6 +231,7 @@ static void nvmet_bdev_execute_discard(struct nvmet_req *req)
 			bio->bi_status = BLK_STS_IOERR;
 			bio_endio(bio);
 		} else {
+			bio_set_prio(bio, get_current_ioprio());
 			submit_bio(bio);
 		}
 	} else {
@@ -266,6 +272,7 @@ static void nvmet_bdev_execute_write_zeroes(struct nvmet_req *req)
 	if (bio) {
 		bio->bi_private = req;
 		bio->bi_end_io = nvmet_bio_done;
+		bio_set_prio(bio, get_current_ioprio());
 		submit_bio(bio);
 	} else {
 		nvmet_req_complete(req, errno_to_nvme_status(req, ret));
diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index e3bfde00c7d2..6df239a5856b 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -22,6 +22,7 @@
 #include <linux/cleancache.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/ioprio.h>
 #include "erofs_fs.h"
 
 /* redefine pr_fmt "erofs: " */
@@ -482,12 +483,14 @@ erofs_grab_bio(struct super_block *sb,
 	bio->bi_end_io = endio;
 	bio_set_dev(bio, sb->s_bdev);
 	bio->bi_iter.bi_sector = (sector_t)blkaddr << LOG_SECTORS_PER_BLOCK;
+	bio_set_prio(bio, get_current_ioprio());
 	return bio;
 }
 
 static inline void __submit_bio(struct bio *bio, unsigned op, unsigned op_flags)
 {
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 	submit_bio(bio);
 }
 
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index b5ed9c377060..0db3fb9f339a 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -37,6 +37,7 @@
 #include <linux/module.h>
 #include <scsi/scsi_proto.h>
 #include <asm/unaligned.h>
+#include <linux/ioprio.h>
 
 #include <target/target_core_base.h>
 #include <target/target_core_backend.h>
@@ -341,6 +342,7 @@ iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int op,
 	bio->bi_end_io = &iblock_bio_done;
 	bio->bi_iter.bi_sector = lba;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	return bio;
 }
@@ -395,6 +397,7 @@ iblock_execute_sync_cache(struct se_cmd *cmd)
 	bio->bi_end_io = iblock_end_io_flush;
 	bio_set_dev(bio, ib_dev->ibd_bd);
 	bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
+	bio_set_prio(bio, get_current_ioprio());
 	if (!immed)
 		bio->bi_private = cmd;
 	submit_bio(bio);
-- 
2.19.1



* [RFC PATCH 14/18] fs: set bio iopriority field
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (12 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 13/18] drivers: set bio iopriority field Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 15/18] power/swap: " Chaitanya Kulkarni
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 fs/btrfs/disk-io.c               | 2 ++
 fs/btrfs/extent_io.c             | 3 +++
 fs/btrfs/raid56.c                | 6 ++++++
 fs/btrfs/scrub.c                 | 2 ++
 fs/btrfs/volumes.c               | 3 +++
 fs/buffer.c                      | 2 ++
 fs/crypto/bio.c                  | 3 +++
 fs/direct-io.c                   | 2 ++
 fs/ext4/page-io.c                | 2 ++
 fs/ext4/readpage.c               | 1 +
 fs/f2fs/data.c                   | 3 +++
 fs/f2fs/segment.c                | 1 +
 fs/gfs2/lops.c                   | 2 ++
 fs/gfs2/meta_io.c                | 2 ++
 fs/gfs2/ops_fstype.c             | 2 ++
 fs/hfsplus/wrapper.c             | 2 ++
 fs/iomap.c                       | 2 ++
 fs/jfs/jfs_logmgr.c              | 3 +++
 fs/jfs/jfs_metapage.c            | 3 +++
 fs/mpage.c                       | 1 +
 fs/nfs/blocklayout/blocklayout.c | 2 ++
 fs/nilfs2/segbuf.c               | 2 ++
 fs/ocfs2/cluster/heartbeat.c     | 2 ++
 fs/xfs/xfs_aops.c                | 3 +++
 fs/xfs/xfs_buf.c                 | 2 ++
 25 files changed, 58 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6fe9197f6ee4..28e147b78b25 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -19,6 +19,7 @@
 #include <linux/crc32c.h>
 #include <linux/sched/mm.h>
 #include <asm/unaligned.h>
+#include <linux/ioprio.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -3603,6 +3604,7 @@ static void write_dev_flush(struct btrfs_device *device)
 	bio_reset(bio);
 	bio->bi_end_io = btrfs_end_empty_barrier;
 	bio_set_dev(bio, device->bdev);
+	bio_set_prio(bio, get_current_ioprio());
 	bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH;
 	init_completion(&device->flush_wait);
 	bio->bi_private = &device->flush_wait;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ca8b8e785cf3..b0689a8aade0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -13,6 +13,7 @@
 #include <linux/pagevec.h>
 #include <linux/prefetch.h>
 #include <linux/cleancache.h>
+#include <linux/ioprio.h>
 #include "extent_io.h"
 #include "extent_map.h"
 #include "ctree.h"
@@ -160,6 +161,7 @@ static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
 	start = page_offset(bv.bv_page) + bv.bv_offset;
 
 	bio->bi_private = NULL;
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (tree->ops)
 		ret = tree->ops->submit_bio_hook(tree->private_data, bio,
@@ -2043,6 +2045,7 @@ int repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
 	bio_set_dev(bio, dev->bdev);
 	bio->bi_opf = REQ_OP_WRITE | REQ_SYNC;
 	bio_add_page(bio, page, length, pg_offset);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (btrfsic_submit_bio_wait(bio)) {
 		/* try to remap that extent elsewhere? */
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 67a6f7d47402..77211e45b11c 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -13,6 +13,7 @@
 #include <linux/list_sort.h>
 #include <linux/raid/xor.h>
 #include <linux/mm.h>
+#include <linux/ioprio.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "volumes.h"
@@ -1324,6 +1325,7 @@ static noinline void finish_rmw(struct btrfs_raid_bio *rbio)
 		bio->bi_private = rbio;
 		bio->bi_end_io = raid_write_end_io;
 		bio->bi_opf = REQ_OP_WRITE;
+		bio_set_prio(bio, get_current_ioprio());
 
 		submit_bio(bio);
 	}
@@ -1567,6 +1569,7 @@ static int raid56_rmw_stripe(struct btrfs_raid_bio *rbio)
 		bio->bi_private = rbio;
 		bio->bi_end_io = raid_rmw_end_io;
 		bio->bi_opf = REQ_OP_READ;
+		bio_set_prio(bio, get_current_ioprio());
 
 		btrfs_bio_wq_end_io(rbio->fs_info, bio, BTRFS_WQ_ENDIO_RAID56);
 
@@ -2114,6 +2117,7 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
 		bio->bi_private = rbio;
 		bio->bi_end_io = raid_recover_end_io;
 		bio->bi_opf = REQ_OP_READ;
+		bio_set_prio(bio, get_current_ioprio());
 
 		btrfs_bio_wq_end_io(rbio->fs_info, bio, BTRFS_WQ_ENDIO_RAID56);
 
@@ -2487,6 +2491,7 @@ static noinline void finish_parity_scrub(struct btrfs_raid_bio *rbio,
 		bio->bi_private = rbio;
 		bio->bi_end_io = raid_write_end_io;
 		bio->bi_opf = REQ_OP_WRITE;
+		bio_set_prio(bio, get_current_ioprio());
 
 		submit_bio(bio);
 	}
@@ -2669,6 +2674,7 @@ static void raid56_parity_scrub_stripe(struct btrfs_raid_bio *rbio)
 		bio->bi_private = rbio;
 		bio->bi_end_io = raid56_parity_scrub_end_io;
 		bio->bi_opf = REQ_OP_READ;
+		bio_set_prio(bio, get_current_ioprio());
 
 		btrfs_bio_wq_end_io(rbio->fs_info, bio, BTRFS_WQ_ENDIO_RAID56);
 
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index a99588536c79..b0be49a6a87a 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1485,6 +1485,7 @@ static void scrub_recheck_block(struct btrfs_fs_info *fs_info,
 		WARN_ON(!page->page);
 		bio = btrfs_io_bio_alloc(1);
 		bio_set_dev(bio, page->dev->bdev);
+		bio_set_prio(bio, get_current_ioprio());
 
 		bio_add_page(bio, page->page, PAGE_SIZE, 0);
 		bio->bi_iter.bi_sector = page->physical >> 9;
@@ -2058,6 +2059,7 @@ static int scrub_add_page_to_rd_bio(struct scrub_ctx *sctx,
 		bio->bi_private = sbio;
 		bio->bi_end_io = scrub_bio_end_io;
 		bio_set_dev(bio, sbio->dev->bdev);
+		bio_set_prio(bio, get_current_ioprio());
 		bio->bi_iter.bi_sector = sbio->physical >> 9;
 		bio->bi_opf = REQ_OP_READ;
 		sbio->status = 0;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index db934ceae9c1..ea95c719aa11 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -14,6 +14,7 @@
 #include <linux/semaphore.h>
 #include <linux/uuid.h>
 #include <linux/list_sort.h>
+#include <linux/ioprio.h>
 #include "ctree.h"
 #include "extent_map.h"
 #include "disk-io.h"
@@ -641,6 +642,7 @@ static noinline void run_scheduled_bios(struct btrfs_device *device)
 			sync_pending = 0;
 		}
 
+		bio_set_prio(cur, get_current_ioprio());
 		btrfsic_submit_bio(cur);
 		num_run++;
 		batch_run++;
@@ -6499,6 +6501,7 @@ static void submit_stripe_bio(struct btrfs_bio *bbio, struct bio *bio,
 	struct btrfs_fs_info *fs_info = bbio->fs_info;
 
 	bio->bi_private = bbio;
+	bio_set_prio(bio, get_current_ioprio());
 	btrfs_io_bio(bio)->stripe_index = dev_nr;
 	bio->bi_end_io = btrfs_end_bio;
 	bio->bi_iter.bi_sector = physical >> 9;
diff --git a/fs/buffer.c b/fs/buffer.c
index ce357602f471..a172f032b739 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -44,6 +44,7 @@
 #include <linux/mpage.h>
 #include <linux/bit_spinlock.h>
 #include <linux/pagevec.h>
+#include <linux/ioprio.h>
 #include <linux/sched/mm.h>
 #include <trace/events/block.h>
 
@@ -3089,6 +3090,7 @@ static int submit_bh_wbc(int op, int op_flags, struct buffer_head *bh,
 	if (buffer_prio(bh))
 		op_flags |= REQ_PRIO;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (wbc) {
 		wbc_init_bio(wbc, bio);
diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 5759bcd018cd..75b2284e03cb 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/bio.h>
 #include <linux/namei.h>
+#include <linux/ioprio.h>
 #include "fscrypt_private.h"
 
 static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
@@ -139,6 +140,8 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 			err = -EIO;
 			goto errout;
 		}
+
+		bio_set_prio(bio, get_current_ioprio());
 		err = submit_bio_wait(bio);
 		if (err == 0 && bio->bi_status)
 			err = -EIO;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 9bb015bc4a83..744e5ca35def 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -37,6 +37,7 @@
 #include <linux/uio.h>
 #include <linux/atomic.h>
 #include <linux/prefetch.h>
+#include <linux/ioprio.h>
 
 /*
  * How many user pages to map in one call to get_user_pages().  This determines
@@ -440,6 +441,7 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 	bio_set_dev(bio, bdev);
 	bio->bi_iter.bi_sector = first_sector;
 	bio_set_op_attrs(bio, dio->op, dio->op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 	if (dio->is_async)
 		bio->bi_end_io = dio_bio_end_aio;
 	else
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 3e9298e6a705..7c9857c276aa 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <linux/mm.h>
 #include <linux/backing-dev.h>
+#include <linux/ioprio.h>
 
 #include "ext4_jbd2.h"
 #include "xattr.h"
@@ -382,6 +383,7 @@ static int io_submit_init_bio(struct ext4_io_submit *io,
 	io->io_bio = bio;
 	io->io_next_block = bh->b_blocknr;
 	wbc_init_bio(io->io_wbc, bio);
+	bio_set_prio(bio, get_current_ioprio());
 	return 0;
 }
 
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 3adadf461825..243ca2a33171 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -261,6 +261,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
 			bio->bi_private = ctx;
 			bio_set_op_attrs(bio, REQ_OP_READ,
 						is_readahead ? REQ_RAHEAD : 0);
+			bio_set_prio(bio, get_current_ioprio());
 		}
 
 		length = first_hole << blkbits;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9727944139f2..c0050a6e0723 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -18,6 +18,7 @@
 #include <linux/uio.h>
 #include <linux/cleancache.h>
 #include <linux/sched/signal.h>
+#include <linux/ioprio.h>
 
 #include "f2fs.h"
 #include "node.h"
@@ -276,6 +277,7 @@ static struct bio *__bio_alloc(struct f2fs_sb_info *sbi, block_t blk_addr,
 	if (wbc)
 		wbc_init_bio(wbc, bio);
 
+	bio_set_prio(bio, get_current_ioprio());
 	return bio;
 }
 
@@ -569,6 +571,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
 	f2fs_target_device(sbi, blkaddr, bio);
 	bio->bi_end_io = f2fs_read_end_io;
 	bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (f2fs_encrypted_file(inode))
 		post_read_steps |= 1 << STEP_DECRYPT;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index aa7fe79b62b2..d1b67ff8ce66 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -567,6 +567,7 @@ static int __submit_flush_wait(struct f2fs_sb_info *sbi,
 
 	bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH;
 	bio_set_dev(bio, bdev);
+	bio_set_prio(bio, get_current_ioprio());
 	ret = submit_bio_wait(bio);
 	bio_put(bio);
 
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 8722c60b11fe..fd9752209be6 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -17,6 +17,7 @@
 #include <linux/bio.h>
 #include <linux/fs.h>
 #include <linux/list_sort.h>
+#include <linux/ioprio.h>
 
 #include "dir.h"
 #include "gfs2.h"
@@ -272,6 +273,7 @@ static struct bio *gfs2_log_alloc_bio(struct gfs2_sbd *sdp, u64 blkno,
 	bio_set_dev(bio, sb->s_bdev);
 	bio->bi_end_io = end_io;
 	bio->bi_private = sdp;
+	bio_set_prio(bio, get_current_ioprio());
 
 	return bio;
 }
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 3201342404a7..9502b808c78c 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -19,6 +19,7 @@
 #include <linux/delay.h>
 #include <linux/bio.h>
 #include <linux/gfs2_ondisk.h>
+#include <linux/ioprio.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -234,6 +235,7 @@ static void gfs2_submit_bhs(int op, int op_flags, struct buffer_head *bhs[],
 		}
 		bio->bi_end_io = gfs2_meta_read_endio;
 		bio_set_op_attrs(bio, op, op_flags);
+		bio_set_prio(bio, get_current_ioprio());
 		submit_bio(bio);
 	}
 }
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index b041cb8ae383..a7aa38456f31 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -24,6 +24,7 @@
 #include <linux/lockdep.h>
 #include <linux/module.h>
 #include <linux/backing-dev.h>
+#include <linux/ioprio.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -248,6 +249,7 @@ static int gfs2_read_super(struct gfs2_sbd *sdp, sector_t sector, int silent)
 	bio->bi_end_io = end_bio_io_page;
 	bio->bi_private = page;
 	bio_set_op_attrs(bio, REQ_OP_READ, REQ_META);
+	bio_set_prio(bio, get_current_ioprio());
 	submit_bio(bio);
 	wait_on_page_locked(page);
 	bio_put(bio);
diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c
index 08c1580bdf7a..b04eae16cf53 100644
--- a/fs/hfsplus/wrapper.c
+++ b/fs/hfsplus/wrapper.c
@@ -14,6 +14,7 @@
 #include <linux/cdrom.h>
 #include <linux/genhd.h>
 #include <asm/unaligned.h>
+#include <linux/ioprio.h>
 
 #include "hfsplus_fs.h"
 #include "hfsplus_raw.h"
@@ -68,6 +69,7 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
 	bio->bi_iter.bi_sector = sector;
 	bio_set_dev(bio, sb->s_bdev);
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (op != WRITE && data)
 		*data = (u8 *)buf + offset;
diff --git a/fs/iomap.c b/fs/iomap.c
index abdd18e404f8..47ca9a4fe427 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -30,6 +30,7 @@
 #include <linux/task_io_accounting_ops.h>
 #include <linux/dax.h>
 #include <linux/sched/signal.h>
+#include <linux/ioprio.h>
 
 #include "internal.h"
 
@@ -1618,6 +1619,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
+	bio_set_prio(bio, get_current_ioprio());
 	iomap_dio_submit_bio(dio, iomap, bio);
 }
 
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 6b68df395892..2311868160a2 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -72,6 +72,7 @@
 #include <linux/mutex.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
+#include <linux/ioprio.h>
 #include "jfs_incore.h"
 #include "jfs_filsys.h"
 #include "jfs_metapage.h"
@@ -1996,6 +1997,7 @@ static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp)
 
 	bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9);
 	bio_set_dev(bio, log->bdev);
+	bio_set_prio(bio, get_current_ioprio());
 
 	bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset);
 	BUG_ON(bio->bi_iter.bi_size != LOGPSIZE);
@@ -2140,6 +2142,7 @@ static void lbmStartIO(struct lbuf * bp)
 	bio = bio_alloc(GFP_NOFS, 1);
 	bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9);
 	bio_set_dev(bio, log->bdev);
+	bio_set_prio(bio, get_current_ioprio());
 
 	bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset);
 	BUG_ON(bio->bi_iter.bi_size != LOGPSIZE);
diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index fa2c6824c7f2..021841273a25 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -26,6 +26,7 @@
 #include <linux/buffer_head.h>
 #include <linux/mempool.h>
 #include <linux/seq_file.h>
+#include <linux/ioprio.h>
 #include "jfs_incore.h"
 #include "jfs_superblock.h"
 #include "jfs_filsys.h"
@@ -435,6 +436,7 @@ static int metapage_writepage(struct page *page, struct writeback_control *wbc)
 		bio->bi_end_io = metapage_write_end_io;
 		bio->bi_private = page;
 		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
+		bio_set_prio(bio, get_current_ioprio());
 
 		/* Don't call bio_add_page yet, we may add to this vec */
 		bio_offset = offset;
@@ -516,6 +518,7 @@ static int metapage_readpage(struct file *fp, struct page *page)
 			bio->bi_end_io = metapage_read_end_io;
 			bio->bi_private = page;
 			bio_set_op_attrs(bio, REQ_OP_READ, 0);
+			bio_set_prio(bio, get_current_ioprio());
 			len = xlen << inode->i_blkbits;
 			offset = block_offset << inode->i_blkbits;
 			if (bio_add_page(bio, page, len, offset) < len)
diff --git a/fs/mpage.c b/fs/mpage.c
index 3f19da75178b..766be568395a 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -87,6 +87,7 @@ mpage_alloc(struct block_device *bdev,
 	if (bio) {
 		bio_set_dev(bio, bdev);
 		bio->bi_iter.bi_sector = first_sector;
+		bio_set_prio(bio, get_current_ioprio());
 	}
 	return bio;
 }
diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 690221747b47..ccec9bfad963 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -37,6 +37,7 @@
 #include <linux/bio.h>		/* struct bio */
 #include <linux/prefetch.h>
 #include <linux/pagevec.h>
+#include <linux/ioprio.h>
 
 #include "../pnfs.h"
 #include "../nfs4session.h"
@@ -133,6 +134,7 @@ bl_alloc_init_bio(int npg, struct block_device *bdev, sector_t disk_sector,
 		bio_set_dev(bio, bdev);
 		bio->bi_end_io = end_io;
 		bio->bi_private = par;
+		bio_set_prio(bio, get_current_ioprio());
 	}
 	return bio;
 }
diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
index 20c479b5e41b..8632bbb1c620 100644
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@ -13,6 +13,7 @@
 #include <linux/crc32.h>
 #include <linux/backing-dev.h>
 #include <linux/slab.h>
+#include <linux/ioprio.h>
 #include "page.h"
 #include "segbuf.h"
 
@@ -394,6 +395,7 @@ static struct bio *nilfs_alloc_seg_bio(struct the_nilfs *nilfs, sector_t start,
 		bio_set_dev(bio, nilfs->ns_bdev);
 		bio->bi_iter.bi_sector =
 			start << (nilfs->ns_blocksize_bits - 9);
+		bio_set_prio(bio, get_current_ioprio());
 	}
 	return bio;
 }
diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index f3c20b279eb2..38b6a8799613 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -37,6 +37,7 @@
 #include <linux/slab.h>
 #include <linux/bitmap.h>
 #include <linux/ktime.h>
+#include <linux/ioprio.h>
 #include "heartbeat.h"
 #include "tcp.h"
 #include "nodemanager.h"
@@ -557,6 +558,7 @@ static struct bio *o2hb_setup_one_bio(struct o2hb_region *reg,
 	bio->bi_private = wc;
 	bio->bi_end_io = o2hb_bio_end_io;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	vec_start = (cs << bits) % PAGE_SIZE;
 	while(cs < max_slots) {
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 3619e9e8d359..30a27c386ff0 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -22,6 +22,7 @@
 #include "xfs_bmap_btree.h"
 #include "xfs_reflink.h"
 #include <linux/writeback.h>
+#include <linux/ioprio.h>
 
 /*
  * structure owned by writepages passed to individual writepage calls
@@ -597,6 +598,7 @@ xfs_alloc_ioend(
 	INIT_WORK(&ioend->io_work, xfs_end_io);
 	ioend->io_append_trans = NULL;
 	ioend->io_bio = bio;
+	bio_set_prio(bio, get_current_ioprio());
 	return ioend;
 }
 
@@ -620,6 +622,7 @@ xfs_chain_bio(
 	bio_set_dev(new, bdev);
 	new->bi_iter.bi_sector = sector;
 	bio_chain(ioend->io_bio, new);
+	bio_set_prio(new, get_current_ioprio());
 	bio_get(ioend->io_bio);		/* for xfs_destroy_ioend */
 	ioend->io_bio->bi_opf = REQ_OP_WRITE | wbc_to_write_flags(wbc);
 	ioend->io_bio->bi_write_hint = ioend->io_inode->i_write_hint;
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 548344e25128..2a91b4414e84 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -21,6 +21,7 @@
 #include <linux/migrate.h>
 #include <linux/backing-dev.h>
 #include <linux/freezer.h>
+#include <linux/ioprio.h>
 
 #include "xfs_format.h"
 #include "xfs_log_format.h"
@@ -1381,6 +1382,7 @@ xfs_buf_ioapply_map(
 	bio->bi_end_io = xfs_buf_bio_end_io;
 	bio->bi_private = bp;
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	for (; size && nr_pages; nr_pages--, page_index++) {
 		int	rbytes, nbytes = PAGE_SIZE - offset;
-- 
2.19.1



* [RFC PATCH 15/18] power/swap: set bio iopriority field
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (13 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 14/18] fs: " Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 16/18] mm: " Chaitanya Kulkarni
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 kernel/power/swap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index d7f6c1a288d3..74b6d11fabe2 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -33,6 +33,7 @@
 #include <linux/kthread.h>
 #include <linux/crc32.h>
 #include <linux/ktime.h>
+#include <linux/ioprio.h>
 
 #include "power.h"
 
@@ -273,6 +274,7 @@ static int hib_submit_io(int op, int op_flags, pgoff_t page_off, void *addr,
 	bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
 	bio_set_dev(bio, hib_resume_bdev);
 	bio_set_op_attrs(bio, op, op_flags);
+	bio_set_prio(bio, get_current_ioprio());
 
 	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
 		pr_err("Adding page to bio failed at %llu\n",
-- 
2.19.1



* [RFC PATCH 16/18] mm: set bio iopriority field
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (14 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 15/18] power/swap: " Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 17/18] null_blk: add write-zeroes flag to nullb_device Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 18/18] null_blk: add module param discard/write-zeroes Chaitanya Kulkarni
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 mm/page_io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/page_io.c b/mm/page_io.c
index 2e8019d0e048..950cc002f60a 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -24,6 +24,7 @@
 #include <linux/blkdev.h>
 #include <linux/uio.h>
 #include <linux/sched/task.h>
+#include <linux/ioprio.h>
 #include <asm/pgtable.h>
 
 static struct bio *get_swap_bio(gfp_t gfp_flags,
@@ -40,6 +41,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
 		bio_set_dev(bio, bdev);
 		bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
 		bio->bi_end_io = end_io;
+		bio_set_prio(bio, get_current_ioprio());
 
 		for (i = 0; i < nr; i++)
 			bio_add_page(bio, page + i, PAGE_SIZE, 0);
-- 
2.19.1



* [RFC PATCH 17/18] null_blk: add write-zeroes flag to nullb_device
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (15 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 16/18] mm: " Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  2019-05-01  4:28 ` [RFC PATCH 18/18] null_blk: add module param discard/write-zeroes Chaitanya Kulkarni
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This patch adds a new write_zeroes flag to struct nullb_device, next to
the existing discard flag, to enable the REQ_OP_WRITE_ZEROES operation
on null_blk.

This is needed for testing the blktrace extension with different
priorities on write-zeroes operations.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
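Note (not part of the patch): a minimal userspace sketch of how a test
might issue a write-zeroes request at a chosen I/O priority. It relies on
the ioprio_set(2) syscall and the BLKZEROOUT ioctl, both of which exist
today. The helper names (ioprio_pack, set_self_ioprio, zero_range) and
the EX_* constants are made up for illustration; the constants are local
copies of the kernel's priority encoding. Whether BLKZEROOUT ends up as
REQ_OP_WRITE_ZEROES depends on the device advertising support for it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/fs.h>		/* BLKZEROOUT */

/* Illustrative local copies of the kernel's ioprio encoding. */
#define EX_IOPRIO_CLASS_SHIFT	13
#define EX_IOPRIO_WHO_PROCESS	1
enum { EX_IOPRIO_CLASS_NONE, EX_IOPRIO_CLASS_RT,
       EX_IOPRIO_CLASS_BE, EX_IOPRIO_CLASS_IDLE };

/* Pack a priority class and a level (0..7) into one ioprio value. */
static int ioprio_pack(int class, int level)
{
	assert(level >= 0 && level < 8);
	return (class << EX_IOPRIO_CLASS_SHIFT) | level;
}

/* Set the I/O priority of the calling thread; returns 0 or -1. */
static int set_self_ioprio(int class, int level)
{
	return (int)syscall(SYS_ioprio_set, EX_IOPRIO_WHO_PROCESS, 0,
			    ioprio_pack(class, level));
}

/* Ask the kernel to zero len bytes starting at byte offset start. */
static int zero_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };

	return ioctl(fd, BLKZEROOUT, range);
}
```

A test would open the null_blk device for writing, call
set_self_ioprio(EX_IOPRIO_CLASS_BE, 4) and then zero_range(fd, 0, 1 << 20)
while blktrace is running. The real-time class may need extra privileges.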
 drivers/block/null_blk.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h
index 34b22d6523ba..ecd1e45f6eb9 100644
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -63,6 +63,7 @@ struct nullb_device {
 	bool power; /* power on/off the device */
 	bool memory_backed; /* if data is stored in memory */
 	bool discard; /* if support discard */
+	bool write_zeroes; /* if support write_zeroes */
 	bool zoned; /* if device is zoned */
 };
 
-- 
2.19.1



* [RFC PATCH 18/18] null_blk: add module param discard/write-zeroes
  2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
                   ` (16 preceding siblings ...)
  2019-05-01  4:28 ` [RFC PATCH 17/18] null_blk: add write-zeroes flag to nullb_device Chaitanya Kulkarni
@ 2019-05-01  4:28 ` Chaitanya Kulkarni
  17 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-01  4:28 UTC (permalink / raw)
  To: linux-block; +Cc: Chaitanya Kulkarni

This patch adds two new module params, discard and write_zeroes,
in order to test the REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES
operations.

This is needed to test the latest blktrace code changes, which enable
us to track more request-based operations such as write-zeroes.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
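Note (not part of the patch): a small sketch of how a test could confirm
the two parameters after the module is loaded. Both are registered with
mode 0444, so they should be readable under
/sys/module/null_blk/parameters/, where the kernel prints bool
parameters as "Y" or "N". The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Map the text of a bool parameter file to 1 (Y), 0 (N) or -1 (unknown). */
static int parse_bool_param(const char *text)
{
	if (!text)
		return -1;
	if (text[0] == 'Y')
		return 1;
	if (text[0] == 'N')
		return 0;
	return -1;
}

/* Read /sys/module/<module>/parameters/<name>; returns -1 if unreadable. */
static int read_bool_param(const char *module, const char *name)
{
	char path[256], buf[8] = "";
	FILE *f;

	assert(module && name);
	if (snprintf(path, sizeof(path), "/sys/module/%s/parameters/%s",
		     module, name) >= (int)sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_bool_param(buf);
}
```

With the module loaded as "modprobe null_blk discard=1 write_zeroes=1",
read_bool_param("null_blk", "write_zeroes") would be expected to return 1;
it returns -1 when the module or parameter is absent.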
 drivers/block/null_blk_main.c | 37 +++++++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index d7ac09c092f2..93fe2c843d03 100644
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -192,6 +192,14 @@ static unsigned int g_zone_nr_conv;
 module_param_named(zone_nr_conv, g_zone_nr_conv, uint, 0444);
 MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones when block device is zoned. Default: 0");
 
+static bool g_discard;
+module_param_named(discard, g_discard, bool, 0444);
+MODULE_PARM_DESC(discard, "Allow REQ_OP_DISCARD processing. Default: false");
+
+static bool g_write_zeroes;
+module_param_named(write_zeroes, g_write_zeroes, bool, 0444);
+MODULE_PARM_DESC(write_zeroes, "Allow REQ_OP_WRITE_ZEROES processing. Default: false");
+
 static struct nullb_device *null_alloc_dev(void);
 static void null_free_dev(struct nullb_device *dev);
 static void null_del_dev(struct nullb *nullb);
@@ -527,6 +535,12 @@ static struct nullb_device *null_alloc_dev(void)
 	dev->zoned = g_zoned;
 	dev->zone_size = g_zone_size;
 	dev->zone_nr_conv = g_zone_nr_conv;
+	dev->discard = g_discard;
+	dev->write_zeroes = g_write_zeroes;
+	pr_info("null_blk: discard %s\n",
+			dev->discard ? "TRUE" : "FALSE");
+	pr_info("null_blk: write_zeroes %s\n",
+			dev->write_zeroes ? "TRUE" : "FALSE");
 	return dev;
 }
 
@@ -1059,7 +1073,11 @@ static int null_handle_rq(struct nullb_cmd *cmd)
 
 	sector = blk_rq_pos(rq);
 
-	if (req_op(rq) == REQ_OP_DISCARD) {
+	/* just discard for write zeroes for now */
+	switch (req_op(rq)) {
+	case REQ_OP_DISCARD:
+		/* fall through */
+	case REQ_OP_WRITE_ZEROES:
 		null_handle_discard(nullb, sector, blk_rq_bytes(rq));
 		return 0;
 	}
@@ -1093,7 +1111,11 @@ static int null_handle_bio(struct nullb_cmd *cmd)
 
 	sector = bio->bi_iter.bi_sector;
 
-	if (bio_op(bio) == REQ_OP_DISCARD) {
+	/* just discard for write zeroes for now */
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+		/* fall through */
+	case REQ_OP_WRITE_ZEROES:
 		null_handle_discard(nullb, sector,
 			bio_sectors(bio) << SECTOR_SHIFT);
 		return 0;
@@ -1192,7 +1214,6 @@ static blk_status_t null_handle_cmd(struct nullb_cmd *cmd)
 		}
 	}
 	cmd->error = errno_to_blk_status(err);
-
 	if (!cmd->error && dev->zoned) {
 		sector_t sector;
 		unsigned int nr_sectors;
@@ -1402,7 +1423,7 @@ static void null_del_dev(struct nullb *nullb)
 
 static void null_config_discard(struct nullb *nullb)
 {
-	if (nullb->dev->discard == false)
+	if (!nullb->dev->discard)
 		return;
 	nullb->q->limits.discard_granularity = nullb->dev->blocksize;
 	nullb->q->limits.discard_alignment = nullb->dev->blocksize;
@@ -1410,6 +1431,13 @@ static void null_config_discard(struct nullb *nullb)
 	blk_queue_flag_set(QUEUE_FLAG_DISCARD, nullb->q);
 }
 
+static void null_config_write_zeroes(struct nullb *nullb)
+{
+	if (!nullb->dev->write_zeroes)
+		return;
+	blk_queue_max_write_zeroes_sectors(nullb->q, UINT_MAX >> 9);
+}
+
 static int null_open(struct block_device *bdev, fmode_t mode)
 {
 	return 0;
@@ -1702,6 +1730,7 @@ static int null_add_dev(struct nullb_device *dev)
 	blk_queue_physical_block_size(nullb->q, dev->blocksize);
 
 	null_config_discard(nullb);
+	null_config_write_zeroes(nullb);
 
 	sprintf(nullb->disk_name, "nullb%d", nullb->index);
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 13/18] drivers: set bio iopriority field
  2019-05-01  4:28 ` [RFC PATCH 13/18] drivers: set bio iopriority field Chaitanya Kulkarni
@ 2019-05-01  6:23   ` Javier González
  0 siblings, 0 replies; 27+ messages in thread
From: Javier González @ 2019-05-01  6:23 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-block

> On 1 May 2019, at 06.28, Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com> wrote:
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
> drivers/block/drbd/drbd_actlog.c    | 2 ++
> drivers/block/drbd/drbd_bitmap.c    | 3 +++
> drivers/block/xen-blkback/blkback.c | 3 +++
> drivers/block/zram/zram_drv.c       | 2 ++
> drivers/lightnvm/pblk-read.c        | 2 ++
> drivers/lightnvm/pblk-write.c       | 1 +
> drivers/md/bcache/journal.c         | 2 ++
> drivers/md/bcache/super.c           | 2 ++
> drivers/md/dm-bufio.c               | 2 ++
> drivers/md/dm-cache-target.c        | 1 +
> drivers/md/dm-io.c                  | 2 ++
> drivers/md/dm-log-writes.c          | 5 +++++
> drivers/md/dm-thin.c                | 1 +
> drivers/md/dm-writecache.c          | 2 ++
> drivers/md/dm-zoned-metadata.c      | 4 ++++
> drivers/md/md.c                     | 4 ++++
> drivers/md/raid5-cache.c            | 4 ++++
> drivers/md/raid5-ppl.c              | 3 +++
> drivers/nvme/target/io-cmd-bdev.c   | 7 +++++++
> drivers/staging/erofs/internal.h    | 3 +++
> drivers/target/target_core_iblock.c | 3 +++
> 21 files changed, 58 insertions(+)
> 
> diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
> index 5f0eaee8c8a7..67235633c172 100644
> --- a/drivers/block/drbd/drbd_actlog.c
> +++ b/drivers/block/drbd/drbd_actlog.c
> @@ -27,6 +27,7 @@
> #include <linux/crc32c.h>
> #include <linux/drbd.h>
> #include <linux/drbd_limits.h>
> +#include <linux/ioprio.h>
> #include "drbd_int.h"
> 
> 
> @@ -159,6 +160,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
> 	bio->bi_private = device;
> 	bio->bi_end_io = drbd_md_endio;
> 	bio_set_op_attrs(bio, op, op_flags);
> +	bio_set_prio(bio, get_current_ioprio());
> 
> 	if (op != REQ_OP_WRITE && device->state.disk == D_DISKLESS && device->ldev == NULL)
> 		/* special case, drbd_md_read() during drbd_adm_attach(): no get_ldev */
> diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
> index 11a85b740327..e7cb027488c7 100644
> --- a/drivers/block/drbd/drbd_bitmap.c
> +++ b/drivers/block/drbd/drbd_bitmap.c
> @@ -30,6 +30,7 @@
> #include <linux/drbd.h>
> #include <linux/slab.h>
> #include <linux/highmem.h>
> +#include <linux/ioprio.h>
> 
> #include "drbd_int.h"
> 
> @@ -1028,6 +1029,8 @@ static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_ho
> 	bio->bi_private = ctx;
> 	bio->bi_end_io = drbd_bm_endio;
> 	bio_set_op_attrs(bio, op, 0);
> +	bio_set_prio(bio, get_current_ioprio());
> +
> 
> 	if (drbd_insert_fault(device, (op == REQ_OP_WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD)) {
> 		bio_io_error(bio);
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index fd1e19f1a49f..41294944267d 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -42,6 +42,7 @@
> #include <linux/delay.h>
> #include <linux/freezer.h>
> #include <linux/bitmap.h>
> +#include <linux/ioprio.h>
> 
> #include <xen/events.h>
> #include <xen/page.h>
> @@ -1375,6 +1376,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
> 			bio->bi_end_io  = end_block_io_op;
> 			bio->bi_iter.bi_sector  = preq.sector_number;
> 			bio_set_op_attrs(bio, operation, operation_flags);
> +			bio_set_prio(bio, get_current_ioprio());
> 		}
> 
> 		preq.sector_number += seg[i].nsec;
> @@ -1393,6 +1395,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
> 		bio->bi_private = pending_req;
> 		bio->bi_end_io  = end_block_io_op;
> 		bio_set_op_attrs(bio, operation, operation_flags);
> +		bio_set_prio(bio, get_current_ioprio());
> 	}
> 
> 	atomic_set(&pending_req->pendcnt, nbio);
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 399cad7daae7..1a4e3b0e98ad 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -33,6 +33,7 @@
> #include <linux/sysfs.h>
> #include <linux/debugfs.h>
> #include <linux/cpuhotplug.h>
> +#include <linux/ioprio.h>
> 
> #include "zram_drv.h"
> 
> @@ -596,6 +597,7 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
> 
> 	bio->bi_iter.bi_sector = entry * (PAGE_SIZE >> 9);
> 	bio_set_dev(bio, zram->bdev);
> +	bio_set_prio(bio, get_current_ioprio());
> 	if (!bio_add_page(bio, bvec->bv_page, bvec->bv_len, bvec->bv_offset)) {
> 		bio_put(bio);
> 		return -EIO;
> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
> index 0b7d5fb4548d..2b866744545e 100644
> --- a/drivers/lightnvm/pblk-read.c
> +++ b/drivers/lightnvm/pblk-read.c
> @@ -16,6 +16,7 @@
>  * pblk-read.c - pblk's read path
>  */
> 
> +#include <linux/ioprio.h>
> #include "pblk.h"
> 
> /*
> @@ -336,6 +337,7 @@ static int pblk_setup_partial_read(struct pblk *pblk, struct nvm_rq *rqd,
> 
> 	new_bio->bi_iter.bi_sector = 0; /* internal bio */
> 	bio_set_op_attrs(new_bio, REQ_OP_READ, 0);
> +	bio_set_prio(bio, get_current_ioprio());
> 
> 	rqd->bio = new_bio;
> 	rqd->nr_ppas = nr_holes;
> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
> index 6593deab52da..3fdbbff40fde 100644
> --- a/drivers/lightnvm/pblk-write.c
> +++ b/drivers/lightnvm/pblk-write.c
> @@ -628,6 +628,7 @@ static int pblk_submit_write(struct pblk *pblk, int *secs_left)
> 
> 	bio->bi_iter.bi_sector = 0; /* internal bio */
> 	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
> +	bio_set_prio(bio, get_current_ioprio());
> 
> 	rqd = pblk_alloc_rqd(pblk, PBLK_WRITE);
> 	rqd->bio = bio;
> 

pblk bits look good to me.

Reviewed-by: Javier González <javier@javigon.com>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT
  2019-05-01  4:28 ` [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT Chaitanya Kulkarni
@ 2019-05-01 12:31   ` Christoph Hellwig
  2019-05-01 12:56     ` Jeff Moyer
  2019-05-02  3:49     ` Chaitanya Kulkarni
  0 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2019-05-01 12:31 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-block

On Tue, Apr 30, 2019 at 09:28:15PM -0700, Chaitanya Kulkarni wrote:
> @@ -104,7 +120,12 @@ struct blk_io_trace {
>  	__u64 time;		/* in nanoseconds */
>  	__u64 sector;		/* disk offset */
>  	__u32 bytes;		/* transfer length */
> +
> +#ifdef CONFIG_BLKTRACE_EXT
> +	__u64 action;		/* what happened */
> +#else
>  	__u32 action;		/* what happened */
> +#endif /* CONFIG_BLKTRACE_EXT */

You can't use CONFIG_ symbols in UAPI headers, as userspace
applications won't set them.  You also can't ever change the layout of an
existing structure in UAPI headers in a way that is not backward compatible.
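
A minimal sketch of the layout point above, assuming a userspace copy of
the record with fixed-width fields in the same order as struct
blk_io_trace. The names trace_v1, trace_wide and layout_changed are
hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the existing record layout (32-bit action). */
struct trace_v1 {
	uint32_t magic;
	uint32_t sequence;
	uint64_t time;
	uint64_t sector;
	uint32_t bytes;
	uint32_t action;
	uint32_t pid;
	uint32_t device;
	uint32_t cpu;
	uint16_t error;
	uint16_t pdu_len;
};

/* Same record with action widened to 64 bits, as the #ifdef would do. */
struct trace_wide {
	uint32_t magic;
	uint32_t sequence;
	uint64_t time;
	uint64_t sector;
	uint32_t bytes;
	uint64_t action;
	uint32_t pid;
	uint32_t device;
	uint32_t cpu;
	uint16_t error;
	uint16_t pdu_len;
};

/* Nonzero if a reader built against trace_v1 would misparse trace_wide. */
static int layout_changed(void)
{
	/* Fields before action are untouched, so bytes stays where it was. */
	assert(offsetof(struct trace_v1, bytes) ==
	       offsetof(struct trace_wide, bytes));

	return sizeof(struct trace_v1) != sizeof(struct trace_wide) ||
	       offsetof(struct trace_v1, pid) != offsetof(struct trace_wide, pid);
}
```

Every field after action moves to a later offset once action is widened,
so a tool compiled against the old header would read the wrong bytes
unless it happened to be built with the same config choice as the kernel.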

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT
  2019-05-01 12:31   ` Christoph Hellwig
@ 2019-05-01 12:56     ` Jeff Moyer
  2019-05-02  3:48       ` Chaitanya Kulkarni
  2019-05-02  3:49     ` Chaitanya Kulkarni
  1 sibling, 1 reply; 27+ messages in thread
From: Jeff Moyer @ 2019-05-01 12:56 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Christoph Hellwig; +Cc: linux-block

Christoph Hellwig <hch@infradead.org> writes:

> On Tue, Apr 30, 2019 at 09:28:15PM -0700, Chaitanya Kulkarni wrote:
>> @@ -104,7 +120,12 @@ struct blk_io_trace {
>>  	__u64 time;		/* in nanoseconds */
>>  	__u64 sector;		/* disk offset */
>>  	__u32 bytes;		/* transfer length */
>> +
>> +#ifdef CONFIG_BLKTRACE_EXT
>> +	__u64 action;		/* what happened */
>> +#else
>>  	__u32 action;		/* what happened */
>> +#endif /* CONFIG_BLKTRACE_EXT */
>
> You can't use CONFIG_ symbols in UAPI headers, as userspace
> applications won't set them.  You also can't ever change the layout of an
> existing structure in UAPI headers in a way that is not backward compatible.

Right.  The blk_io_trace->magic has the lower 8 bits reserved for a
version number which is checked by userspace.  There's no way to
negotiate a supported version between userspace and the kernel,
unfortunately.  The version number is checked for each trace event.

What you *could* do is to add another trace event with a higher version
number that includes only the extra data.  So each event would be split
into two: the original event with original content and the new event
that only contains the new fields.  That way the old userspace would
continue to work, as it would discard the trace events it doesn't
recognize.  Newer userspace could handle both types of events, and merge
them back together.

There would be a ton of warnings spewed on stderr, unfortunately, but it
would at least work.  I don't see a lot of value in the kernel config
option, no matter which way we go with this.

-Jeff
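
The split-event idea above could be sketched roughly as follows. This is
only an illustration under stated assumptions: the magic and base version
values mirror those in blktrace_api.h as far as I know and should be
checked against the header, while TRACE_VERSION_EXT, the ext_event
fields, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLK_IO_TRACE_MAGIC	0x65617400u
#define BLK_IO_TRACE_VERSION	0x07u
/* Hypothetical version for records that carry only the extra fields. */
#define TRACE_VERSION_EXT	0x08u

enum record_kind {
	RECORD_BAD_MAGIC,
	RECORD_BASE,		/* original event, original content */
	RECORD_EXT,		/* new event with only the new fields */
	RECORD_UNKNOWN_VERSION,	/* what an old tool sees for RECORD_EXT */
};

/* The lower 8 bits of magic hold the version; the rest is the signature. */
static enum record_kind classify_record(uint32_t magic)
{
	if ((magic & 0xffffff00u) != BLK_IO_TRACE_MAGIC)
		return RECORD_BAD_MAGIC;

	switch (magic & 0xffu) {
	case BLK_IO_TRACE_VERSION:
		return RECORD_BASE;
	case TRACE_VERSION_EXT:
		return RECORD_EXT;
	default:
		return RECORD_UNKNOWN_VERSION;
	}
}

/* Hypothetical minimal views of the two halves of one logical event. */
struct base_event { uint32_t sequence; uint32_t action; };
struct ext_event { uint32_t sequence; uint32_t action_hi; uint16_t ioprio; };
struct merged_event { uint32_t sequence; uint64_t action; uint16_t ioprio; };

/* Newer userspace pairs the halves by sequence number and rejoins them. */
static struct merged_event merge_events(const struct base_event *b,
					const struct ext_event *e)
{
	struct merged_event m;

	assert(b->sequence == e->sequence);
	m.sequence = b->sequence;
	m.action = ((uint64_t)e->action_hi << 32) | b->action;
	m.ioprio = e->ioprio;
	return m;
}
```

An old tool would only know the base version, so it would land in the
RECORD_UNKNOWN_VERSION case for every extension record and skip it, which
is where the stderr warnings would come from.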

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 01/18] blktrace: increase the size of action mask
  2019-05-01  4:28 ` [RFC PATCH 01/18] blktrace: increase the size of action mask Chaitanya Kulkarni
@ 2019-05-01 15:48   ` Bart Van Assche
  2019-05-02  3:43     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 27+ messages in thread
From: Bart Van Assche @ 2019-05-01 15:48 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-block

On Tue, 2019-04-30 at 21:28 -0700, Chaitanya Kulkarni wrote:
> -#define BLKTRACESETUP32 _IOWR(0x12, 115, struct compat_blk_user_trace_setup)
> +
> +/* XXX: temp work around for RFC */
> +#define BLKTRACESETUP32 _IOWR(0x13, 115, struct compat_blk_user_trace_setup)

This change breaks user space, so it is not acceptable. I think you
want to introduce a new ioctl instead of modifying an existing ioctl.
Additionally, have you considered splitting the blktrace_api.h header file
into two header files: one with kernel-internal definitions and a second one
with definitions that are shared with user space (include/uapi/...)?

Thanks,

Bart.
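
As a rough sketch of the "new ioctl" route: the existing command keeps
its number and argument layout, and the extension gets a separate command
with its own argument struct. The struct names, the extended fields, the
TRACESETUP_EXT name and especially the number 200 are placeholders made
up for this example, not an actual allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Illustrative copy of the existing setup argument. */
struct trace_setup {
	char name[32];
	uint16_t act_mask;
	uint32_t buf_size;
	uint32_t buf_nr;
	uint64_t start_lba;
	uint64_t end_lba;
	uint32_t pid;
};

/* Hypothetical extended argument: wider action mask plus a priority mask. */
struct trace_setup_ext {
	char name[32];
	uint64_t act_mask;
	uint32_t prio_mask;
	uint32_t buf_size;
	uint32_t buf_nr;
	uint32_t pid;
	uint64_t start_lba;
	uint64_t end_lba;
};

#define TRACE_SETUP_NR		115	/* number BLKTRACESETUP uses today */
#define TRACE_SETUP_EXT_NR	200	/* placeholder, not an allocated number */

#define TRACESETUP	_IOWR(0x12, TRACE_SETUP_NR, struct trace_setup)
#define TRACESETUP_EXT	_IOWR(0x12, TRACE_SETUP_EXT_NR, struct trace_setup_ext)

/*
 * Dispatch on the command: old binaries keep hitting the legacy case,
 * new tools opt in to the extended one. Returns 0 on a known command
 * and -1 otherwise (a real handler would return -ENOTTY).
 */
static int handle_setup(unsigned long cmd, int *uses_ext)
{
	assert(uses_ext != NULL);

	switch (cmd) {
	case TRACESETUP:
		*uses_ext = 0;
		return 0;
	case TRACESETUP_EXT:
		*uses_ext = 1;
		return 0;
	default:
		return -1;
	}
}
```

Because the legacy command value is untouched, existing blktrace binaries
keep working, and a new tool can fall back to the legacy command when the
extended one is rejected.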

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 01/18] blktrace: increase the size of action mask
  2019-05-01 15:48   ` Bart Van Assche
@ 2019-05-02  3:43     ` Chaitanya Kulkarni
  2019-05-02 15:12       ` Bart Van Assche
  0 siblings, 1 reply; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-02  3:43 UTC (permalink / raw)
  To: Bart Van Assche, linux-block

On 5/1/19 8:48 AM, Bart Van Assche wrote:
> On Tue, 2019-04-30 at 21:28 -0700, Chaitanya Kulkarni wrote:
>> -#define BLKTRACESETUP32 _IOWR(0x12, 115, struct compat_blk_user_trace_setup)
>> +
>> +/* XXX: temp work around for RFC */
>> +#define BLKTRACESETUP32 _IOWR(0x13, 115, struct compat_blk_user_trace_setup)
> 
> This change breaks user space, so it is not acceptable. I think you
> want to introduce a new ioctl instead of modifying an existing ioctl.
> Additionally, have you considered splitting the blktrace_api.h header file
> into two header files: one with kernel-internal definitions and a second one
> with definitions that are shared with user space (include/uapi/...)?
> 
> Thanks,
> 
> Bart.
> 

I want to avoid modifying an existing ioctl, so I'll add a new ioctl,
update the tools to use the extension ioctl, and split the header file.
I also found that the user space tools replicate the BLK_XX_XXX
definitions. Would it be okay to keep all of those in one place and
include them from the appropriate header file?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT
  2019-05-01 12:56     ` Jeff Moyer
@ 2019-05-02  3:48       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-02  3:48 UTC (permalink / raw)
  To: Jeff Moyer, Christoph Hellwig; +Cc: linux-block

Thanks for the reply Jeff.

On 5/1/19 5:56 AM, Jeff Moyer wrote:
> Christoph Hellwig <hch@infradead.org> writes:
> 
>> On Tue, Apr 30, 2019 at 09:28:15PM -0700, Chaitanya Kulkarni wrote:
>>> @@ -104,7 +120,12 @@ struct blk_io_trace {
>>>   	__u64 time;		/* in nanoseconds */
>>>   	__u64 sector;		/* disk offset */
>>>   	__u32 bytes;		/* transfer length */
>>> +
>>> +#ifdef CONFIG_BLKTRACE_EXT
>>> +	__u64 action;		/* what happened */
>>> +#else
>>>   	__u32 action;		/* what happened */
>>> +#endif /* CONFIG_BLKTRACE_EXT */
>>
>> You can't use CONFIG_ symbols in UAPI headers, as userspace
>> applications won't set them.  You also can't ever change the layout of an
>> existing structure in UAPI headers in a way that is not backward compatible.
> 
> Right.  The blk_io_trace->magic has the lower 8 bits reserved for a
> version number which is checked by userspace.  There's no way to
> negotiate a supported version between userspace and the kernel,
> unfortunately.  The version number is checked for each trace event.
> 
> What you *could* do is to add another trace event with a higher version
> number that includes only the extra data.  So each event would be split
> into two: the original event with original content and the new event
> that only contains the new fields.  That way the old userspace would
> continue to work, as it would discard the trace events it doesn't
> recognize.  Newer userspace could handle both types of events, and merge
> them back together.
> 
> There would be a ton of warnings spewed on stderr, unfortunately, but it
> would at least work.  I don't see a lot of value in the kernel config
> option, no matter which way we go with this.
>
As you mentioned, this approach will produce a lot of output on stderr,
which is the scenario I was trying to avoid. If everyone is okay with
this, I will make this change and resend the series.

> -Jeff
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT
  2019-05-01 12:31   ` Christoph Hellwig
  2019-05-01 12:56     ` Jeff Moyer
@ 2019-05-02  3:49     ` Chaitanya Kulkarni
  1 sibling, 0 replies; 27+ messages in thread
From: Chaitanya Kulkarni @ 2019-05-02  3:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

Thanks for looking into this.

On 5/1/19 5:31 AM, Christoph Hellwig wrote:
> On Tue, Apr 30, 2019 at 09:28:15PM -0700, Chaitanya Kulkarni wrote:
>> @@ -104,7 +120,12 @@ struct blk_io_trace {
>>   	__u64 time;		/* in nanoseconds */
>>   	__u64 sector;		/* disk offset */
>>   	__u32 bytes;		/* transfer length */
>> +
>> +#ifdef CONFIG_BLKTRACE_EXT
>> +	__u64 action;		/* what happened */
>> +#else
>>   	__u32 action;		/* what happened */
>> +#endif /* CONFIG_BLKTRACE_EXT */
> 
> You can't use CONFIG_ symbols in UAPI headers, as userspace
> applications won't set them.  You also can't ever change the layout of an
> existing structure in UAPI headers in a way that is not backward compatible.
> 
Jeff has suggested another approach. If everyone is okay with it, I
will send out the series with that change.

Please let me know if you have more comments.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 01/18] blktrace: increase the size of action mask
  2019-05-02  3:43     ` Chaitanya Kulkarni
@ 2019-05-02 15:12       ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2019-05-02 15:12 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-block

On Thu, 2019-05-02 at 03:43 +0000, Chaitanya Kulkarni wrote:
> On 5/1/19 8:48 AM, Bart Van Assche wrote:
> > On Tue, 2019-04-30 at 21:28 -0700, Chaitanya Kulkarni wrote:
> > > -#define BLKTRACESETUP32 _IOWR(0x12, 115, struct compat_blk_user_trace_setup)
> > > +
> > > +/* XXX: temp work around for RFC */
> > > +#define BLKTRACESETUP32 _IOWR(0x13, 115, struct compat_blk_user_trace_setup)
> > 
> > This change breaks user space, so it is not acceptable. I think you
> > want to introduce a new ioctl instead of modifying an existing ioctl.
> > Additionally, have you considered splitting the blktrace_api.h header file
> > into two header files: one with kernel-internal definitions and a second one
> > with definitions that are shared with user space (include/uapi/...)?
> 
> I want to avoid modifying an existing ioctl, so I'll add a new ioctl,
> update the tools to use the extension ioctl, and split the header file.
> I also found that the user space tools replicate the BLK_XX_XXX
> definitions. Would it be okay to keep all of those in one place and
> include them from the appropriate header file?

Hi Chaitanya,

I think all definitions that are relevant for the user space blktrace tool
should be moved into a header file under include/uapi/linux. I'm not sure
what the best strategy is to include that header file in the blktrace tool.
Another project that interfaces with the kernel (rdma-core; see also
https://github.com/linux-rdma/rdma-core/) periodically copies kernel header
files into its own source code repository.

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-05-02 15:12 UTC | newest]

Thread overview: 27+ messages
2019-05-01  4:28 [RFC PATCH 00/18] blktrace: add blktrace extension support Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 01/18] blktrace: increase the size of action mask Chaitanya Kulkarni
2019-05-01 15:48   ` Bart Van Assche
2019-05-02  3:43     ` Chaitanya Kulkarni
2019-05-02 15:12       ` Bart Van Assche
2019-05-01  4:28 ` [RFC PATCH 02/18] blktrace: add more definitions for BLK_TC_ACT Chaitanya Kulkarni
2019-05-01 12:31   ` Christoph Hellwig
2019-05-01 12:56     ` Jeff Moyer
2019-05-02  3:48       ` Chaitanya Kulkarni
2019-05-02  3:49     ` Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 03/18] blktrace: update trace to track more actions Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 04/18] kernel/trace: add KConfig to enable blktrace_ext Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 05/18] blktrace: add iopriority mask Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 06/18] " Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 07/18] blktrace: allow user to track iopriority Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 08/18] blktrace: add sysfs ioprio mask Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 09/18] blktrace: add debug support for extension Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 10/18] block: set ioprio for write-zeroes, discard etc Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 11/18] block: set ioprio for zone-reset Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 12/18] block: set ioprio for flush bio Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 13/18] drivers: set bio iopriority field Chaitanya Kulkarni
2019-05-01  6:23   ` Javier González
2019-05-01  4:28 ` [RFC PATCH 14/18] fs: " Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 15/18] power/swap: " Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 16/18] mm: " Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 17/18] null_blk: add write-zeroes flag to nullb_device Chaitanya Kulkarni
2019-05-01  4:28 ` [RFC PATCH 18/18] null_blk: add module param discard/write-zeroes Chaitanya Kulkarni