All of lore.kernel.org
 help / color / mirror / Atom feed
From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: "axboe@kernel.dk" <axboe@kernel.dk>
Cc: "fio@vger.kernel.org" <fio@vger.kernel.org>,
	Damien Le Moal <Damien.LeMoal@wdc.com>,
	Niklas Cassel <Niklas.Cassel@wdc.com>
Subject: [PATCH v2 08/11] libaio,io_uring: introduce cmdprio_bssplit
Date: Fri, 3 Sep 2021 15:20:25 +0000	[thread overview]
Message-ID: <20210903152012.18035-9-Niklas.Cassel@wdc.com> (raw)
In-Reply-To: <20210903152012.18035-1-Niklas.Cassel@wdc.com>

From: Damien Le Moal <damien.lemoal@wdc.com>

The cmdprio_percentage, cmdprio_class and cmdprio options allow
specifying different values for read and write operations. This enables
various IO priority issuing patterns even uner a mixed read-write
workload but does not allow differentiation within read and write
I/O operation types with different sizes when the bssplit option is
used.

Introduce the cmdprio_bssplit option to complement the use of the
bssplit option.  This new option has the same format as the bssplit
option, but the percentage values indicate the percentage of I/O
operations with a particular block size that must be issued with the
priority class and value specified by cmdprio_class and cmdprio.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
 HOWTO                        |  29 ++++++---
 engines/cmdprio.h            | 113 ++++++++++++++++++++++++++++++++++-
 engines/io_uring.c           |  29 ++++++++-
 engines/libaio.c             |  29 ++++++++-
 fio.1                        |  34 +++++++----
 tools/fiograph/fiograph.conf |   4 +-
 6 files changed, 210 insertions(+), 28 deletions(-)

diff --git a/HOWTO b/HOWTO
index 8b7d4957..1853f56a 100644
--- a/HOWTO
+++ b/HOWTO
@@ -2175,23 +2175,38 @@ with the caveat that when used on the command line, they must come after the
 .. option:: cmdprio_class=int[,int] : [io_uring] [libaio]
 
 	Set the I/O priority class to use for I/Os that must be issued with
-	a priority when :option:`cmdprio_percentage` is set. If not specified
-	when :option:`cmdprio_percentage` is set, this defaults to the highest
-	priority class. A single value applies to reads and writes.
-	Comma-separated values may be specified for reads and writes. See
-	:manpage:`ionice(1)`. See also the :option:`prioclass` option.
+	a priority when :option:`cmdprio_percentage` or
+	:option:`cmdprio_bssplit` is set. If not specified when
+	:option:`cmdprio_percentage` or :option:`cmdprio_bssplit` is set,
+	this defaults to the highest priority class. A single value applies
+	to reads and writes. Comma-separated values may be specified for
+	reads and writes. See :manpage:`ionice(1)`. See also the
+	:option:`prioclass` option.
 
 .. option:: cmdprio=int[,int] : [io_uring] [libaio]
 
 	Set the I/O priority value to use for I/Os that must be issued with
-	a priority when :option:`cmdprio_percentage` is set. If not specified
-	when :option:`cmdprio_percentage` is set, this defaults to 0.
+	a priority when :option:`cmdprio_percentage` or
+	:option:`cmdprio_bssplit` is set. If not specified when
+	:option:`cmdprio_percentage` or :option:`cmdprio_bssplit` is set,
+	this defaults to 0.
 	Linux limits us to a positive value between 0 and 7, with 0 being the
 	highest. A single value applies to reads and writes. Comma-separated
 	values may be specified for reads and writes. See :manpage:`ionice(1)`.
 	Refer to an appropriate manpage for other operating systems since
 	meaning of priority may differ. See also the :option:`prio` option.
 
+.. option:: cmdprio_bssplit=str[,str] : [io_uring] [libaio]
+	To get a finer control over I/O priority, this option allows
+	specifying the percentage of IOs that must have a priority set
+	depending on the block size of the IO. This option is useful only
+	when used together with the :option:`bssplit` option, that is,
+	multiple different block sizes are used for reads and writes.
+	The format for this option is the same as the format of the
+	:option:`bssplit` option, with the exception that values for
+	trim IOs are ignored. This option is mutually exclusive with the
+	:option:`cmdprio_percentage` option.
+
 .. option:: fixedbufs : [io_uring]
 
     If fio is asked to do direct IO, then Linux will map pages for each
diff --git a/engines/cmdprio.h b/engines/cmdprio.h
index e3b42182..8acdb0b3 100644
--- a/engines/cmdprio.h
+++ b/engines/cmdprio.h
@@ -12,18 +12,106 @@ struct cmdprio {
 	unsigned int percentage[DDIR_RWDIR_CNT];
 	unsigned int class[DDIR_RWDIR_CNT];
 	unsigned int level[DDIR_RWDIR_CNT];
+	unsigned int bssplit_nr[DDIR_RWDIR_CNT];
+	struct bssplit *bssplit[DDIR_RWDIR_CNT];
 };
 
+static int fio_cmdprio_bssplit_ddir(struct thread_options *to, void *cb_arg,
+				    enum fio_ddir ddir, char *str, bool data)
+{
+	struct cmdprio *cmdprio = cb_arg;
+	struct split split;
+	unsigned int i;
+
+	if (ddir == DDIR_TRIM)
+		return 0;
+
+	memset(&split, 0, sizeof(split));
+
+	if (split_parse_ddir(to, &split, str, data, BSSPLIT_MAX))
+		return 1;
+	if (!split.nr)
+		return 0;
+
+	cmdprio->bssplit_nr[ddir] = split.nr;
+	cmdprio->bssplit[ddir] = malloc(split.nr * sizeof(struct bssplit));
+	if (!cmdprio->bssplit[ddir])
+		return 1;
+
+	for (i = 0; i < split.nr; i++) {
+		cmdprio->bssplit[ddir][i].bs = split.val1[i];
+		if (split.val2[i] == -1U) {
+			cmdprio->bssplit[ddir][i].perc = 0;
+		} else {
+			if (split.val2[i] > 100)
+				cmdprio->bssplit[ddir][i].perc = 100;
+			else
+				cmdprio->bssplit[ddir][i].perc = split.val2[i];
+		}
+	}
+
+	return 0;
+}
+
+static int fio_cmdprio_bssplit_parse(struct thread_data *td, const char *input,
+				     struct cmdprio *cmdprio)
+{
+	char *str, *p;
+	int i, ret = 0;
+
+	p = str = strdup(input);
+
+	strip_blank_front(&str);
+	strip_blank_end(str);
+
+	ret = str_split_parse(td, str, fio_cmdprio_bssplit_ddir, cmdprio, false);
+
+	if (parse_dryrun()) {
+		for (i = 0; i < DDIR_RWDIR_CNT; i++) {
+			free(cmdprio->bssplit[i]);
+			cmdprio->bssplit[i] = NULL;
+			cmdprio->bssplit_nr[i] = 0;
+		}
+	}
+
+	free(p);
+	return ret;
+}
+
+static inline int fio_cmdprio_percentage(struct cmdprio *cmdprio,
+					 struct io_u *io_u)
+{
+	enum fio_ddir ddir = io_u->ddir;
+	unsigned int p = cmdprio->percentage[ddir];
+	int i;
+
+	/*
+	 * If cmdprio_percentage option was specified, then use that
+	 * percentage. Otherwise, use cmdprio_bssplit percentages depending
+	 * on the IO size.
+	 */
+	if (p)
+		return p;
+
+	for (i = 0; i < cmdprio->bssplit_nr[ddir]; i++) {
+		if (cmdprio->bssplit[ddir][i].bs == io_u->buflen)
+			return cmdprio->bssplit[ddir][i].perc;
+	}
+
+	return 0;
+}
+
 static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 			    bool *has_cmdprio)
 {
 	struct thread_options *to = &td->o;
 	bool has_cmdprio_percentage = false;
+	bool has_cmdprio_bssplit = false;
 	int i;
 
 	/*
-	 * If cmdprio_percentage is set and cmdprio_class is not set,
-	 * default to RT priority class.
+	 * If cmdprio_percentage/cmdprio_bssplit is set and cmdprio_class
+	 * is not set, default to RT priority class.
 	 */
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
 		if (cmdprio->percentage[i]) {
@@ -31,6 +119,11 @@ static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 				cmdprio->class[i] = IOPRIO_CLASS_RT;
 			has_cmdprio_percentage = true;
 		}
+		if (cmdprio->bssplit_nr[i]) {
+			if (!cmdprio->class[i])
+				cmdprio->class[i] = IOPRIO_CLASS_RT;
+			has_cmdprio_bssplit = true;
+		}
 	}
 
 	/*
@@ -44,8 +137,22 @@ static int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
 			to->name);
 		return 1;
 	}
+	if (has_cmdprio_bssplit &&
+	    (fio_option_is_set(to, ioprio) ||
+	     fio_option_is_set(to, ioprio_class))) {
+		log_err("%s: cmdprio_bssplit option and mutually exclusive "
+			"prio or prioclass option is set, exiting\n",
+			to->name);
+		return 1;
+	}
+	if (has_cmdprio_percentage && has_cmdprio_bssplit) {
+		log_err("%s: cmdprio_percentage and cmdprio_bssplit options "
+			"are mutually exclusive\n",
+			to->name);
+		return 1;
+	}
 
-	*has_cmdprio = has_cmdprio_percentage;
+	*has_cmdprio = has_cmdprio_percentage || has_cmdprio_bssplit;
 
 	return 0;
 }
diff --git a/engines/io_uring.c b/engines/io_uring.c
index 1591ee4e..57124d22 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -75,7 +75,7 @@ struct ioring_data {
 };
 
 struct ioring_options {
-	void *pad;
+	struct thread_data *td;
 	unsigned int hipri;
 	struct cmdprio cmdprio;
 	unsigned int fixedbufs;
@@ -108,6 +108,15 @@ static int fio_ioring_sqpoll_cb(void *data, unsigned long long *val)
 	return 0;
 }
 
+static int str_cmdprio_bssplit_cb(void *data, const char *input)
+{
+	struct ioring_options *o = data;
+	struct thread_data *td = o->td;
+	struct cmdprio *cmdprio = &o->cmdprio;
+
+	return fio_cmdprio_bssplit_parse(td, input, cmdprio);
+}
+
 static struct fio_option options[] = {
 	{
 		.name	= "hipri",
@@ -163,6 +172,16 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group	= FIO_OPT_G_IOURING,
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type   = FIO_OPT_STR_ULL,
+		.cb     = str_cmdprio_bssplit_cb,
+		.off1   = offsetof(struct ioring_options, cmdprio.bssplit),
+		.help   = "Set priority percentages for different block sizes",
+		.category = FIO_OPT_C_ENGINE,
+		.group	= FIO_OPT_G_IOURING,
+	},
 #else
 	{
 		.name	= "cmdprio_percentage",
@@ -182,6 +201,12 @@ static struct fio_option options[] = {
 		.type	= FIO_OPT_UNSUPPORTED,
 		.help	= "Your platform does not support I/O priority classes",
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type	= FIO_OPT_UNSUPPORTED,
+		.help	= "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name	= "fixedbufs",
@@ -432,7 +457,7 @@ static void fio_ioring_prio_prep(struct thread_data *td, struct io_u *io_u)
 	struct io_uring_sqe *sqe = &ld->sqes[io_u->index];
 	struct cmdprio *cmdprio = &o->cmdprio;
 	enum fio_ddir ddir = io_u->ddir;
-	unsigned int p = cmdprio->percentage[ddir];
+	unsigned int p = fio_cmdprio_percentage(cmdprio, io_u);
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
 		sqe->ioprio =
diff --git a/engines/libaio.c b/engines/libaio.c
index 8b965fe2..9fba3b12 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -56,12 +56,21 @@ struct libaio_data {
 };
 
 struct libaio_options {
-	void *pad;
+	struct thread_data *td;
 	unsigned int userspace_reap;
 	struct cmdprio cmdprio;
 	unsigned int nowait;
 };
 
+static int str_cmdprio_bssplit_cb(void *data, const char *input)
+{
+	struct libaio_options *o = data;
+	struct thread_data *td = o->td;
+	struct cmdprio *cmdprio = &o->cmdprio;
+
+	return fio_cmdprio_bssplit_parse(td, input, cmdprio);
+}
+
 static struct fio_option options[] = {
 	{
 		.name	= "userspace_reap",
@@ -117,6 +126,16 @@ static struct fio_option options[] = {
 		.category = FIO_OPT_C_ENGINE,
 		.group	= FIO_OPT_G_LIBAIO,
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type   = FIO_OPT_STR_ULL,
+		.cb     = str_cmdprio_bssplit_cb,
+		.off1   = offsetof(struct libaio_options, cmdprio.bssplit),
+		.help   = "Set priority percentages for different block sizes",
+		.category = FIO_OPT_C_ENGINE,
+		.group	= FIO_OPT_G_LIBAIO,
+	},
 #else
 	{
 		.name	= "cmdprio_percentage",
@@ -136,6 +155,12 @@ static struct fio_option options[] = {
 		.type	= FIO_OPT_UNSUPPORTED,
 		.help	= "Your platform does not support I/O priority classes",
 	},
+	{
+		.name   = "cmdprio_bssplit",
+		.lname  = "Priority percentage block size split",
+		.type	= FIO_OPT_UNSUPPORTED,
+		.help	= "Your platform does not support I/O priority classes",
+	},
 #endif
 	{
 		.name	= "nowait",
@@ -185,7 +210,7 @@ static void fio_libaio_prio_prep(struct thread_data *td, struct io_u *io_u)
 	struct libaio_options *o = td->eo;
 	struct cmdprio *cmdprio = &o->cmdprio;
 	enum fio_ddir ddir = io_u->ddir;
-	unsigned int p = cmdprio->percentage[ddir];
+	unsigned int p = fio_cmdprio_percentage(cmdprio, io_u);
 
 	if (p && rand_between(&td->prio_state, 0, 99) < p) {
 		io_u->iocb.aio_reqprio =
diff --git a/fio.1 b/fio.1
index 09b97de3..415a91bb 100644
--- a/fio.1
+++ b/fio.1
@@ -1972,21 +1972,31 @@ used. fio must also be run as the root user.
 .TP
 .BI (io_uring,libaio)cmdprio_class \fR=\fPint[,int]
 Set the I/O priority class to use for I/Os that must be issued with a
-priority when \fBcmdprio_percentage\fR is set. If not specified when
-\fBcmdprio_percentage\fR is set, this defaults to the highest priority
-class. A single value applies to reads and writes. Comma-separated
-values may be specified for reads and writes. See man \fBionice\fR\|(1).
-See also the \fBprioclass\fR option.
+priority when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR is set.
+If not specified when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR
+is set, this defaults to the highest priority class. A single value applies
+to reads and writes. Comma-separated values may be specified for reads and
+writes. See man \fBionice\fR\|(1). See also the \fBprioclass\fR option.
 .TP
 .BI (io_uring,libaio)cmdprio \fR=\fPint[,int]
 Set the I/O priority value to use for I/Os that must be issued with a
-priority when \fBcmdprio_percentage\fR is set. If not specified when
-\fBcmdprio_percentage\fR is set, this defaults to 0. Linux limits us to
-a positive value between 0 and 7, with 0 being the highest. A single
-value applies to reads and writes. Comma-separated values may be specified
-for reads and writes. See man \fBionice\fR\|(1). Refer to an appropriate
-manpage for other operating systems since the meaning of priority may differ.
-See also the \fBprio\fR option.
+priority when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR is set.
+If not specified when \fBcmdprio_percentage\fR or \fBcmdprio_bssplit\fR
+is set, this defaults to 0. Linux limits us to a positive value between
+0 and 7, with 0 being the highest. A single value applies to reads and writes.
+Comma-separated values may be specified for reads and writes. See man
+\fBionice\fR\|(1). Refer to an appropriate manpage for other operating systems
+since the meaning of priority may differ. See also the \fBprio\fR option.
+.TP
+.BI (io_uring,libaio)cmdprio_bssplit \fR=\fPstr[,str]
+To get a finer control over I/O priority, this option allows specifying
+the percentage of IOs that must have a priority set depending on the block
+size of the IO. This option is useful only when used together with the option
+\fBbssplit\fR, that is, multiple different block sizes are used for reads and
+writes. The format for this option is the same as the format of the
+\fBbssplit\fR option, with the exception that values for trim IOs are
+ignored. This option is mutually exclusive with the \fBcmdprio_percentage\fR
+option.
 .TP
 .BI (io_uring)fixedbufs
 If fio is asked to do direct IO, then Linux will map pages for each IO call, and
diff --git a/tools/fiograph/fiograph.conf b/tools/fiograph/fiograph.conf
index 5ba59c52..cfd2fd8e 100644
--- a/tools/fiograph/fiograph.conf
+++ b/tools/fiograph/fiograph.conf
@@ -51,10 +51,10 @@ specific_options=https  http_host  http_user  http_pass  http_s3_key  http_s3_ke
 specific_options=ime_psync  ime_psyncv
 
 [ioengine_io_uring]
-specific_options=hipri  cmdprio_percentage  cmdprio_class  cmdprio  fixedbufs  registerfiles  sqthread_poll  sqthread_poll_cpu  nonvectored  uncached  nowait  force_async
+specific_options=hipri  cmdprio_percentage  cmdprio_class  cmdprio  cmdprio_bssplit  fixedbufs  registerfiles  sqthread_poll  sqthread_poll_cpu  nonvectored  uncached  nowait  force_async
 
 [ioengine_libaio]
-specific_options=userspace_reap  cmdprio_percentage  cmdprio_class  cmdprio  nowait
+specific_options=userspace_reap  cmdprio_percentage  cmdprio_class  cmdprio  cmdprio_bssplit  nowait
 
 [ioengine_libcufile]
 specific_options=gpu_dev_ids  cuda_io
-- 
2.31.1


  parent reply	other threads:[~2021-09-03 15:20 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 15:20 [PATCH v2 00/11] Improve io_uring and libaio IO priority support Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 01/11] manpage: fix formatting Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 02/11] manpage: fix definition of prio and prioclass options Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 03/11] tools: fiograph: do not overwrite input script file Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 04/11] os: introduce ioprio_value() helper Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 05/11] options: make parsing functions available to ioengines Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 07/11] libaio,io_uring: introduce cmdprio_class and cmdprio options Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 06/11] libaio,io_uring: improve cmdprio_percentage option Niklas Cassel
2021-09-03 15:20 ` Niklas Cassel [this message]
2021-09-03 15:20 ` [PATCH v2 09/11] libaio,io_uring: relax cmdprio_percentage constraints Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 10/11] fio: Introduce the log_prio option Niklas Cassel
2021-09-03 15:20 ` [PATCH v2 11/11] examples: add examples for cmdprio_* IO priority options Niklas Cassel
2021-09-03 16:12 ` [PATCH v2 00/11] Improve io_uring and libaio IO priority support Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210903152012.18035-9-Niklas.Cassel@wdc.com \
    --to=niklas.cassel@wdc.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=fio@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.