All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ankit Kumar <ankit.kumar@samsung.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: fio@vger.kernel.org, krish.reddy@samsung.com,
	joshi.k@samsung.com, anuj20.g@samsung.com
Subject: Re: [PATCH 3/7] engines/io_uring: add new I/O engine for uring passthrough support
Date: Mon, 23 May 2022 23:50:03 +0530	[thread overview]
Message-ID: <20220523182003.GA16572@test-zns> (raw)
In-Reply-To: <be576bfa-49e8-aa05-5163-6133e9137695@kernel.dk>

[-- Attachment #1: Type: text/plain, Size: 8094 bytes --]

On Mon, May 23, 2022 at 10:44:50AM -0600, Jens Axboe wrote:
> On 5/23/22 7:10 AM, Ankit Kumar wrote:
> > From: Anuj Gupta <anuj20.g@samsung.com>
> > 
> > Add a new I/O engine (io_uring_cmd) for sending uring passthrough
> > commands. The new I/O engine will be built only if its support is
> > present in the kernel. It will also use most of the existing
> > helpers from the I/O engine io_uring. The I/O preparation,
> > completion, file open, file close and post init paths are going to
> > differ and hence io_uring_cmd will have its own helper for them.
> > 
> > Add a new io_uring_cmd engine specific flag to support nvme
> > passthrough commands. Filename name for this specific option
> > must specify nvme-ns generic character device (dev/ngXnY).
> > This provides io_uring_cmd I/O engine a bandwidth to support
> > various passthrough commands in future.
> > 
> > The engine_pos and engine_data fields in struct fio_file are
> > separated now. This will help I/O engine io_uring_cmd to store
> > specific data as well as keep track of register files.
> > 
> > The supported io_uring_cmd options are:
> >  * registerfiles
> >  * sqthread_poll
> >  * sqthread_poll_cpu
> >  * cmd_type
> 
> This looks way better than the earlier versions.
> 
> > co-authored-By: Anuj Gupta <anuj20.g@samsung.com>
> > co-authored-By: Ankit Kumar <ankit.kumar@samsung.com>
> 
> Since the patch is attributed to Anuj as the author, that co-authored-by
> should be a Signed-off-by from Anuf.
>
Thanks, Will update it
> > diff --git a/engines/io_uring.c b/engines/io_uring.c
> > index 1e15647e..75248624 100644
> > --- a/engines/io_uring.c
> > +++ b/engines/io_uring.c
> > @@ -25,6 +25,17 @@
> >  #include "../os/linux/io_uring.h"
> >  #include "cmdprio.h"
> >  
> > +#ifdef CONFIG_LIBNVME
> > +#include <sys/stat.h>
> > +#include <nvme/ioctl.h>
> > +#endif
> > +
> > +#ifdef CONFIG_URING_CMD
> > +enum uring_cmd_type {
> > +	FIO_URING_CMD_NVME = 1,
> > +};
> > +#endif
> 
> Can you briefly describe why we need the libnvme dependency? What does
> libnvme get us here? Would it be too cumbersome to do it without
> libnvme?
> 
> In general, fio tries to not rely on external libraries unless there's a
> strong reason to do so.
> 
Our idea of having libnvme dependency was ease of maintaing the code
base as libnvme provides all the necessary structures, definitions and
helper functions.
But yes libnvme won't be necessary if we maintain all the nvme
structures and send passthru commands.
> > @@ -270,6 +289,23 @@ static struct fio_option options[] = {
> >  		.category = FIO_OPT_C_ENGINE,
> >  		.group	= FIO_OPT_G_IOURING,
> >  	},
> > +#ifdef CONFIG_URING_CMD
> > +	{
> > +		.name	= "cmd_type",
> > +		.lname	= "Uring cmd type",
> > +		.type	= FIO_OPT_STR,
> > +		.off1	= offsetof(struct ioring_options, cmd_type),
> > +		.help	= "Specify uring-cmd type",
> > +		.posval = {
> > +			  { .ival = "nvme",
> > +			    .oval = FIO_URING_CMD_NVME,
> > +			    .help = "Issue nvme-uring-cmd",
> > +			  },
> > +		},
> > +		.category = FIO_OPT_C_ENGINE,
> > +		.group	= FIO_OPT_G_IOURING,
> > +	},
> > +#endif
> >  	{
> >  		.name	= NULL,
> >  	},
> 
> Options should always be visible regardless of availability, they should
> just error if not available at runtime.
> 
> > @@ -373,6 +409,61 @@ static int fio_ioring_prep(struct thread_data *td, struct io_u *io_u)
> >  	return 0;
> >  }
> >  
> > +#ifdef CONFIG_URING_CMD
> > +static int fio_ioring_cmd_prep(struct thread_data *td, struct io_u *io_u)
> > +{
> > +	struct ioring_data *ld = td->io_ops_data;
> > +	struct ioring_options *o = td->eo;
> > +	struct fio_file *f = io_u->file;
> > +	struct io_uring_sqe *sqe;
> > +
> > +	/* nvme_uring_cmd case */
> > +	if (o->cmd_type == FIO_URING_CMD_NVME) {
> > +#ifdef CONFIG_LIBNVME
> > +		struct nvme_data *data = FILE_ENG_DATA(io_u->file);
> > +		struct nvme_uring_cmd *cmd;
> > +		unsigned long long slba;
> > +		unsigned long long nlb;
> > +
> > +		sqe = &ld->sqes[(io_u->index) << 1];
> > +
> > +		if (o->registerfiles) {
> > +			sqe->fd = f->engine_pos;
> > +			sqe->flags = IOSQE_FIXED_FILE;
> > +		} else {
> > +			sqe->fd = f->fd;
> > +		}
> > +		sqe->opcode = IORING_OP_URING_CMD;
> > +		sqe->user_data = (unsigned long) io_u;
> > +		sqe->cmd_op = NVME_URING_CMD_IO;
> > +
> > +		slba = io_u->offset / data->lba_size;
> > +		nlb = (io_u->xfer_buflen / data->lba_size) - 1;
> > +
> > +		cmd = (struct nvme_uring_cmd *)sqe->cmd;
> > +		memset(cmd, 0, sizeof(struct nvme_uring_cmd));
> > +
> > +		/* cdw10 and cdw11 represent starting lba */
> > +		cmd->cdw10 = slba & 0xffffffff;
> > +		cmd->cdw11 = slba >> 32;
> > +		/* cdw12 represent number of lba's for read/write */
> > +		cmd->cdw12 = nlb;
> > +		cmd->addr = (__u64)io_u->xfer_buf;
> > +		cmd->data_len = io_u->xfer_buflen;
> > +		cmd->nsid = data->nsid;
> > +
> > +		if (io_u->ddir == DDIR_READ)
> > +			cmd->opcode = nvme_cmd_read;
> > +		if (io_u->ddir == DDIR_WRITE)
> > +			cmd->opcode = nvme_cmd_write;
> > +
> > +		return 0;
> > +#endif
> > +	}
> > +	return -EINVAL;
> > +}
> > +#endif
> 
> The nested ifdefs don't do much here for readability. Would be cleaner
> as:
> 
> #ifdef CONFIG_URING_CMD
> static int fio_ioring_cmd_prep(struct thread_data *td, struct io_u *io_u)
> {
> #ifndef CONFIG_LIBNVME
> 	return -EINVAL;
> #else
> 	struct ioring_data *ld = td->io_ops_data;
> 	struct ioring_options *o = td->eo;
> 	struct fio_file *f = io_u->file;
> 	struct io_uring_sqe *sqe;
> 
> 	/* nvme_uring_cmd case */
> 	if (o->cmd_type != FIO_URING_CMD_NVME)
> 		return -EINVAL;
> 
> 	...
> #endif /* CONFIG_LIBNVME */
> }
> #endif /* CONFIG_URING_CMD */
> 
Ack
> > +#ifdef CONFIG_URING_CMD
> > +static int fio_ioring_cmd_queue_init(struct thread_data *td)
> > +{
> > +	struct ioring_data *ld = td->io_ops_data;
> > +	struct ioring_options *o = td->eo;
> > +	int depth = td->o.iodepth;
> > +	struct io_uring_params p;
> > +	int ret;
> > +
> > +	memset(&p, 0, sizeof(p));
> > +
> > +	if (o->hipri && o->cmd_type == FIO_URING_CMD_NVME) {
> > +		log_err("fio: nvme_uring_cmd doesn't support hipri\n");
> > +		return ENOTSUP;
> > +	}
> > +	if (o->hipri)
> > +		p.flags |= IORING_SETUP_IOPOLL;
> > +	if (o->sqpoll_thread) {
> > +		p.flags |= IORING_SETUP_SQPOLL;
> > +		if (o->sqpoll_set) {
> > +			p.flags |= IORING_SETUP_SQ_AFF;
> > +			p.sq_thread_cpu = o->sqpoll_cpu;
> > +		}
> > +	}
> > +	if (o->cmd_type == FIO_URING_CMD_NVME) {
> > +		p.flags |= IORING_SETUP_SQE128;
> > +		p.flags |= IORING_SETUP_CQE32;
> > +	}
> > +
> > +	/*
> > +	 * Clamp CQ ring size at our SQ ring size, we don't need more entries
> > +	 * than that.
> > +	 */
> > +	p.flags |= IORING_SETUP_CQSIZE;
> > +	p.cq_entries = depth;
> > +
> > +retry:
> > +	ret = syscall(__NR_io_uring_setup, depth, &p);
> > +	if (ret < 0) {
> > +		if (errno == EINVAL && p.flags & IORING_SETUP_CQSIZE) {
> > +			p.flags &= ~IORING_SETUP_CQSIZE;
> > +			goto retry;
> > +		}
> > +		return ret;
> > +	}
> > +
> > +	ld->ring_fd = ret;
> > +
> > +	fio_ioring_probe(td);
> > +
> > +	if (o->fixedbufs) {
> > +		log_err("fio: io_uring_cmd doesn't support fixedbufs\n");
> > +		return ENOTSUP;
> > +	}
> > +	return fio_ioring_mmap(ld, &p);
> 
> This is a limitation of the current implementation, which means that
> once the kernel does support that, your fio still won't work until you
> update it.
> 
> In other words, this should just fail at runtime if someone attempts to
> use it.
> 
> In general, cleanup all the ifdef mess, it's all over the place. It
> makes it really hard to maintain, way too easy to make a change that
> works on your setup and breaks if you don't have eg libnvme. Rather than
> do it in the middle of functions, have stubs that just return an error
> if the right options aren't available. That makes the actual functional
> functions easier to read too, rather than having ifdefs sprinkled
> everywhere.
> 
> -- 
> Jens Axboe
> 
> 
Thanks for reviewing it quickly, we will cleanup all the ifdef mess.
We will remove the libnvme dependency and have a separate file
(engines/nvme_uring.c similar to engines/cmdprio.c) for all the
necessary structures and helper functions.

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



  reply	other threads:[~2022-05-23 19:43 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20220523131607epcas5p3dbb26f3b6f62fc8bcbe219304d60a7d1@epcas5p3.samsung.com>
2022-05-23 13:10 ` [PATCH 0/7] add support for uring passthrough commands Ankit Kumar
     [not found]   ` <CGME20220523131611epcas5p44aef6eae876d30b1d92ab4dc4533eaa5@epcas5p4.samsung.com>
2022-05-23 13:10     ` [PATCH 1/7] configure: check if uring_cmd support is present Ankit Kumar
     [not found]   ` <CGME20220523131616epcas5p1a1779b1fdcf24631eea353d97757f063@epcas5p1.samsung.com>
2022-05-23 13:10     ` [PATCH 2/7] io_uring.h: add IORING_SETUP_SQE128 and IORING_SETUP_CQE32 Ankit Kumar
     [not found]   ` <CGME20220523131619epcas5p47af80b04e8015a949e157875f0cb0f07@epcas5p4.samsung.com>
2022-05-23 13:10     ` [PATCH 3/7] engines/io_uring: add new I/O engine for uring passthrough support Ankit Kumar
2022-05-23 16:44       ` Jens Axboe
2022-05-23 18:20         ` Ankit Kumar [this message]
     [not found]   ` <CGME20220523131621epcas5p2e6c71932c10003f423dbd38762089957@epcas5p2.samsung.com>
2022-05-23 13:10     ` [PATCH 4/7] docs: document options for io_uring_cmd I/O engine Ankit Kumar
     [not found]   ` <CGME20220523131623epcas5p418b049ac287ffc218cf46a88d94b2527@epcas5p4.samsung.com>
2022-05-23 13:10     ` [PATCH 5/7] zbd: Check for direct flag only if its block device Ankit Kumar
2022-05-25 10:07       ` Shinichiro Kawasaki
2022-05-25 10:46         ` Ankit Kumar
     [not found]   ` <CGME20220523131626epcas5p271c6ec917cc65ce1fcfd784c3757597c@epcas5p2.samsung.com>
2022-05-23 13:10     ` [PATCH 6/7] engines/io_uring: Enable zone device support for io_uring_cmd I/O engine Ankit Kumar
     [not found]   ` <CGME20220523131629epcas5p4ab9e6158624a4c6a055bb0e410f46079@epcas5p4.samsung.com>
2022-05-23 13:10     ` [PATCH 7/7] examples: add 2 example job file for io_uring_cmd engine Ankit Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220523182003.GA16572@test-zns \
    --to=ankit.kumar@samsung.com \
    --cc=anuj20.g@samsung.com \
    --cc=axboe@kernel.dk \
    --cc=fio@vger.kernel.org \
    --cc=joshi.k@samsung.com \
    --cc=krish.reddy@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.