From: Jinpu Wang <jinpu.wang@cloud.ionos.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jack Wang <jinpuwang@gmail.com>,
linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
Jens Axboe <axboe@kernel.dk>,
Christoph Hellwig <hch@infradead.org>,
Sagi Grimberg <sagi@grimberg.me>,
Jason Gunthorpe <jgg@mellanox.com>,
Doug Ledford <dledford@redhat.com>,
Danil Kipnis <danil.kipnis@cloud.ionos.com>,
rpenyaev@suse.de
Subject: Re: [PATCH v4 21/25] ibnbd: server: functionality for IO submission to file or block dev
Date: Thu, 26 Sep 2019 16:04:03 +0200 [thread overview]
Message-ID: <CAMGffE=RpTjDX-LL2E5t_eOF8iwG=UpgWFcnB3tz-N-PQ0etoQ@mail.gmail.com> (raw)
In-Reply-To: <d1208649-5c25-578f-967e-f7a3c9edd9ce@acm.org>
Sorry for the slow reply.
On Wed, Sep 18, 2019 at 11:46 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 6/20/19 8:03 AM, Jack Wang wrote:
> > +#undef pr_fmt
> > +#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
>
> Same comment as for a previous patch: please do not include line number
> information in pr_fmt().
Ok, will be removed.
>
> > +static int ibnbd_dev_vfs_open(struct ibnbd_dev *dev, const char *path,
> > + fmode_t flags)
> > +{
> > + int oflags = O_DSYNC; /* enable write-through */
> > +
> > + if (flags & FMODE_WRITE)
> > + oflags |= O_RDWR;
> > + else if (flags & FMODE_READ)
> > + oflags |= O_RDONLY;
> > + else
> > + return -EINVAL;
> > +
> > + dev->file = filp_open(path, oflags, 0);
> > + return PTR_ERR_OR_ZERO(dev->file);
> > +}
>
> Isn't the use of O_DSYNC something that should be configurable?
I know scst allow O_DSYNC to be configured, but in our production, we
only use with O_DSYNC,
we sure can add options to allow it to configure it, but we don't
have a need yet.
>
> > +struct ibnbd_dev *ibnbd_dev_open(const char *path, fmode_t flags,
> > + enum ibnbd_io_mode mode, struct bio_set *bs,
> > + ibnbd_dev_io_fn io_cb)
> > +{
> > + struct ibnbd_dev *dev;
> > + int ret;
> > +
> > + dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> > + if (!dev)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + if (mode == IBNBD_BLOCKIO) {
> > + dev->blk_open_flags = flags;
> > + ret = ibnbd_dev_blk_open(dev, path, dev->blk_open_flags);
> > + if (ret)
> > + goto err;
> > + } else if (mode == IBNBD_FILEIO) {
> > + dev->blk_open_flags = FMODE_READ;
> > + ret = ibnbd_dev_blk_open(dev, path, dev->blk_open_flags);
> > + if (ret)
> > + goto err;
> > +
> > + ret = ibnbd_dev_vfs_open(dev, path, flags);
> > + if (ret)
> > + goto blk_put;
>
> This looks really weird. Why to call ibnbd_dev_blk_open() first for file
> I/O mode? Why to set dev->blk_open_flags to FMODE_READ in file I/O mode?
The reason behind is we want to be able to symlink to the block device.
And for File io mode, we only allow exporting block device.
>
> > +static int ibnbd_dev_blk_submit_io(struct ibnbd_dev *dev, sector_t sector,
> > + void *data, size_t len, u32 bi_size,
> > + enum ibnbd_io_flags flags, short prio,
> > + void *priv)
> > +{
> > + struct request_queue *q = bdev_get_queue(dev->bdev);
> > + struct ibnbd_dev_blk_io *io;
> > + struct bio *bio;
> > +
> > + /* check if the buffer is suitable for bdev */
> > + if (unlikely(WARN_ON(!blk_rq_aligned(q, (unsigned long)data, len))))
> > + return -EINVAL;
> > +
> > + /* Generate bio with pages pointing to the rdma buffer */
> > + bio = ibnbd_bio_map_kern(q, data, dev->ibd_bio_set, len, GFP_KERNEL);
> > + if (unlikely(IS_ERR(bio)))
> > + return PTR_ERR(bio);
> > +
> > + io = kmalloc(sizeof(*io), GFP_KERNEL);
> > + if (unlikely(!io)) {
> > + bio_put(bio);
> > + return -ENOMEM;
> > + }
> > +
> > + io->dev = dev;
> > + io->priv = priv;
> > +
> > + bio->bi_end_io = ibnbd_dev_bi_end_io;
> > + bio->bi_private = io;
> > + bio->bi_opf = ibnbd_to_bio_flags(flags);
> > + bio->bi_iter.bi_sector = sector;
> > + bio->bi_iter.bi_size = bi_size;
> > + bio_set_prio(bio, prio);
> > + bio_set_dev(bio, dev->bdev);
> > +
> > + submit_bio(bio);
> > +
> > + return 0;
> > +}
>
> Can struct bio and struct ibnbd_dev_blk_io be combined into a single
> data structure by passing the size of the latter data structure as the
> front_pad argument to bioset_init()?
Thanks for the suggestion, will look into it,
looks we can embed struct bio to struct ibnbd_dev_blk_io.
>
> > +static void ibnbd_dev_file_submit_io_worker(struct work_struct *w)
> > +{
> > + struct ibnbd_dev_file_io_work *dev_work;
> > + struct file *f;
> > + int ret, len;
> > + loff_t off;
> > +
> > + dev_work = container_of(w, struct ibnbd_dev_file_io_work, work);
> > + off = dev_work->sector * ibnbd_dev_get_logical_bsize(dev_work->dev);
> > + f = dev_work->dev->file;
> > + len = dev_work->bi_size;
> > +
> > + if (ibnbd_op(dev_work->flags) == IBNBD_OP_FLUSH) {
> > + ret = ibnbd_dev_file_handle_flush(dev_work, off);
> > + if (unlikely(ret))
> > + goto out;
> > + }
> > +
> > + if (ibnbd_op(dev_work->flags) == IBNBD_OP_WRITE_SAME) {
> > + ret = ibnbd_dev_file_handle_write_same(dev_work);
> > + if (unlikely(ret))
> > + goto out;
> > + }
> > +
> > + /* TODO Implement support for DIRECT */
> > + if (dev_work->bi_size) {
> > + loff_t off_tmp = off;
> > +
> > + if (ibnbd_op(dev_work->flags) == IBNBD_OP_WRITE)
> > + ret = kernel_write(f, dev_work->data, dev_work->bi_size,
> > + &off_tmp);
> > + else
> > + ret = kernel_read(f, dev_work->data, dev_work->bi_size,
> > + &off_tmp);
> > +
> > + if (unlikely(ret < 0)) {
> > + goto out;
> > + } else if (unlikely(ret != dev_work->bi_size)) {
> > + /* TODO implement support for partial completions */
> > + ret = -EIO;
> > + goto out;
> > + } else {
> > + ret = 0;
> > + }
> > + }
> > +
> > + if (dev_work->flags & IBNBD_F_FUA)
> > + ret = ibnbd_dev_file_handle_fua(dev_work, off);
> > +out:
> > + dev_work->dev->io_cb(dev_work->priv, ret);
> > + kfree(dev_work);
> > +}
> > +
> > +static int ibnbd_dev_file_submit_io(struct ibnbd_dev *dev, sector_t sector,
> > + void *data, size_t len, size_t bi_size,
> > + enum ibnbd_io_flags flags, void *priv)
> > +{
> > + struct ibnbd_dev_file_io_work *w;
> > +
> > + if (!ibnbd_flags_supported(flags)) {
> > + pr_info_ratelimited("Unsupported I/O flags: 0x%x on device "
> > + "%s\n", flags, dev->name);
> > + return -ENOTSUPP;
> > + }
> > +
> > + w = kmalloc(sizeof(*w), GFP_KERNEL);
> > + if (!w)
> > + return -ENOMEM;
> > +
> > + w->dev = dev;
> > + w->priv = priv;
> > + w->sector = sector;
> > + w->data = data;
> > + w->len = len;
> > + w->bi_size = bi_size;
> > + w->flags = flags;
> > + INIT_WORK(&w->work, ibnbd_dev_file_submit_io_worker);
> > +
> > + if (unlikely(!queue_work(fileio_wq, &w->work))) {
> > + kfree(w);
> > + return -EEXIST;
> > + }
> > +
> > + return 0;
> > +}
>
> Please use the in-kernel asynchronous I/O API instead of kernel_read()
> and kernel_write() and remove the fileio_wq workqueue. Examples of how
> to use call_read_iter() and call_write_iter() are available in the loop
> driver and also in drivers/target/target_core_file.c.
What the benefits of using call_read_iter/call_write_iter, does it
offer better performance?
>
> > +/** ibnbd_dev_init() - Initialize ibnbd_dev
> > + *
> > + * This functions initialized the ibnbd-dev component.
> > + * It has to be called 1x time before ibnbd_dev_open() is used
> > + */
> > +int ibnbd_dev_init(void);
>
> It is great so see kernel-doc headers above functions but I'm not sure
> these should be in .h files. I think most kernel developers prefer to
> see kernel-doc headers for functions in .c files because that makes it
> more likely that the implementation and the documentation stay in sync.
>
Ok, will move the kernel doc to source code.
I feel for exported functions, it's more common to do it in header files.
For this case, I think it's fine to move the kernel-doc to the c file.
Thanks,
Jinpu
next prev parent reply other threads:[~2019-09-26 14:04 UTC|newest]
Thread overview: 123+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20190620150337.7847-1-jinpuwang@gmail.com>
2019-06-20 15:03 ` [PATCH v4 01/25] sysfs: export sysfs_remove_file_self() Jack Wang
2019-09-23 17:21 ` Bart Van Assche
2019-09-25 9:30 ` Danil Kipnis
2019-07-09 9:55 ` [PATCH v4 00/25] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD) Danil Kipnis
2019-07-09 11:00 ` Leon Romanovsky
2019-07-09 11:17 ` Greg KH
2019-07-09 11:57 ` Jinpu Wang
2019-07-09 13:32 ` Leon Romanovsky
2019-07-09 15:39 ` Bart Van Assche
2019-07-09 11:37 ` Jinpu Wang
2019-07-09 12:06 ` Jason Gunthorpe
2019-07-09 13:15 ` Jinpu Wang
2019-07-09 13:19 ` Jason Gunthorpe
2019-07-09 14:17 ` Jinpu Wang
2019-07-09 21:27 ` Sagi Grimberg
2019-07-19 13:12 ` Danil Kipnis
2019-07-10 14:55 ` Danil Kipnis
2019-07-09 12:04 ` Jason Gunthorpe
2019-07-09 19:45 ` Sagi Grimberg
2019-07-10 13:55 ` Jason Gunthorpe
2019-07-10 16:25 ` Sagi Grimberg
2019-07-10 17:25 ` Jason Gunthorpe
2019-07-10 19:11 ` Sagi Grimberg
2019-07-11 7:27 ` Danil Kipnis
2019-07-11 8:54 ` Danil Kipnis
2019-07-12 0:22 ` Sagi Grimberg
2019-07-12 7:57 ` Jinpu Wang
2019-07-12 19:40 ` Sagi Grimberg
2019-07-15 11:21 ` Jinpu Wang
2019-07-12 10:58 ` Danil Kipnis
[not found] ` <20190620150337.7847-26-jinpuwang@gmail.com>
2019-07-09 15:10 ` [PATCH v4 25/25] MAINTAINERS: Add maintainer for IBNBD/IBTRS modules Leon Romanovsky
2019-07-09 15:18 ` Jinpu Wang
2019-07-09 15:51 ` Leon Romanovsky
2019-09-13 23:56 ` Bart Van Assche
2019-09-19 10:30 ` Jinpu Wang
[not found] ` <20190620150337.7847-16-jinpuwang@gmail.com>
2019-09-13 22:10 ` [PATCH v4 15/25] ibnbd: private headers with IBNBD protocol structs and helpers Bart Van Assche
2019-09-15 14:30 ` Jinpu Wang
2019-09-16 5:27 ` Leon Romanovsky
2019-09-16 13:45 ` Bart Van Assche
2019-09-17 15:41 ` Leon Romanovsky
2019-09-17 15:52 ` Jinpu Wang
2019-09-16 7:08 ` Danil Kipnis
2019-09-16 14:57 ` Jinpu Wang
2019-09-16 17:25 ` Bart Van Assche
2019-09-17 12:27 ` Jinpu Wang
2019-09-16 15:39 ` Jinpu Wang
2019-09-18 15:26 ` Bart Van Assche
2019-09-18 16:11 ` Jinpu Wang
[not found] ` <20190620150337.7847-17-jinpuwang@gmail.com>
2019-09-13 22:25 ` [PATCH v4 16/25] ibnbd: client: private header with client structs and functions Bart Van Assche
2019-09-17 16:36 ` Jinpu Wang
2019-09-25 23:43 ` Danil Kipnis
2019-09-26 10:00 ` Jinpu Wang
[not found] ` <20190620150337.7847-18-jinpuwang@gmail.com>
2019-09-13 23:46 ` [PATCH v4 17/25] ibnbd: client: main functionality Bart Van Assche
2019-09-16 14:17 ` Danil Kipnis
2019-09-16 16:46 ` Bart Van Assche
2019-09-17 11:39 ` Danil Kipnis
2019-09-18 7:14 ` Danil Kipnis
2019-09-18 15:47 ` Bart Van Assche
2019-09-20 8:29 ` Danil Kipnis
2019-09-25 22:26 ` Danil Kipnis
2019-09-26 9:55 ` Roman Penyaev
2019-09-26 15:01 ` Bart Van Assche
2019-09-27 8:52 ` Roman Penyaev
2019-09-27 9:32 ` Danil Kipnis
2019-09-27 12:18 ` Danil Kipnis
2019-09-27 16:37 ` Bart Van Assche
2019-09-27 16:50 ` Roman Penyaev
2019-09-27 17:16 ` Bart Van Assche
2019-09-17 13:09 ` Jinpu Wang
2019-09-17 16:46 ` Bart Van Assche
2019-09-18 12:02 ` Jinpu Wang
2019-09-18 16:05 ` Jinpu Wang
2019-09-14 0:00 ` Bart Van Assche
[not found] ` <20190620150337.7847-25-jinpuwang@gmail.com>
2019-09-13 23:58 ` [PATCH v4 24/25] ibnbd: a bit of documentation Bart Van Assche
2019-09-18 12:22 ` Jinpu Wang
[not found] ` <20190620150337.7847-19-jinpuwang@gmail.com>
2019-09-18 16:28 ` [PATCH v4 18/25] ibnbd: client: sysfs interface functions Bart Van Assche
2019-09-19 15:55 ` Jinpu Wang
[not found] ` <20190620150337.7847-21-jinpuwang@gmail.com>
2019-09-18 17:41 ` [PATCH v4 20/25] ibnbd: server: main functionality Bart Van Assche
2019-09-20 7:36 ` Danil Kipnis
2019-09-20 15:42 ` Bart Van Assche
2019-09-23 15:19 ` Danil Kipnis
[not found] ` <20190620150337.7847-22-jinpuwang@gmail.com>
2019-09-18 21:46 ` [PATCH v4 21/25] ibnbd: server: functionality for IO submission to file or block dev Bart Van Assche
2019-09-26 14:04 ` Jinpu Wang [this message]
2019-09-26 15:11 ` Bart Van Assche
2019-09-26 15:25 ` Danil Kipnis
2019-09-26 15:29 ` Bart Van Assche
2019-09-26 15:38 ` Danil Kipnis
2019-09-26 15:42 ` Jinpu Wang
[not found] ` <20190620150337.7847-3-jinpuwang@gmail.com>
2019-09-23 17:44 ` [PATCH v4 02/25] ibtrs: public interface header to establish RDMA connections Bart Van Assche
2019-09-25 10:20 ` Danil Kipnis
2019-09-25 15:38 ` Bart Van Assche
[not found] ` <20190620150337.7847-7-jinpuwang@gmail.com>
2019-09-23 21:51 ` [PATCH v4 06/25] ibtrs: client: main functionality Bart Van Assche
2019-09-25 17:36 ` Danil Kipnis
2019-09-25 18:55 ` Bart Van Assche
2019-09-25 20:50 ` Danil Kipnis
2019-09-25 21:08 ` Bart Van Assche
2019-09-25 21:16 ` Bart Van Assche
2019-09-25 22:53 ` Danil Kipnis
2019-09-25 23:21 ` Bart Van Assche
2019-09-26 9:16 ` Danil Kipnis
[not found] ` <20190620150337.7847-4-jinpuwang@gmail.com>
2019-09-23 22:50 ` [PATCH v4 03/25] ibtrs: private headers with IBTRS protocol structs and helpers Bart Van Assche
2019-09-25 21:45 ` Danil Kipnis
2019-09-25 21:57 ` Bart Van Assche
2019-09-27 8:56 ` Jinpu Wang
[not found] ` <20190620150337.7847-5-jinpuwang@gmail.com>
2019-09-23 23:03 ` [PATCH v4 04/25] ibtrs: core: lib functions shared between client and server modules Bart Van Assche
2019-09-27 10:13 ` Jinpu Wang
[not found] ` <20190620150337.7847-6-jinpuwang@gmail.com>
2019-09-23 23:05 ` [PATCH v4 05/25] ibtrs: client: private header with client structs and functions Bart Van Assche
2019-09-27 10:18 ` Jinpu Wang
[not found] ` <20190620150337.7847-8-jinpuwang@gmail.com>
2019-09-23 23:15 ` [PATCH v4 07/25] ibtrs: client: statistics functions Bart Van Assche
2019-09-27 12:00 ` Jinpu Wang
[not found] ` <20190620150337.7847-10-jinpuwang@gmail.com>
2019-09-23 23:21 ` [PATCH v4 09/25] ibtrs: server: private header with server structs and functions Bart Van Assche
2019-09-27 12:04 ` Jinpu Wang
[not found] ` <20190620150337.7847-11-jinpuwang@gmail.com>
2019-09-23 23:49 ` [PATCH v4 10/25] ibtrs: server: main functionality Bart Van Assche
2019-09-27 15:03 ` Jinpu Wang
2019-09-27 15:11 ` Bart Van Assche
2019-09-27 15:19 ` Jinpu Wang
[not found] ` <20190620150337.7847-12-jinpuwang@gmail.com>
2019-09-23 23:56 ` [PATCH v4 11/25] ibtrs: server: statistics functions Bart Van Assche
2019-10-02 15:15 ` Jinpu Wang
2019-10-02 15:42 ` Leon Romanovsky
2019-10-02 15:45 ` Jinpu Wang
2019-10-02 16:00 ` Leon Romanovsky
[not found] ` <20190620150337.7847-13-jinpuwang@gmail.com>
2019-09-24 0:00 ` [PATCH v4 12/25] ibtrs: server: sysfs interface functions Bart Van Assche
2019-10-02 15:11 ` Jinpu Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMGffE=RpTjDX-LL2E5t_eOF8iwG=UpgWFcnB3tz-N-PQ0etoQ@mail.gmail.com' \
--to=jinpu.wang@cloud.ionos.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=danil.kipnis@cloud.ionos.com \
--cc=dledford@redhat.com \
--cc=hch@infradead.org \
--cc=jgg@mellanox.com \
--cc=jinpuwang@gmail.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=rpenyaev@suse.de \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).