From: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: ZiyangZhang <ZiyangZhang@linux.alibaba.com>,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	joseph.qi@linux.alibaba.com, linux-block@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: Follow up on UBD discussion
Date: Tue, 10 May 2022 12:38:38 +0800	[thread overview]
Message-ID: <a38639db-03fb-c320-8c9e-ce17cd7772c7@linux.alibaba.com> (raw)
In-Reply-To: <YnkT3BXnm0RW3L7f@T590>

hi,

> On Mon, May 09, 2022 at 07:53:05PM +0800, Xiaoguang Wang wrote:
>> hi,
>>
>>>>>> Second, I'd like to share some ideas on UBD. I'm not sure if they are
>>>>>> reasonable, so please point out my mistakes.
>>>>>>
>>>>>> 1) UBD issues one sqe to commit the last completed request and fetch a new
>>>>>> one. Then, blk-mq's queue_rq() issues a new UBD IO request and completes
>>>>>> one cqe for the fetch command. We have evaluated that io_submit_sqes()
>>>>>> costs some CPU, and the steps of building a new sqe may lower throughput.
>>>>>> Here I'd like to give a new solution: never submit an sqe, but fabricate a
>>>>>> cqe (with information on the new UBD IO request) when calling queue_rq(). I
>>>>>> am inspired by one io_uring flag: IORING_POLL_ADD_MULTI, with which a
>>>>>> user issues only one sqe for polling an fd and repeatedly gets multiple
>>>>>> cqes when new events occur. Does this solution break the architecture of
>>>>>> UBD?
>>>>> But each cqe has to be associated with one sqe, if I understand
>>>>> correctly.
>>>> Yeah, for the current io_uring implementation, it is. But what if io_uring
>>>> offered a helper like the one below:
>>>> void io_gen_cqe_direct(struct file *file, u64 user_data, s32 res, u32 cflags)
>>>> {
>>>>         struct io_ring_ctx *ctx;
>>>>         ctx = file->private_data;
>>>>
>>>>         spin_lock(&ctx->completion_lock);
>>>>         __io_fill_cqe(ctx, user_data, res, cflags);
>>>>         io_commit_cqring(ctx);
>>>>         spin_unlock(&ctx->completion_lock);
>>>>         io_cqring_ev_posted(ctx);
>>>> }
>>>>
>>>> Then in ubd driver:
>>>> 1) device setup stage
>>>> We attach io_uring files and user_data to every ubd hard queue.
>>>>
>>>> 2) when blk-mq->queue_rq() is called.
>>>> io_gen_cqe_direct() will be called in ubd's queue_rq(), and we put the ubd io request's
>>>> qid and tag info into the cqe's res field; then we don't need to issue sqes to fetch io cmds.
>>> The above way actually goes against io_uring's design, and I don't think it can
>>> improve much, since submitting UBD_IO_COMMIT_AND_FETCH_REQ is pretty lightweight.
>> Actually I didn't come up with this idea mainly for performance reasons :) I'm just
>> trying to simplify the code a bit:
>> 1) In the current implementation, ubdsrv needs to submit a queue depth's worth of
>> sqes first, and ubd_ctrl_start_dev() also needs to wait for all those sqes to be submitted.
> Yes, because handling IO needs the associated io_uring command to reach the ubd
> driver first.
>
>> 2) Try to make ubd_queue_rq() simpler; it may just call io_gen_cqe_direct().
> But it has to work at least. Also, I don't see real simplification in your
> suggestion.
>
>>> Also, without re-submitting the UBD_IO_COMMIT_AND_FETCH_REQ command, how can you
>>> commit the io handling result from the ubd server and ask the ubd driver to
>>> complete the io request?
>> No, I don't mean to remove the COMMIT command; we still need the io_uring async
>> command feature to support the ubd COMMIT or GETDATA commands.
> The GETDATA command has been removed, because task_work_add() is used to
> complete the io_uring command (UBD_IO_COMMIT_AND_FETCH_REQ or UBD_IO_FETCH_REQ),
> so pinning pages and copying data is always done in the ubdsrv daemon
> context.
OK, I'll read your latest code.

>
> You may not get the whole idea:
>
> 1) UBD_IO_FETCH_REQ is only submitted to the ubd driver before starting the
> device, because at the beginning there isn't any IO handled, so there is no need
> to send COMMIT.
>
> 2) after the device is started, only UBD_IO_COMMIT_AND_FETCH_REQ is
> submitted, for both committing the io handling result to the driver and fetching a new
> io request; UBD_IO_COMMIT_AND_FETCH_REQ can be thought of as a combined
> command of COMMIT and UBD_IO_FETCH_REQ.
>
> 3) the COMMIT command is only submitted after the queue is aborted, since we
> needn't fetch requests any more and just need to commit in-flight
> requests' results to the ubd driver.
Thanks for the detailed clarification.
I think I understand the code better now.

>
> If you meant using COMMIT with io_gen_cqe_direct(), what benefit can we
> get? Still, one command is required for handling IO, which is exactly what
> UBD_IO_COMMIT_AND_FETCH_REQ is doing.
Agree now.
>
>> I have another concern: currently there are many flags in the ubd kernel driver and
>> ubdsrv, such as:
>> #define UBDSRV_NEED_FETCH_RQ (1UL << 0)
> UBDSRV_NEED_FETCH_RQ means the to-be-queued io_uring command has to fetch a
> new io request from the ubd driver.
>
>> #define UBDSRV_NEED_COMMIT_RQ_COMP (1UL << 1)
> UBDSRV_NEED_COMMIT_RQ_COMP means the to-be-queued io_uring command has to
> commit the io handling result to the ubd driver.
>
>> #define UBDSRV_IO_FREE (1UL << 2)
> Only an io with this flag can be queued to the ubd driver. Once this flag is
> cleared, it means the io command has been submitted to the ubd driver.
>
>> #define UBDSRV_IO_HANDLING (1UL << 3)
> UBDSRV_IO_HANDLING means the io command is being handled by the target code.
>
>> #define UBDSRV_NEED_GET_DATA (1UL << 4)
> The above one has been removed.
>
>> Some of their names look weird, for example UBDSRV_IO_FREE. I think
>> more flags may result in more state machine errors.
> Figuring out perfect names is never easy, but I don't think they
> are weird, since the short comments above explain them clearly.
OK, thanks for these clarifications again.

Regards,
Xiaoguang Wang

>
>
> Thanks,
> Ming


