Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
From: Jackie Liu <liuyun01@kylinos.cn>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Subject: Re: [PATCH 1/2] io_uring: fix issue when IOSQE_IO_DRAIN pass with IOSQE_IO_LINK
Date: Fri, 16 Aug 2019 14:47:49 +0800
Message-ID: <C052542E-0AC9-4BC0-9C59-FA1F7F023A73@kylinos.cn> (raw)
In-Reply-To: <ebe617aa-1a63-bd70-4096-e8f67b9f8adb@kernel.dk>


> ÔÚ 2019Äê8ÔÂ16ÈÕ£¬09:21£¬Jens Axboe <axboe@kernel.dk> дµÀ£º
> 
> On 8/15/19 6:48 PM, Jackie Liu wrote:
>> 
>> ÔÚ 2019Äê8ÔÂ16ÈÕ£¬01:07£¬Jens Axboe <axboe@kernel.dk> дµÀ£º
>> 
>>> 
>>> On 8/14/19 3:35 AM, Jackie Liu wrote:
>>>> Suppose there are three IOs here, and their order is as follows:
>>>> 
>>>> Submit:
>>>> 	[1] IO_LINK
>>>> 	    |
>>>> 	    |---  [2] IO_LINK | IO_DRAIN
>>>> 		      |
>>>> 		      |- [3] NORMAL_IO
>>>> 
>>>> In theory, they all need to be inserted into the Link-list, but flag
>>>> IO_DRAIN we have, io[2] and io[3] will be inserted into the defer_list,
>>>> and finally, io[3] and io[2] will be processed at the same time.
>>>> 
>>>> Now, it is directly forbidden to pass these two flags at the same time.
>>>> 
>>>> Fixes: 9e645e1105c ("io_uring: add support for sqe links")
>>>> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
>>>> ---
>>>>  fs/io_uring.c | 7 ++++++-
>>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index d542f1c..05ee628 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -2074,10 +2074,13 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
>>>>  {
>>>>  	struct io_uring_sqe *sqe_copy;
>>>>  	struct io_kiocb *req;
>>>> +	unsigned int flags;
>>>>  	int ret;
>>>> 
>>>> +	flags = READ_ONCE(s->sqe->flags);
>>>>  	/* enforce forwards compatibility on users */
>>>> -	if (unlikely(s->sqe->flags & ~SQE_VALID_FLAGS)) {
>>>> +	if (unlikely((flags & ~SQE_VALID_FLAGS) ||
>>>> +		     (flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)))) {
>>> 
>>> This doesn't look right, as any setting of either DRAIN or LINK would now
>>> fail?
>>> 
>>> Did you mean something ala:
>>> 
>>> 	if ((flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) ==
>>> 	    (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) {
>>> 		... fail ...
>>> 	}
>>> 
>>> which makes me worried that you didn't test this at all...
>>> 
>>> -- 
>>> Jens Axboe
>> 
>> Oh, yes, it's my fault, I just simulated it in my head, thank you for
>> pointing out.  I think I'd add an [RFC PATCH] next time.
> 
> Even for an RFC, it better be more tested than just being thought
> about... If something hasn't been run at all, it should always include
> wording to that effect ("Totally untested, but something like this
> perhaps"). I have higher expectations for even an RFC patch, I do expect
> that to be both thought about AND tested.
> 
>> For this issue, I have two solutions, first is this, just avoid
>> passing DRAIN and LINK at the same time; second is allow, let the SQE
>> following LINK inherit the DRAIN flag, but It's more complicated, I
>> prefer the first one.
>> 
>> I will rewrite this patch later, with a real test. Thanks again.
> 
> If an SQE has both set, it should first wait for any inflight sqe to
> complete, then execute the chain. Once things have drained, it should
> behave like an SQE that just had LINK set. I'd be interested in seeing a
> patch that fixes this instead of just making it illegal, it seems to be
> a valid use case.
> 

How about this, We consider link list as a whole, and any IO among them
that has drain will mark the first IO as drain, which is easy to implement.
Of course, there is no clear order between IO in link list and IO in defer_list,
so there maybe have a problems. 

If we want to keep a clear order between link list and defer_list, maybe need to
add more flags and variables. my initial implementation is very complicated. 
do you have any good idea?

First idea patch like follow. untested, just a idea.

---
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6b572c4..bc0b535 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1995,10 +1995,15 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
        flags = READ_ONCE(s->sqe->flags);
        fd = READ_ONCE(s->sqe->fd);

-       if (flags & IOSQE_IO_DRAIN) {
+       if (flags & IOSQE_IO_DRAIN)
                req->flags |= REQ_F_IO_DRAIN;
-               req->sequence = ctx->cached_sq_head - 1;
-       }
+
+       /*
+        * All io need record the previous position, if LINK vs DARIN,
+        * it can be used to mark the position of the first IO in the
+        * link list.
+        */
+       req->sequence = ctx->cached_sq_head - 1;

        if (!io_op_needs_file(s->sqe))
                return 0;
@@ -2123,6 +2128,12 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
                }

                s->sqe = sqe_copy;
+               /*
+                * Mark the first IO in link list as DRAIN, let all the following
+                * IOs enter the defer list.
+                */
+               if (s->sqe->flags & IOSQE_IO_DRAIN)
+                       prev->flags |= REQ_F_IO_DRAIN;
                memcpy(&req->submit, s, sizeof(*s));
                list_add_tail(&req->list, &prev->link_list);
        } else if (s->sqe->flags & IOSQE_IO_LINK) {


--
Jackie Liu




      reply index

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-14  9:35 Jackie Liu
2019-08-14  9:35 ` [PATCH 2/2] io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list Jackie Liu
2019-08-15 17:21   ` Jens Axboe
2019-08-15 17:07 ` [PATCH 1/2] io_uring: fix issue when IOSQE_IO_DRAIN pass with IOSQE_IO_LINK Jens Axboe
2019-08-16  0:48   ` Jackie Liu
2019-08-16  1:21     ` Jens Axboe
2019-08-16  6:47       ` Jackie Liu [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C052542E-0AC9-4BC0-9C59-FA1F7F023A73@kylinos.cn \
    --to=liuyun01@kylinos.cn \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org linux-block@archiver.kernel.org
	public-inbox-index linux-block


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/ public-inbox