Subject: Re: [PATCH] io_uring: fix race with shadow drain deferrals
From: Jackie Liu
To: Jens Axboe
Cc: io-uring@vger.kernel.org
Date: Thu, 21 Nov 2019 09:35:22 +0800
Message-Id: <890E4F5B-DDA2-40EF-B7AD-3C63EFA20D93@kylinos.cn>
In-Reply-To: <1C6D35BF-C89B-4AB9-83CD-0A6B676E4752@kylinos.cn>
References: <27c153ee-cfc8-3251-a874-df85a033429a@kernel.dk>
 <1C6D35BF-C89B-4AB9-83CD-0A6B676E4752@kylinos.cn>

> On 21 Nov 2019, at 09:32, Jackie Liu wrote:
>
> On 21 Nov 2019, at 07:58, Jens Axboe wrote:
>
>> On 11/20/19 4:07 PM, Jens Axboe wrote:
>>> When we queue requests with drain, we check if we need to defer based
>>> on sequence. This is done safely under the lock, but then we drop the
>>> lock before actually inserting the shadow. If the original request is
>>> found on the deferred list by another completion in the meantime, it
>>> could have been started AND completed by the time we insert the
>>> shadow, which will stall the queue.
>>>
>>> After re-grabbing the completion lock, check if the original request
>>> is still in the deferred list. If it isn't, then we know that someone
>>> else already found and issued it. If that happened, then our job is
>>> done and we can simply free the shadow.
>>>
>>> Cc: Jackie Liu
>>> Fixes: 4fe2c963154c ("io_uring: add support for link with drain")
>>> Signed-off-by: Jens Axboe
>>
>> BTW, the other solution here is to not release the completion_lock if
>> we're going to return -EIOCBQUEUED, and let the caller do what it needs
>> before releasing it. That'd look something like this, with some sparse
>> annotations to keep things happy.
>>
>> I think the original I posted here is easier to follow, and the
>> deferral list is going to be tiny in general, so it won't really add
>> any extra overhead.
>>
>> Let me know what you think and which you prefer.
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 6175e2e195c0..0d1f33bcedc0 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2552,6 +2552,11 @@ static int io_async_cancel(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>>  	return 0;
>>  }
>>  
>> +/*
>> + * Returns with ctx->completion_lock held if -EIOCBQUEUED is returned, so
>> + * the caller can make decisions based on the deferral without worrying about
>> + * the request being found and issued in the meantime.
>> + */
>>  static int io_req_defer(struct io_kiocb *req)
>>  {
>>  	const struct io_uring_sqe *sqe = req->submit.sqe;
>> @@ -2579,7 +2584,7 @@ static int io_req_defer(struct io_kiocb *req)
>>  
>>  	trace_io_uring_defer(ctx, req, false);
>>  	list_add_tail(&req->list, &ctx->defer_list);
>> -	spin_unlock_irq(&ctx->completion_lock);
>> +	__release(&ctx->completion_lock);
>>  	return -EIOCBQUEUED;
>>  }
>>  
>> @@ -2954,6 +2959,7 @@ static void __io_queue_sqe(struct io_kiocb *req)
>>  
>>  static void io_queue_sqe(struct io_kiocb *req)
>>  {
>> +	struct io_ring_ctx *ctx = req->ctx;
>>  	int ret;
>>  
>>  	ret = io_req_defer(req);
>> @@ -2963,6 +2969,9 @@ static void io_queue_sqe(struct io_kiocb *req)
>>  		if (req->flags & REQ_F_LINK)
>>  			req->flags |= REQ_F_FAIL_LINK;
>>  		io_double_put_req(req);
>> +		} else {
>> +			__acquire(&ctx->completion_lock);
>> +			spin_unlock_irq(&ctx->completion_lock);
>>  		}
>>  	} else
>>  		__io_queue_sqe(req);
>> @@ -3001,16 +3010,17 @@ static void io_queue_link_head(struct io_kiocb *req, struct io_kiocb *shadow)
>>  			__io_free_req(shadow);
>>  			return;
>>  		}
>> +		__acquire(&ctx->completion_lock);
>>  	} else {
>>  		/*
>>  		 * If ret == 0 means that all IOs in front of link io are
>>  		 * running done. let's queue link head.
>>  		 */
>>  		need_submit = true;
>> +		spin_lock_irq(&ctx->completion_lock);
>>  	}
>>  
>>  	/* Insert shadow req to defer_list, blocking next IOs */
>> -	spin_lock_irq(&ctx->completion_lock);
>>  	trace_io_uring_defer(ctx, shadow, true);
>>  	list_add_tail(&shadow->list, &ctx->defer_list);
>>  	spin_unlock_irq(&ctx->completion_lock);
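As an aside for anyone reading along: the __release() and __acquire()
calls above are sparse annotations. They compile to nothing at runtime
and only keep sparse's lock-context tracking balanced when a function
returns with a lock held on one path but not the other. A toy sketch of
the pattern, with made-up names (this is not code from either patch):

#include <linux/spinlock.h>

struct foo {
	spinlock_t lock;
	bool busy;
};

/* returns with f->lock held if and only if it returns -EINPROGRESS */
static int try_start(struct foo *f)
{
	spin_lock_irq(&f->lock);
	if (f->busy) {
		/* no-op at runtime: tells sparse the lock context is
		 * balanced, even though we really return holding it */
		__release(&f->lock);
		return -EINPROGRESS;
	}
	spin_unlock_irq(&f->lock);
	return 0;
}

static void caller(struct foo *f)
{
	if (try_start(f) == -EINPROGRESS) {
		/* rebalance sparse's view before the real unlock */
		__acquire(&f->lock);
		spin_unlock_irq(&f->lock);
	}
}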
> This is indeed a potential lock issue, thanks. I prefer this solution;
> it is clearer than the first one. But it may be a bit difficult for
> other people reading the code to follow; what about naming the
> lock-holding variant 'io_req_defer_may_lock'? Something like this:
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 5ad652f..6fdaeb1 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2469,7 +2469,7 @@ static int io_async_cancel(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  	return 0;
>  }
>  
> -static int io_req_defer(struct io_kiocb *req)
> +static int __io_req_defer(struct io_kiocb *req)
>  {
>  	const struct io_uring_sqe *sqe = req->submit.sqe;
>  	struct io_uring_sqe *sqe_copy;
> @@ -2495,8 +2495,21 @@ static int io_req_defer(struct io_kiocb *req)
>  
>  	trace_io_uring_defer(ctx, req, false);
>  	list_add_tail(&req->list, &ctx->defer_list);
> +
> +	return -EIOCBQUEUED;
> +}
> +
> +static int io_req_defer(struct io_kiocb *req)
> +{
> +	int ret = __io_req_defer(req);

There is a problem in the version above: the lock must only be dropped
when __io_req_defer() actually returned holding it (and 'ctx' needs to
be derived from the request). io_req_defer() should instead be:

static int io_req_defer(struct io_kiocb *req)
{
	struct io_ring_ctx *ctx = req->ctx;	/* was missing above */
	int ret = __io_req_defer(req);

	/* __io_req_defer() only returns with completion_lock held here */
	if (ret == -EIOCBQUEUED)
		spin_unlock_irq(&ctx->completion_lock);
	return ret;
}
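To spell out the resulting contract (an illustrative summary, not an
actual hunk): __io_req_defer() returns with ctx->completion_lock held
only when it returns -EIOCBQUEUED; io_req_defer() drops the lock itself,
while io_req_defer_may_lock() leaves that to the caller, roughly:

	/* illustrative caller, condensed from the io_queue_link_head()
	 * hunk below */
	ret = io_req_defer_may_lock(req);
	if (ret == -EIOCBQUEUED) {
		/* completion_lock is still held here */
		list_add_tail(&shadow->list, &ctx->defer_list);
		spin_unlock_irq(&ctx->completion_lock);
	}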
>  	spin_unlock_irq(&ctx->completion_lock);
> -	return -EIOCBQUEUED;
> +	return ret;
> +}
> +
> +static int io_req_defer_may_lock(struct io_kiocb *req)
> +{
> +	return __io_req_defer(req);
> +}
>  
>  static int __io_submit_sqe(struct io_kiocb *req, struct io_kiocb **nxt,
> @@ -2927,7 +2940,7 @@ static int io_queue_link_head(struct io_kiocb *req, struct io_kiocb *shadow)
>  	 * list.
>  	 */
>  	req->flags |= REQ_F_IO_DRAIN;
> -	ret = io_req_defer(req);
> +	ret = io_req_defer_may_lock(req);
>  	if (ret) {
>  		if (ret != -EIOCBQUEUED) {
>  			io_cqring_add_event(req, ret);
> @@ -2941,10 +2954,10 @@ static int io_queue_link_head(struct io_kiocb *req, struct io_kiocb *shadow)
>  		 * running done. let's queue link head.
>  		 */
>  		need_submit = true;
> +		spin_lock_irq(&ctx->completion_lock);
>  	}
>  
>  	/* Insert shadow req to defer_list, blocking next IOs */
> -	spin_lock_irq(&ctx->completion_lock);
>  	trace_io_uring_defer(ctx, shadow, true);
>  	list_add_tail(&shadow->list, &ctx->defer_list);
>  	spin_unlock_irq(&ctx->completion_lock);
>
> --
> BR, Jackie Liu

--
BR, Jackie Liu