From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.5 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36CACC43214 for ; Fri, 23 Jul 2021 17:32:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 21D5860EB5 for ; Fri, 23 Jul 2021 17:32:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232372AbhGWQwX (ORCPT ); Fri, 23 Jul 2021 12:52:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49718 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231126AbhGWQwW (ORCPT ); Fri, 23 Jul 2021 12:52:22 -0400 Received: from mail-il1-x133.google.com (mail-il1-x133.google.com [IPv6:2607:f8b0:4864:20::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71F18C061575 for ; Fri, 23 Jul 2021 10:32:55 -0700 (PDT) Received: by mail-il1-x133.google.com with SMTP id l11so2273167iln.4 for ; Fri, 23 Jul 2021 10:32:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=5M958zdX9geS/u2H/YZsFoj5TvVRmO6Wg409po08/mU=; b=YcTfH+rYbdgeIXg0QBd8fGyTputCo3lqix79WGZw8baWHvvyPGr+q5iXOBuUCEoYKL 8zy6XSGQXPqOZmw/hlWrRe+YTVK1w5ptEYosMnicYsjnmhUlh6fR/v1b9Sp1urxBOUIe WwbuEX+y/JZmLQ4NO7awUaeFw9cqmSJAdSbFCrYTXNnKR8gdVNXNB0XorVC/xsvKHQrQ SDfb8oLP8ePqKtzHymn1AMR2Jl+goWGI1qycsJbgcBWJ9H6BgcVjwFYAHYt4AHzMETGt PMmpG/Lphys+/0PT9k85N/P51rcthaEM7buclrGvvB26huJN4E7JgFXba5W9Rnpi90ph 9m4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=5M958zdX9geS/u2H/YZsFoj5TvVRmO6Wg409po08/mU=; b=rQdbHBvNCJ6DfP5Q9luhSzyvxz4vy9HNcZTBKtrMcbIrSfe21N20mk644nqtVbTTge ulqiQ6+drw6AANZvPieLMS2vrs0wXOgXny8MpfxsrTWrDc9QgVCOhMsMeLC4ReSzX0Ea Gbf3ozhpMaO7BmxYPsfM3olcR9wPxhpuvL5GFv3/oQwyRIDjlUgNj7LwqEhBGR1CHiYq wDPvcAkdeEt97zcbyCsIgDzKA882QQFgK9ntHUhInhj2x6YLAOCfenZGp4TZUzwO2DnZ +2Bj1MtnQQaeJXtU3Dh7iRbojbERUIbATUyYyhDnpg0kDCSvipaEYSBfWtH5mp4i1vYb wwcQ== X-Gm-Message-State: AOAM530jHfBu29hTmXHC4mY+0VRkJGV0ou8FVyZe066hCQ6rZVIz0HdO luSXYl3uWUupA4VKUbqAZGm1+w== X-Google-Smtp-Source: ABdhPJwhTskMqkvJG0ZMzPKJi2gwwNXNNhp5/dcfBRQ60JGBeNkWBq0lh2qKaruH64V+RzlRtLCxxQ== X-Received: by 2002:a92:d3c7:: with SMTP id c7mr4127169ilh.59.1627061574692; Fri, 23 Jul 2021 10:32:54 -0700 (PDT) Received: from [192.168.1.10] ([65.144.74.34]) by smtp.gmail.com with ESMTPSA id u12sm5670209iog.54.2021.07.23.10.32.53 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 23 Jul 2021 10:32:54 -0700 (PDT) Subject: Re: [PATCH 3/3] io_uring: refactor io_sq_offload_create() To: Al Viro Cc: Pavel Begunkov , io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org, Linus Torvalds References: <939776f90de8d2cdd0414e1baa29c8ec0926b561.1618916549.git.asml.silence@gmail.com> <57758edf-d064-d37e-e544-e0c72299823d@kernel.dk> <8fb39022-ba21-2c1f-3df5-29be002014d8@kernel.dk> From: Jens Axboe Message-ID: Date: Fri, 23 Jul 2021 11:32:53 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On 7/23/21 11:11 AM, Al Viro wrote: > On Fri, Jul 23, 2021 at 10:17:27AM -0600, Jens Axboe wrote: >> On 7/22/21 6:10 PM, Al Viro wrote: >>> On Thu, Jul 22, 2021 at 05:42:55PM -0600, Jens Axboe wrote: >>> >>>>> So how can we possibly get there with tsk->files == NULL and what does it >>>>> have to do with files, anyway? >>>> >>>> It's not the clearest, but the files check is just to distinguish between >>>> exec vs normal cancel. For exec, we pass in files == NULL. It's not >>>> related to task->files being NULL or not, we explicitly pass NULL for >>>> exec. >>> >>> Er... So turn that argument into bool cancel_all, and pass false on exit and >>> true on exec? >> >> Yes >> >>> While we are at it, what happens if you pass io_uring descriptor >>> to another process, close yours and then have the recepient close the one it >>> has gotten? AFAICS, io_ring_ctx_wait_and_kill(ctx) will be called in context >>> of a process that has never done anything io_uring-related. Can it end up >>> trying to resubmit some requests?> >>> I rather hope it can't happen, but I don't see what would prevent it... >> >> No, the pending request would either have gone to a created thread of >> the original task on submission, or it would be sitting in a >> ready-to-retry state. The retry would attempt to queue to original task, >> and either succeed (if still alive) or get failed with -ECANCELED. Any >> given request is tied to the original task. > > Hmm... Sure, you'll be pushing it to the same io_wqe it went through originally, > but you are still in context of io_uring_release() caller, aren't you? > > So you call io_wqe_wake_worker(), and it decides that all threads are busy, > but ->nr_workers is still below ->max_workers. And proceeds to > create_io_worker(wqe->wq, wqe, acct->index); > which will create a new io-worker thread, but do that in the thread group of > current, i.e. the caller of io_uring_release(). Looks like we'd get > an io-worker thread with the wrong parent... > > What am I missing here? I think there's two main cases here: 1) Request has already been issued by original task in some shape or form. This is the case I was mentioning in my previous reply. 2) Request has not been seen yet, this would be a new submit. For case #2, let's say you pass the ring to another task, there are entries in the SQ ring that haven't been submitted yet. Will these go under the new task? Yes they would - you've passed the ring to someone else at that point. For case #1, by definition the request has already been issued and is assigned to the task that did that. This either happens from the syscall itself, or from task_work which is also run from that original task. For your particular case, it's either an original async queue (hasn't been done on this task before), in which case it will create a thread from the right task. The other option is that we decide to async requeue from async context for some odd reason, and we're already in the right context at that point to create a new thread (which should not even happen, as the same thread is now available). I don't see a case where this wouldn't work as expected. However, I do think we could add a WARN_ON_ONCE (or similar) and reject any attempt to io_wq_enqueue() from the wrong context as a proactive measure to catch any bugs in testing rather than later. Outside of that, we're not submitting off release, only killing anything pending. The only odd case there is iopoll, but that doesn't resubmit here. -- Jens Axboe