From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E747CA9EA9 for ; Fri, 18 Oct 2019 15:55:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5D748222C6 for ; Fri, 18 Oct 2019 15:55:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="2SvE9jA2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2436479AbfJRPzC (ORCPT ); Fri, 18 Oct 2019 11:55:02 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:33800 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726506AbfJRPzC (ORCPT ); Fri, 18 Oct 2019 11:55:02 -0400 Received: by mail-pf1-f196.google.com with SMTP id b128so4170313pfa.1 for ; Fri, 18 Oct 2019 08:55:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:from:to:cc:references:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=LBUYF/N7x8hbc8GXc7h+thi64B1eIh6Rk/6W7sM3oDM=; b=2SvE9jA2N9du2TOdHNwiaPTDUtX3LC9qU6IA6N/SxuhNG9wVeJptkOCG8EPnG5nT83 LfiwGHMm1CQb03m7K3V3JrGsT9cJ5dsNUssGn/5iuSe81S5a5SwjZwcVA7cJ3wvWcK5S JhB1UrdHdVpikBCsmbAQCZyZmG8MadJeyEv8v8zuI2gPAQ469+ReOvZxvChDVZo8Bt2o 2/bKGiFVwJR40QjUtWSLoNqDyCcjNpuRPv9mhuBtm3TIGqZn65/ji3TVmb4wo3ub0cS+ cM1k6NgfOmBVOyVzmpP2XqFrSBJ94Na173dAkanx53gxYsXIkP5KjeCvK3+OCS56Jwep Ek7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:from:to:cc:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=LBUYF/N7x8hbc8GXc7h+thi64B1eIh6Rk/6W7sM3oDM=; b=ag+Uy1/DYs9/9CxVwQGKekZLU2q38b6HQZV0n9HE0+r4TCt1rWXlH/04vjprJkgKml MknpNvE0ILiaPnpuAEAseLkrZCQt0QYAm0fvSXueHpaMVfeV1MCuGvenQmwWOCTWsHaW l9CMGQvpGQA3rEPFQT07FL5ZmGQ1tgt+tAiA4SsSsQ4zRfQz1qtDXS+qhoD5NW6TaucR Wyv6va9b8UG9gg1jumJRgc8qb2uNTbLP07WSjTkRnhsy1tTuqSNraoyT9rGahFuWTG7n bM8sTd/gsxo2mJ1ZKI6pbu9NRn5Xw3bDO67Kp8ao4uirAqbJFoyirXD7mw8tSSBuXwXv D/Ow== X-Gm-Message-State: APjAAAVMgzel0N56NEzE1WbxtXNOAYlvZ9zvOGd9Dmje6WNOt5HwTYXt kkT71VGNc3td9sWcnocEwUs3rg== X-Google-Smtp-Source: APXvYqx0fQSBv0grZFNNFFic5PMS6uA+D2YOVm/rTZfXICvRdP6ONY/5m1YsYPPrw0eZ6xu7bBg5kA== X-Received: by 2002:a63:f854:: with SMTP id v20mr10543680pgj.92.1571414101262; Fri, 18 Oct 2019 08:55:01 -0700 (PDT) Received: from [192.168.1.188] ([66.219.217.79]) by smtp.gmail.com with ESMTPSA id y10sm6520405pfe.148.2019.10.18.08.54.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 18 Oct 2019 08:55:00 -0700 (PDT) Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table From: Jens Axboe To: Jann Horn Cc: linux-block@vger.kernel.org, "David S. Miller" , Network Development References: <20191017212858.13230-1-axboe@kernel.dk> <20191017212858.13230-2-axboe@kernel.dk> <0fb9d9a0-6251-c4bd-71b0-6e34c6a1aab8@kernel.dk> Message-ID: <572f40fb-201c-99ce-b3f5-05ff9369b895@kernel.dk> Date: Fri, 18 Oct 2019 09:54:57 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On 10/18/19 9:00 AM, Jens Axboe wrote: > On 10/18/19 8:52 AM, Jann Horn wrote: >> On Fri, Oct 18, 2019 at 4:43 PM Jens Axboe wrote: >>> >>> On 10/18/19 8:40 AM, Jann Horn wrote: >>>> On Fri, Oct 18, 2019 at 4:37 PM Jens Axboe wrote: >>>>> >>>>> On 10/18/19 8:34 AM, Jann Horn wrote: >>>>>> On Fri, Oct 18, 2019 at 4:01 PM Jens Axboe wrote: >>>>>>> On 10/17/19 8:41 PM, Jann Horn wrote: >>>>>>>> On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe wrote: >>>>>>>>> This is in preparation for adding opcodes that need to modify files >>>>>>>>> in a process file table, either adding new ones or closing old ones. >>>>>> [...] >>>>>>> Updated patch1: >>>>>>> >>>>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=df6caac708dae8ee9a74c9016e479b02ad78d436 >>>>>> >>>>>> I don't understand what you're doing with old_files in there. In the >>>>>> "s->files && !old_files" branch, "current->files = s->files" happens >>>>>> without holding task_lock(), but current->files and s->files are also >>>>>> the same already at that point anyway. And what's the intent behind >>>>>> assigning stuff to old_files inside the loop? Isn't that going to >>>>>> cause the workqueue to keep a modified current->files beyond the >>>>>> runtime of the work? >>>>> >>>>> I simply forgot to remove the old block, it should only have this one: >>>>> >>>>> if (s->files && s->files != cur_files) { >>>>> task_lock(current); >>>>> current->files = s->files; >>>>> task_unlock(current); >>>>> if (cur_files) >>>>> put_files_struct(cur_files); >>>>> cur_files = s->files; >>>>> } >>>> >>>> Don't you still need a put_files_struct() in the case where "s->files >>>> == cur_files"? >>> >>> I want to hold on to the files for as long as I can, to avoid unnecessary >>> shuffling of it. But I take it your worry here is that we'll be calling >>> something that manipulates ->files? Nothing should do that, unless >>> s->files is set. We didn't hide the workqueue ->files[] before this >>> change either. >> >> No, my worry is that the refcount of the files_struct is left too >> high. From what I can tell, the "do" loop in io_sq_wq_submit_work() >> iterates over multiple instances of struct sqe_submit. If there are >> two sqe_submit instances with the same ->files (each holding a >> reference from the get_files_struct() in __io_queue_sqe()), then: >> >> When processing the first sqe_submit instance, current->files and >> cur_files are set to $user_files. >> When processing the second sqe_submit instance, nothing happens >> (s->files == cur_files). >> After the loop, at the end of the function, put_files_struct() is >> called once on $user_files. >> >> So get_files_struct() has been called twice, but put_files_struct() >> has only been called once. That leaves the refcount too high, and by >> repeating this, an attacker can make the refcount wrap around and then >> cause a use-after-free. > > Ah now I see what you are getting at, yes that's clearly a bug! I wonder > how we best safely can batch the drops. We can track the number of times > we've used the same files, and do atomic_sub_and_test() in a > put_files_struct_many() type addition. But that would leave us open to > the issue you describe, where someone could maliciously overflow the > files ref count. > > Probably not worth over-optimizing, as long as we can avoid the > current->files task lock/unlock and shuffle. > > I'll update the patch. Alright, this incremental on top should do it. And full updated patch here: http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=40449c5a3d3b16796fa13e9469c69d62986e961c Let me know what you think. diff --git a/fs/io_uring.c b/fs/io_uring.c index 68eda36bcc9c..2fed0badad38 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2289,13 +2289,13 @@ static void io_sq_wq_submit_work(struct work_struct *work) set_fs(USER_DS); } } - if (s->files && s->files != cur_files) { + if (cur_files) + put_files_struct(cur_files); + cur_files = s->files; + if (cur_files && cur_files != current->files) { task_lock(current); - current->files = s->files; + current->files = cur_files; task_unlock(current); - if (cur_files) - put_files_struct(cur_files); - cur_files = s->files; } if (!ret) { @@ -2393,8 +2393,9 @@ static void io_sq_wq_submit_work(struct work_struct *work) task_lock(current); current->files = old_files; task_unlock(current); - put_files_struct(cur_files); } + if (cur_files) + put_files_struct(cur_files); } /* -- Jens Axboe