From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <4794a3fa3742a5e84fb0f934944204b55730829b.camel@lca.pw> <20201015151606.GA226448@redhat.com> <20201015195526.GC226448@redhat.com>
In-Reply-To: <20201015195526.GC226448@redhat.com>
From: Linus Torvalds
Date: Thu, 15 Oct 2020 14:21:58 -0700
Subject: Re: Possible deadlock in fuse write path (Was: Re: [PATCH 0/4] Some more lock_page work..)
To: Vivek Goyal
Cc: Miklos Szeredi, Qian Cai, Hugh Dickins, Matthew Wilcox, "Kirill A . Shutemov", Linux-MM, Andrew Morton, linux-fsdevel, Amir Goldstein
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Thu, Oct 15, 2020 at 12:55 PM Vivek Goyal wrote:
>
> I am wondering how should I fix this issue. Is it enough that I drop
> the page lock (but keep the reference) inside the loop. And once copying
> from user space is done, acquire page locks for all pages (Attached
> a patch below).

What is the page lock supposed to protect?

Because whatever it protects, dropping the lock drops, and you'd need
to re-check whatever the page lock was there for.

> Or dropping page lock means that there are no guarantees that this
> page did not get written back and removed from address space and
> a new page has been placed at same offset. Does that mean I should
> instead be looking up page cache again after copying from user
> space is done.

I don't know why fuse does multiple pages to begin with. Why can't it
do whatever it does just one page at a time?

But yes, you probably should look the page up again whenever you've
unlocked it, because it might have been truncated or whatever.

Note that this is purely about unlocking the page, not about "after
copying from user space". The iov_iter_copy_from_user_atomic() part is
safe - if that takes a page fault, it will just do a partial copy, it
won't deadlock.

So you can potentially do multiple pages, and keep them all locked,
but only as long as the copies are all done with that
"from_user_atomic()" case. Which normally works fine, since normal
users will write stuff that they just generated, so it will all be
there.

It's only when that returns zero, and you do the fallback to pre-fault
in any data with iov_iter_fault_in_readable() that you need to unlock
_all_ pages (and once you do that, I don't see what possible advantage
the multi-page array can have).
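
The rule above can be sketched as a small userspace model. This is not
fuse or kernel code: every name in it (model_page, copy_atomic_model(),
fault_in_model(), write_pages_model()) is hypothetical, and the "page
lock" is only a flag plus a counter. It assumes the atomic copy either
copies a whole chunk or returns zero, and that a single retry after the
pre-fault always succeeds, which real code cannot rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES  3
#define PAGE_SZ 4

/* Hypothetical stand-ins for kernel state; none of this is kernel API. */
struct model_page {
    bool locked;
    char data[PAGE_SZ];
};

struct model_user_buf {
    const char *src;          /* NPAGES * PAGE_SZ bytes of "user" data */
    bool resident[NPAGES];    /* is this chunk of user memory paged in? */
};

static int locks_held;        /* page locks currently held by the writer */
static int unlock_all_rounds; /* how many times the slow path ran */

static void lock_page_model(struct model_page *p)
{
    assert(!p->locked);
    p->locked = true;
    locks_held++;
}

static void unlock_page_model(struct model_page *p)
{
    assert(p->locked);
    p->locked = false;
    locks_held--;
}

/* Models the atomic copy: never sleeps, returns 0 instead of faulting. */
static size_t copy_atomic_model(struct model_page *p,
                                const struct model_user_buf *u, int idx)
{
    if (!u->resident[idx])
        return 0;
    memcpy(p->data, u->src + (size_t)idx * PAGE_SZ, PAGE_SZ);
    return PAGE_SZ;
}

/* Models the pre-fault: may sleep, so no page lock may be held here. */
static void fault_in_model(struct model_user_buf *u, int idx)
{
    assert(locks_held == 0);
    u->resident[idx] = true;
}

/* Copies NPAGES chunks with all pages locked; returns bytes copied. */
static size_t write_pages_model(struct model_page pages[NPAGES],
                                struct model_user_buf *u)
{
    size_t copied = 0;
    int i, j;

    for (i = 0; i < NPAGES; i++)
        lock_page_model(&pages[i]);

    for (i = 0; i < NPAGES; i++) {
        size_t n = copy_atomic_model(&pages[i], u, i);

        if (n == 0) {
            /* Slow path: drop every lock before touching user memory. */
            for (j = 0; j < NPAGES; j++)
                unlock_page_model(&pages[j]);
            unlock_all_rounds++;
            fault_in_model(u, i);
            /*
             * Real code would have to look the pages up again at this
             * point, since they may have been truncated while unlocked.
             * The model simply re-locks the same array.
             */
            for (j = 0; j < NPAGES; j++)
                lock_page_model(&pages[j]);
            n = copy_atomic_model(&pages[i], u, i);
        }
        copied += n;
    }

    for (i = 0; i < NPAGES; i++)
        unlock_page_model(&pages[i]);
    return copied;
}
```

Calling write_pages_model() with the middle chunk marked non-resident
takes the slow path once: all three locks are dropped, the chunk is
faulted in, and the locks are retaken. That makes the cost visible: as
soon as the fallback fires, holding the other pages locked bought
nothing.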

Of course, the way that code is written, it always does the
iov_iter_fault_in_readable() for each page - it's not written like
some kind of "special case fallback thing".

I suspect the code was copied from the generic write code, but without
understanding why the generic write code was ok.

             Linus
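
One plausible reading of why the generic write code is ok with the same
calls: it pre-faults the user data before it locks anything, and it
only ever holds one page lock at a time, so the pre-fault never runs
under a page lock. The sketch below models that ordering in userspace.
It is an assumption about what the mail means, not the kernel's actual
loop, and all names in it (fault_in_chunk(), copy_chunk_atomic(),
write_one_page_at_a_time()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NCHUNKS 3
#define CHUNK   4

/* Hypothetical model state, not kernel API. */
static bool resident[NCHUNKS];  /* is user chunk i paged in? */
static int  locks_held;         /* page locks held right now */
static int  max_locks_held;     /* high-water mark, for inspection */

/* Models the pre-fault: may sleep, so it must run with no page locked. */
static void fault_in_chunk(int i)
{
    assert(locks_held == 0);
    resident[i] = true;
}

/* Models the atomic copy: returns 0 rather than taking a fault. */
static size_t copy_chunk_atomic(char *dst, const char *src, int i)
{
    if (!resident[i])
        return 0;
    memcpy(dst + (size_t)i * CHUNK, src + (size_t)i * CHUNK, CHUNK);
    return CHUNK;
}

/* Copies NCHUNKS chunks, one "page" at a time; returns bytes copied. */
static size_t write_one_page_at_a_time(char *dst, const char *src)
{
    size_t total = 0;
    int i = 0;

    while (i < NCHUNKS) {
        size_t n;

        fault_in_chunk(i);                   /* 1. pre-fault, unlocked */

        locks_held++;                        /* 2. lock exactly one page */
        if (locks_held > max_locks_held)
            max_locks_held = locks_held;
        n = copy_chunk_atomic(dst, src, i);  /* 3. atomic copy */
        locks_held--;                        /* 4. unlock before looping */

        if (n == 0)
            continue;  /* chunk went away again: retry from step 1 */
        total += n;
        i++;
    }
    return total;
}
```

In this model the retry branch never fires, because fault_in_chunk()
always succeeds and nothing evicts the chunk afterwards. The point is
only the ordering: the assert in fault_in_chunk() holds on every
iteration, which is exactly what a loop that pre-faults while earlier
pages in an array are still locked cannot guarantee.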