From: Linus Torvalds
Date: Wed, 27 Oct 2021 14:14:48 -0700
Subject: Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks
To: Catalin Marinas
Cc: Andreas Gruenbacher, Paul Mackerras, Alexander Viro,
    Christoph Hellwig, "Darrick J. Wong", Jan Kara, Matthew Wilcox,
    cluster-devel, linux-fsdevel, Linux Kernel Mailing List,
    ocfs2-devel@oss.oracle.com, kvm-ppc@vger.kernel.org, linux-btrfs

On Wed, Oct 27, 2021 at 12:13 PM Catalin Marinas wrote:
>
> As an alternative, you mentioned earlier that a per-thread fault status
> was not feasible on x86 due to races. Was this only for the hw poison
> case? I think the uaccess is slightly different.

It's not x86-specific, it's very generic.

If we set some flag in the per-thread status, we'll need to be careful
about not overwriting it if we then have a subsequent NMI that _also_
takes a (completely unrelated) page fault - before we then read the
per-thread flag.

Think 'perf' and fetching backtraces etc.

Note that the NMI page fault can easily also be a pointer coloring
fault on arm64, for exactly the same reason that the original
copy_from_user() code could take one. So this is not an "oh, pointer
coloring faults are different". They have the same re-entrancy issue.

And both the "pagefault_disable" and "fault happens in interrupt
context" cases are also the exact same 'faulthandler_disabled()'
thing. So even at fault time they look very similar.

So we'd have to have some way to separate out only the one we care
about.

               Linus