From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Date: Tue, 27 Jul 2021 10:51:38 -0700
Subject: Re: [PATCH v4 1/8] iov_iter: Introduce iov_iter_fault_in_writeable helper
To: Andreas Gruenbacher
Cc: David Laight, Alexander Viro, Christoph Hellwig, "Darrick J. Wong",
 Jan Kara, Matthew Wilcox, cluster-devel, linux-fsdevel,
 Linux Kernel Mailing List, "ocfs2-devel@oss.oracle.com"
References: <20210724193449.361667-1-agruenba@redhat.com>
 <20210724193449.361667-2-agruenba@redhat.com>
 <03e0541400e946cf87bc285198b82491@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 27, 2021 at 4:14 AM Andreas Gruenbacher wrote:
>
> On Tue, Jul 27, 2021 at 11:30 AM David Laight wrote:
> >
> > Is it actually worth doing any more than ensuring the first byte
> > of the buffer is paged in before entering the block that has
> > to disable page faults?
>
> We definitely do want to process as many pages as we can, especially
> if allocations are involved during a write.

Yeah, from an efficiency standpoint, once you start walking page
tables, it's probably best to just handle as much as you can.

But once you get an error, I don't think it should be "everything is bad".

This is a bit annoying, because while *most* users really just want
that "everything is good", *some* users might just want to handle the
partial success case.

It's why "copy_to/from_user()" returns the number of bytes *not*
written, rather than -EFAULT like get/put_user(). 99% of all users
just want to know "did I write all bytes" (and then checking for a
zero return is a simple and cheap verification of "everything was
ok").

But then very occasionally, you hit a case where you actually want to
know how much of a copy worked. It's rare, but it happens, and the
read/write system calls tend to be the main user of it.

And yes, the fact that "copy_to/from_user()" doesn't return an error
(like get/put_user() does) has confused people many times over the
years. It's annoying, but it's required by those (few) users that
really do want to handle that partial case.

I think this iov_iter_fault_in_readable/writeable() case should do the same.

And no, it's not new to Andreas' patch. iov_iter_fault_in_readable()
is doing the "everything has to be good" thing already.

Which maybe implies that nobody cares about partial reads/writes.
Or it's very very rare - I've seen code that handles page faults in user
space, but it's admittedly been some very special CPU
simulator/emulator checkpointing stuff.

Linus
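
[Editorial illustration, not part of the original message.] The two return conventions contrasted in the message can be sketched in plain userspace C. This is only a simulation under stated assumptions: `struct fake_user_buf`, `sim_copy_to_user()`, `sim_put_user()` and `sim_read()` are hypothetical names invented for this sketch, and a "fault" is modelled as running past the first `valid` bytes of a buffer. It is not kernel code and does not claim to match how any real read path is written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for a user buffer: only the first `valid`
 * bytes can be written; touching anything beyond that "faults". */
struct fake_user_buf {
    char *base;
    size_t valid;
};

/* Follows the copy_to_user() convention described in the message:
 * the return value is the number of bytes NOT copied, so 0 means
 * "everything was ok". */
static size_t sim_copy_to_user(struct fake_user_buf *dst,
                               const void *src, size_t n)
{
    size_t ok = n < dst->valid ? n : dst->valid;

    memcpy(dst->base, src, ok);
    return n - ok;
}

/* Follows the put_user() convention: all or nothing, 0 or -EFAULT. */
static int sim_put_user(struct fake_user_buf *dst, char c)
{
    if (dst->valid < 1)
        return -EFAULT;
    dst->base[0] = c;
    return 0;
}

/* A read()-style caller that wants the partial-success case: it
 * reports how many bytes made it, and only returns -EFAULT when
 * nothing could be copied at all. */
static ssize_t sim_read(struct fake_user_buf *dst,
                        const void *src, size_t want)
{
    size_t left = sim_copy_to_user(dst, src, want);

    assert(left <= want);
    if (want != 0 && left == want)
        return -EFAULT;
    return (ssize_t)(want - left);
}
```

Most callers would only test `sim_copy_to_user(...) != 0`; a caller like `sim_read()` is the rare one that needs the count, which is the case the message argues a fault-in helper should also be able to report.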