From: Nadav Amit
To: linux-fsdevel@vger.kernel.org
Cc: Nadav Amit, Mike Kravetz, Jens Axboe, Andrea Arcangeli, Peter Xu,
    Alexander Viro, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [RFC PATCH 00/13] fs/userfaultfd: support iouring and polling
Date: Sat, 28 Nov 2020 16:45:35 -0800
Message-Id: <20201129004548.1619714-1-namit@vmware.com>

From: Nadav Amit

While the overhead of userfaultfd is usually reasonable, it can still be
prohibitive for low-latency backing storage, such as RDMA, persistent
memory or in-memory compression. In such cases the overhead of scheduling
and entering/exiting the kernel becomes dominant.

The natural solution for this problem is to use io_uring with userfaultfd.
But besides one bug, this does not provide a sufficient performance
improvement, and the use of ioctls for zero/copy limits the use of
io_uring to synchronous "reads" (reporting of faults/events).

This patch-set provides four solutions for this overhead:

1. Userfaultfd "polling" mode, in which the faulting thread polls after
   reporting the fault instead of being de-scheduled. This fits cases in
   which the handler is expected to poll for page-faults on a different
   thread.

2. Asynchronous-reads, in which the faulting thread reports page-faults
   (and other events) directly to the userspace handler thread.
   For this purpose, asynchronous read completions are introduced.

3. Write interface, which provides similar services to the zero/copy
   ioctls. This allows the use of io_uring for zero/copy without changing
   the io_uring code or making it userfaultfd-aware. The low bits of the
   "position" are used to encode the requested operation
   (zero/copy/wp/etc).

4. Async-writes, in which the zero/copy is performed by the faulting
   thread instead of the io_uring thread. This reduces caching effects,
   as the data is likely to be used by the faulting thread and find_vma()
   cannot use its cache on the io_uring worker.

I will provide some benchmark results later, but some initial results show
that these patches reduce the overhead of handling a user page-fault by
over 50%.

The patches require a bit more cleanup but seem to pass the tests.

Note that the first three patches are bug fixes. I did not Cc them to
stable yet.

Cc: Mike Kravetz
Cc: Jens Axboe
Cc: Andrea Arcangeli
Cc: Peter Xu
Cc: Alexander Viro
Cc: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org

Nadav Amit (13):
  fs/userfaultfd: fix wrong error code on WP & !VM_MAYWRITE
  fs/userfaultfd: fix wrong file usage with iouring
  selftests/vm/userfaultfd: wake after copy failure
  fs/userfaultfd: simplify locks in userfaultfd_ctx_read
  fs/userfaultfd: introduce UFFD_FEATURE_POLL
  iov_iter: support atomic copy_page_from_iter_iovec()
  fs/userfaultfd: support read_iter to use io_uring
  fs/userfaultfd: complete reads asynchronously
  fs/userfaultfd: use iov_iter for copy/zero
  fs/userfaultfd: add write_iter() interface
  fs/userfaultfd: complete write asynchronously
  fs/userfaultfd: kmem-cache for wait-queue objects
  selftests/vm/userfaultfd: iouring and polling tests

 fs/userfaultfd.c                         | 740 ++++++++++++++++----
 include/linux/hugetlb.h                  |   4 +-
 include/linux/mm.h                       |   6 +-
 include/linux/shmem_fs.h                 |   2 +-
 include/linux/uio.h                      |   3 +
 include/linux/userfaultfd_k.h            |  10 +-
 include/uapi/linux/userfaultfd.h         |  21 +-
 lib/iov_iter.c                           |  23 +-
 mm/hugetlb.c                             |  12 +-
 mm/memory.c                              |  36 +-
 mm/shmem.c                               |  17 +-
 mm/userfaultfd.c                         |  96 ++-
 tools/testing/selftests/vm/Makefile      |   2 +-
 tools/testing/selftests/vm/userfaultfd.c | 835 +++++++++++++++++++++--
 14 files changed, 1506 insertions(+), 301 deletions(-)

--
2.25.1
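
For readers who want a concrete picture of item 3 above (the low bits of
the write "position" encoding the requested operation), here is a minimal
sketch of how such an encoding could work. It relies on the fact that the
zero/copy ioctls already require page-aligned ranges, so the low bits of a
destination address are free. Every name and value below (EX_PAGE_SIZE,
EX_OP_MASK, enum ex_uffd_op, the ex_* helpers) is hypothetical and chosen
for illustration only; the actual layout is whatever the series defines in
include/uapi/linux/userfaultfd.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical constants, for illustration only. The real encoding is
 * defined by the patch set, not by this sketch.
 */
#define EX_PAGE_SIZE 4096ULL
#define EX_OP_MASK   (EX_PAGE_SIZE - 1)

enum ex_uffd_op {
	EX_OP_COPY = 0,
	EX_OP_ZERO = 1,
	EX_OP_WP   = 2,
};

/* Pack a page-aligned destination address and an operation into one offset. */
static inline uint64_t ex_encode_pos(uint64_t dst, enum ex_uffd_op op)
{
	assert((dst & EX_OP_MASK) == 0);	/* low bits must be free */
	return dst | (uint64_t)op;
}

/* Recover the page-aligned destination address from an encoded offset. */
static inline uint64_t ex_pos_addr(uint64_t pos)
{
	return pos & ~EX_OP_MASK;
}

/* Recover the requested operation from an encoded offset. */
static inline enum ex_uffd_op ex_pos_op(uint64_t pos)
{
	return (enum ex_uffd_op)(pos & EX_OP_MASK);
}
```

With an encoding along these lines, a handler could presumably pass
ex_encode_pos(dst, EX_OP_ZERO) as the file offset of a write request on
the userfaultfd, which is why a generic write path (including io_uring's)
would not need to know anything about userfaultfd operations.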