From: Jason Gunthorpe
To: Peter Xu, Linus Torvalds
CC: Andrea Arcangeli, Andrew Morton, Aneesh Kumar K.V, Christoph Hellwig,
    Hugh Dickins, Jan Kara, Jann Horn, John Hubbard, Kirill Shutemov,
    Kirill Tkhai, Leon Romanovsky, Linux-MM, Michal Hocko, Oleg Nesterov
Subject: [PATCH v2 0/2] Add a seqcount between gup_fast and copy_page_range()
Date: Fri, 30 Oct 2020 11:46:19 -0300
Message-ID: <0-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>

As discussed and suggested by Linus, use a seqcount to close the small race
between gup_fast and copy_page_range().

Unfortunately, the good suggestion to just use write_seqcount_begin() blows
up lockdep immediately due to the (new?) requirement that the write side of
a seqcount be in a preempt-disabled region. For this application that does
not seem like a good idea, nor is it necessary, as we don't spin on retry.

This is solved by being the first place to use raw_write_seqcount_t_begin().

This can go after the merge window.

I was able to test it using two threads, one forking and the other using
ibv_reg_mr() to trigger GUP fast. Modifying copy_page_range() to sleep made
the window large enough to hit reliably and test the logic.

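For readers unfamiliar with the pattern, here is a minimal user-space sketch
of a seqcount reader that bails out instead of spinning. It is an
illustration only, not code from this series: every sketch_* name is
hypothetical, C11 atomics stand in for the kernel's seqcount_t, and the
writer_interferes flag merely simulates a fork running in the middle of the
lockless walk.

```c
/* User-space sketch only: hypothetical names, not the kernel seqcount API. */
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_seqcount {
	atomic_uint sequence;	/* odd while a writer is inside its section */
};

/* Writer side (stands in for the fork path around copy_page_range()). */
static void sketch_write_begin(struct sketch_seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

static void sketch_write_end(struct sketch_seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/* Reader side: sample the counter once, without spinning. */
static unsigned int sketch_read_begin(struct sketch_seqcount *s)
{
	return atomic_load_explicit(&s->sequence, memory_order_acquire);
}

/* True if a writer was active at sample time or has run since. */
static bool sketch_read_retry(struct sketch_seqcount *s, unsigned int seq)
{
	atomic_thread_fence(memory_order_acquire);
	return (seq & 1) ||
	       atomic_load_explicit(&s->sequence, memory_order_relaxed) != seq;
}

/*
 * Lockless fast path (stands in for gup_fast). Returns true if the fast
 * result may be used, false if the caller must fall back to a slow path.
 */
static bool sketch_fast_path(struct sketch_seqcount *s, bool writer_interferes)
{
	unsigned int seq = sketch_read_begin(s);

	if (seq & 1)
		return false;	/* writer in progress: give up, do not spin */

	if (writer_interferes) {	/* simulated concurrent fork */
		sketch_write_begin(s);
		sketch_write_end(s);
	}

	/* ... the lockless page table walk would happen here ... */

	return !sketch_read_retry(s, seq);
}
```

In this sketch the reader never waits for the writer. It either sees an
unchanged even count and keeps its result, or it discards the result and
lets the caller take a slower, locked path. That is the sense in which the
reader does not spin on retry.
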
v2:
 - Use start not addr in lockless_pages_from_mm
 - Replace unsigned long casts with using the proper variable type
 - Update comments
 - Use raw_write_seqcount_t_begin() instead of open coding
 - Update commit messages
v1: https://lore.kernel.org/r/0-v1-281e425c752f+2df-gup_fork_jgg@nvidia.com

To: linux-kernel@vger.kernel.org
To: Peter Xu
To: Linus Torvalds
Cc: Peter Xu
Cc: John Hubbard
Cc: Linux-MM
Cc: Linux Kernel Mailing List
Cc: Andrew Morton
Cc: Jan Kara
Cc: Michal Hocko
Cc: Kirill Tkhai
Cc: Kirill Shutemov
Cc: Hugh Dickins
Cc: Christoph Hellwig
Cc: Andrea Arcangeli
Cc: Oleg Nesterov
Cc: Jann Horn

Jason Gunthorpe (2):
  mm: reorganize internal_get_user_pages_fast()
  mm: prevent gup_fast from racing with COW during fork

 arch/x86/kernel/tboot.c    |   1 +
 drivers/firmware/efi/efi.c |   1 +
 include/linux/mm_types.h   |   7 +++
 kernel/fork.c              |   1 +
 mm/gup.c                   | 118 +++++++++++++++++++++++--------------
 mm/init-mm.c               |   1 +
 mm/memory.c                |  10 +++-
 7 files changed, 93 insertions(+), 46 deletions(-)

-- 
2.28.0