From: John Hubbard
Date: Wed, 29 Sep 2021 18:57:33 -0700
Subject: Re: Possible race with page_maybe_dma_pinned?
To: Linus Torvalds, Peter Xu
Cc: Linux MM Mailing List, Jason Gunthorpe, Jan Kara, Andrew Morton, Andrea Arcangeli
Message-ID: <9636a101-4445-0ede-e3ad-4cecd531f433@nvidia.com>

On 9/29/21 15:47, Linus Torvalds wrote:
> On Wed, Sep 29, 2021 at 12:57 PM Peter Xu wrote:
>>
>> Now we have 3 callers of page_maybe_dma_pinned():
>>
>>         1. page_needs_cow_for_dma
>>         2. pte_is_pinned
>>         3. shrink_page_list
>>
>> The 1st one is good as it takes the seqlock for write properly.  The 2nd & 3rd
>> are missing, we may need to add them.
>
> Well, the pte_is_pinned() case at least could do the seqlock in
> clear_soft_dirty() - it has the vma and mm available.

That part seems straightforward, OK.

>
> The page shrinker has always been problematic since it doesn't have
> the vm (and by "always" I mean "modern times" - long ago we used to
> scan virtually, in the days before rmap)
>
> One option might be for fast-gup to give up on locked pages. I think
> the page lock is the only thing that shrink_page_list() serializes
> with.
>

In order to avoid locked pages in gup fast, it is easiest to do a check
for locked pages *after* fast-pinning them, and unpin them before
returning to the caller. This makes the change much smaller.
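
As a rough userspace model of that pin-first, check-afterwards idea (this
is not kernel code: struct fake_page, fake_unpin_pages() and
fast_pin_skip_locked() are made-up names for illustration, and the pin
count is a plain int rather than anything atomic):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only what this model needs. */
struct fake_page {
	bool locked;	/* models PageLocked() */
	int pincount;	/* models the "maybe DMA pinned" count */
};

/* Models unpin_user_pages(): drop one pin from each of n pages. */
static void fake_unpin_pages(struct fake_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i]->pincount--;
}

/*
 * Pin all n pages first, then back off at the first locked page.
 * Returns how many leading pages are still pinned on return.
 */
static size_t fast_pin_skip_locked(struct fake_page **pages, size_t n)
{
	size_t i;

	/* "Fast" pin: take every page without looking at the lock bit. */
	for (i = 0; i < n; i++)
		pages[i]->pincount++;

	/* Check afterwards; release the locked page and everything after it. */
	for (i = 0; i < n; i++) {
		if (pages[i]->locked) {
			fake_unpin_pages(&pages[i], n - i);
			return i;
		}
	}
	return n;
}
```

In this model the returned count is the index of the first locked page,
because that page and everything after it have already been unpinned.
Between the two loops, every page briefly carries a pin, including ones
that are never handed back to the caller.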

However, doing so leaves a window of time during which the pages are
still marked as maybe-dma-pinned, although those pages are never
returned to the caller as such. There is already code that is subject to
this in lockless_pages_from_mm(), for the case of a failed seqlock read.
I'm thinking it's probably OK, because the pages are not going to be
held long-term. They will be unpinned before returning from
lockless_pages_from_mm().

The counterargument is that this is merely making the race window
smaller, which is usually something that I argue against because it just
leads to harder-to-find bugs...

To be specific, here's what I mean:

diff --git a/mm/gup.c b/mm/gup.c
index 886d6148d3d03..8ba871a927668 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2657,6 +2657,7 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
 {
 	unsigned long flags;
 	int nr_pinned = 0;
+	int i;
 	unsigned seq;

 	if (!IS_ENABLED(CONFIG_HAVE_FAST_GUP) ||
@@ -2693,7 +2694,23 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
 			unpin_user_pages(pages, nr_pinned);
 			return 0;
 		}
+
+		/*
+		 * Avoiding locked pages, in this fast/lockless context, will
+		 * avoid interfering with shrink_page_list(), in particular.
+		 * Give up upon finding the first locked page, but keep the
+		 * earlier pages, so that slow gup does not have to re-pin them.
+		 */
+		for (i = 0; i < nr_pinned; i++) {
+			if (PageLocked(pages[i])) {
+				unpin_user_pages(&pages[i], nr_pinned - i);
+				nr_pinned = i;
+				break;
+			}
+		}
 	}
+
+
 	return nr_pinned;
 }

thanks,
-- 
John Hubbard
NVIDIA