Date: Fri, 27 Sep 2019 12:31:16 -0600
From: Yu Zhao
To: Michal Hocko
Cc: John Hubbard, "Kirill A. Shutemov", Peter Zijlstra, Andrew Morton,
 Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
 Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
 Andrea Arcangeli, "Aneesh Kumar K . V", David Rientjes, Matthew Wilcox,
 Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
 Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying, Aaron Lu,
 Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai, Daniel Jordan,
 Mike Rapoport, Joel Fernandes, Mark Rutland, Alexander Duyck,
 Pavel Tatashin, David Hildenbrand, Juergen Gross, Anthony Yznaga,
 Johannes Weiner, "Darrick J . Wong", linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Subject: Re: [PATCH v3 3/4] mm: don't expose non-hugetlb page to fast gup prematurely
Message-ID: <20190927183116.GA216665@google.com>
References: <20190914070518.112954-1-yuzhao@google.com>
 <20190924232459.214097-1-yuzhao@google.com>
 <20190924232459.214097-3-yuzhao@google.com>
 <20190925082530.GD4536@hirez.programming.kicks-ass.net>
 <20190925222654.GA180125@google.com>
 <20190926102036.od2wamdx2s7uznvq@box>
 <9465df76-0229-1b44-5646-5cced1bc1718@nvidia.com>
 <20190927123056.GE26848@dhcp22.suse.cz>
In-Reply-To: <20190927123056.GE26848@dhcp22.suse.cz>

On Fri, Sep 27, 2019 at 02:33:00PM +0200, Michal Hocko wrote:
> On Thu 26-09-19 20:26:46, John Hubbard wrote:
> > On 9/26/19 3:20 AM, Kirill A. Shutemov wrote:
> > > BTW, have you looked at other levels of page table hierarchy. Do we have
> > > the same issue for PMD/PUD/... pages?
> > >
> >
> > Along the lines of "what other memory barriers might be missing for
> > get_user_pages_fast(), I'm also concerned that the synchronization between
> > get_user_pages_fast() and freeing the page tables might be technically broken,
> > due to missing memory barriers on the get_user_pages_fast() side. Details:
> >
> > gup_fast() disables interrupts, but I think it also needs some sort of
> > memory barrier(s), in order to prevent reads of the page table (gup_pgd_range,
> > etc) from speculatively happening before the interrupts are disabled.
>
> Could you be more specific about the race scenario please? I thought
> that the unmap path will be serialized by the pte lock.

Yes, the unmap path is protected by ptl, but the fast gup isn't.
Please correct me if I'm wrong, John. This is the hypothetical race:

CPU 1 (gup)                      CPU 2 (zap)
speculatively load a pmd val
                                 zap the pte table pointed by the pmd val
                                 flush tlb by ipi
                                 free the pte table
local_irq_disable()
use the stale pmd val
use-after-free the pte table
local_irq_enable()

I don't think it would happen because the interrupt context on CPU 1
would act as a full mb and enforce a reload of the pmd val. But I'm not
entirely sure.
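
To make the hypothetical interleaving above easier to follow, here is a
toy, single-threaded Python model of it. This is only a sketch and not
kernel code: PteTable, Pmd, zap(), gup_with_stale_value() and
gup_with_reload() are made-up names, the TLB flush IPI is a no-op, and the
model says nothing about whether real hardware or the kernel's interrupt
handling would allow the stale value to survive. It only replays the two
outcomes: reusing a pmd value loaded before the zap, versus reloading it
afterwards.

```python
class PteTable:
    """Hypothetical stand-in for a pte table page."""

    def __init__(self):
        self.freed = False


class Pmd:
    """Hypothetical pmd slot; val is None once the entry is cleared."""

    def __init__(self, table):
        self.val = table


def zap(pmd):
    """CPU 2 side: clear the pmd entry, flush, then free the pte table."""
    table = pmd.val
    pmd.val = None
    # The TLB flush by IPI would happen here; it is a no-op in this model.
    table.freed = True


def gup_with_stale_value(pmd):
    """CPU 1 side, bad case: the pmd value is loaded before the zap and
    reused after interrupts are (notionally) disabled.

    Returns True if the walk would touch a freed pte table.
    """
    stale = pmd.val  # speculative load, before local_irq_disable()
    zap(pmd)         # CPU 2 runs to completion in between
    # local_irq_disable() would be here
    return stale is not None and stale.freed


def gup_with_reload(pmd):
    """CPU 1 side, good case: the pmd value is reloaded after the zap.

    Returns True if the walk would touch a freed pte table.
    """
    _discarded = pmd.val  # early load whose result is thrown away
    zap(pmd)
    current = pmd.val     # reload sees the cleared entry
    if current is None:
        return False      # walk bails out instead of using the table
    return current.freed
```

In this model the first function reports a use-after-free and the second
does not, which is just a restatement of the question in the thread:
whether something on CPU 1 forces the reload.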