Date: Thu, 26 Sep 2019 13:20:36 +0300
From: "Kirill A. Shutemov"
To: Yu Zhao
Cc: Peter Zijlstra, Andrew Morton, Michal Hocko, "Kirill A . Shutemov",
 Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
 Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
 Andrea Arcangeli, "Aneesh Kumar K . V", David Rientjes, Matthew Wilcox,
 Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
 Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying, Aaron Lu,
 Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai, Daniel Jordan,
 Mike Rapoport, Joel Fernandes, Mark Rutland, Alexander Duyck,
 Pavel Tatashin, David Hildenbrand, Juergen Gross, Anthony Yznaga,
 Johannes Weiner, "Darrick J .
Wong", linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 3/4] mm: don't expose non-hugetlb page to fast gup prematurely
Message-ID: <20190926102036.od2wamdx2s7uznvq@box>
References: <20190914070518.112954-1-yuzhao@google.com>
 <20190924232459.214097-1-yuzhao@google.com>
 <20190924232459.214097-3-yuzhao@google.com>
 <20190925082530.GD4536@hirez.programming.kicks-ass.net>
 <20190925222654.GA180125@google.com>
In-Reply-To: <20190925222654.GA180125@google.com>

On Wed, Sep 25, 2019 at 04:26:54PM -0600, Yu Zhao wrote:
> On Wed, Sep 25, 2019 at 10:25:30AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 24, 2019 at 05:24:58PM -0600, Yu Zhao wrote:
> > > We don't want to expose a non-hugetlb page to the fast gup running
> > > on a remote CPU before all local non-atomic ops on the page flags
> > > are visible.
> > >
> > > For an anon page that isn't in swap cache, we need to make sure all
> > > prior non-atomic ops, especially __SetPageSwapBacked() in
> > > page_add_new_anon_rmap(), are ordered before set_pte_at() to prevent
> > > the following race:
> > >
> > >         CPU 1                           CPU 2
> > >   set_pte_at()                    get_user_pages_fast()
> > >     page_add_new_anon_rmap()        gup_pte_range()
> > >     __SetPageSwapBacked()             SetPageReferenced()
> > >
> > > This demonstrates a non-fatal scenario. Though they haven't been
> > > directly observed, fatal ones can exist, e.g., PG_locked set by a
> > > fast gup caller and then overwritten by __SetPageSwapBacked().
> > >
> > > For an anon page that is already in swap cache or a file page, we
> > > don't need smp_wmb() before set_pte_at() because adding to swap or
> > > file cache serves as a valid write barrier. Using non-atomic ops
> > > thereafter is a bug, obviously.
> > >
> > > smp_wmb() is added following 11 of the 12 total
> > > page_add_new_anon_rmap() call sites, with the only exception being
> > > do_huge_pmd_wp_page_fallback() because of an existing smp_wmb().
> > >
> >
> > I'm thinking this patch makes stuff rather fragile. Should we stick
> > the barrier in set_p*d_at() instead? Or rather, make that store a
> > store-release?
>
> I prefer it this way too, but I suspected the majority would be
> concerned with the performance implications, especially those
> looping set_pte_at()s in mm/huge_memory.c.

We can rename the current set_pte_at() to __set_pte_at() or something and
leave it in places where the barrier is not needed. The new set_pte_at()
will be used in the rest of the places, with the barrier inside.

BTW, have you looked at other levels of the page table hierarchy? Do we
have the same issue for PMD/PUD/... pages?

-- 
 Kirill A. Shutemov
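
To make the rename idea above concrete, here is a rough userspace model of
it. This is only a sketch under stated assumptions, not kernel code: every
model_* name, the two-field structs and the flag values are hypothetical,
and a C11 release fence stands in for smp_wmb(). model_set_pte_at_raw()
plays the role of the renamed __set_pte_at(), and model_set_pte_at() is the
new default with the barrier inside. The second helper replays the race
from the commit message by hand in a single thread, splitting the
non-atomic flag update into its load and store halves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
#define PG_REFERENCED (1UL << 0)
#define PG_SWAPBACKED (1UL << 1)

struct model_page { unsigned long flags; };
struct model_pte  { _Atomic(struct model_page *) page; };

/* Models the renamed __set_pte_at(): plain store, no ordering guarantee. */
static inline void model_set_pte_at_raw(struct model_pte *ptep,
					struct model_page *page)
{
	atomic_store_explicit(&ptep->page, page, memory_order_relaxed);
}

/* Models the new set_pte_at(): the release fence stands in for smp_wmb(). */
static inline void model_set_pte_at(struct model_pte *ptep,
				    struct model_page *page)
{
	atomic_thread_fence(memory_order_release);
	model_set_pte_at_raw(ptep, page);
}

/* Intended order: finish the non-atomic flag setup, then publish. */
static unsigned long model_publish_then_reference(void)
{
	struct model_page page = { 0 };
	struct model_pte pte;

	atomic_init(&pte.page, NULL);
	page.flags |= PG_SWAPBACKED;       /* like __SetPageSwapBacked() */
	model_set_pte_at(&pte, &page);     /* page becomes reachable */

	/* "Remote" side: finds the page through the PTE, marks it referenced. */
	struct model_page *seen =
		atomic_load_explicit(&pte.page, memory_order_acquire);
	assert(seen == &page);
	seen->flags |= PG_REFERENCED;      /* like SetPageReferenced() */
	return page.flags;
}

/* Racy order, interleaved by hand: the page is published too early. */
static unsigned long model_publish_too_early(void)
{
	struct model_page page = { 0 };
	struct model_pte pte;

	atomic_init(&pte.page, NULL);
	model_set_pte_at_raw(&pte, &page); /* published before flags are final */

	unsigned long stale = page.flags;  /* load half of the non-atomic op */
	struct model_page *seen =
		atomic_load_explicit(&pte.page, memory_order_relaxed);
	seen->flags |= PG_REFERENCED;      /* remote update lands in between */
	page.flags = stale | PG_SWAPBACKED; /* store half overwrites it */
	return page.flags;
}
```

In this model the first helper ends with both bits set, while the second
loses PG_REFERENCED. Being single-threaded, it only shows the shape of the
two helpers and the lost update; it says nothing about the cost of the
barrier in the looping callers mentioned above.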