Date: Thu, 12 Mar 2020 19:00:18 -0700
From: Minchan Kim
To: Dave Hansen
Cc: Michal Hocko, Jann Horn, Linux-MM, kernel list, Daniel Colascione,
    "Joel Fernandes (Google)", Andrew Morton
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Message-ID: <20200313020018.GC68817@google.com>
References: <20200312082248.GS23944@dhcp22.suse.cz> <20200312201602.GA68817@google.com>

On Thu, Mar 12, 2020 at 02:41:07PM -0700, Dave Hansen wrote:
> One other fun thing. I have a "victim" thread sitting in a loop doing:
>
> 	sleep(1)
> 	memcpy(&garbage, buffer, sz);
>
> The "attacker" is doing
>
> 	madvise(buffer, sz, MADV_PAGEOUT);
>
> in a loop. That, oddly enough doesn't cause the victim to page fault.
> But, if I do:
>
> 	memcpy(&garbage, buffer, sz);
> 	madvise(buffer, sz, MADV_PAGEOUT);
>
> It *does* cause the memory to get paged out. The MADV_PAGEOUT code
> actually has a !pte_present() check. It will punt on a PTE if it sees
> it. In other words, if a page is in the swap cache but not mapped by a
> pte_present() PTE, MADV_PAGEOUT won't touch it.
>
> Shouldn't MADV_PAGEOUT be able to find and reclaim those pages? Patch
> attached.
>
>
> ---
>
>  b/mm/madvise.c | 38 +++++++++++++++++++++++++++++++-------
>  1 file changed, 31 insertions(+), 7 deletions(-)
>
> diff -puN mm/madvise.c~madv-pageout-find-swap-cache mm/madvise.c
> --- a/mm/madvise.c~madv-pageout-find-swap-cache 2020-03-12 14:24:45.178775035 -0700
> +++ b/mm/madvise.c 2020-03-12 14:35:49.706773378 -0700
> @@ -248,6 +248,36 @@ static void force_shm_swapin_readahead(s
>  #endif /* CONFIG_SWAP */
>
>  /*
> + * Given a PTE, find the corresponding 'struct page'.
> + * Also handles non-present swap PTEs.
> + */
> +struct page *pte_to_reclaim_page(struct vm_area_struct *vma,
> +				 unsigned long addr, pte_t ptent)
> +{
> +	swp_entry_t entry;
> +
> +	/* Totally empty PTE: */
> +	if (pte_none(ptent))
> +		return NULL;
> +
> +	/* A normal, present page is mapped: */
> +	if (pte_present(ptent))
> +		return vm_normal_page(vma, addr, ptent);
> +

Please check is_swap_pte first.

> +	entry = pte_to_swp_entry(vmf->orig_pte);
> +	/* Is it one of the "swap PTEs" that's not really swap? */
> +	if (non_swap_entry(entry))
> +		return false;
> +
> +	/*
> +	 * The PTE was a true swap entry. The page may be in the
> +	 * swap cache. If so, find it and return it so it may be
> +	 * reclaimed.
> +	 */
> +	return lookup_swap_cache(entry, vma, addr);

If we go with handling only exclusively owned pages for anon, I think
we should apply the same rule to the swap cache, too.

Do you mind posting it as a formal patch?

Thanks for the explanation of the vulnerability and for the patch, Dave!

> +}
> +
> +/*
>   * Schedule all required I/O operations. Do not wait for completion.
>   */
>  static long madvise_willneed(struct vm_area_struct *vma,
> @@ -389,13 +419,7 @@ regular_page:
>  	for (; addr < end; pte++, addr += PAGE_SIZE) {
>  		ptent = *pte;
>
> -		if (pte_none(ptent))
> -			continue;
> -
> -		if (!pte_present(ptent))
> -			continue;
> -
> -		page = vm_normal_page(vma, addr, ptent);
> +		page = pte_to_reclaim_page(vma, addr, ptent);
>  		if (!page)
>  			continue;
>
> _
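
To make the ordering I am asking for concrete, here is a rough userspace
toy model, not kernel code. Every name in it (toy_pte, toy_page,
toy_is_swap_pte, toy_pte_to_reclaim_page, the "sharers" counter) is a
hypothetical stand-in I made up for illustration. It assumes the swap-type
check happens before the entry is decoded, and that the "exclusively owned
only" rule, if we adopt it, covers both present pages and swap cache pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum toy_pte_kind {
	TOY_PTE_NONE,     /* totally empty PTE */
	TOY_PTE_PRESENT,  /* present PTE mapping a page */
	TOY_PTE_SWAP,     /* non-present PTE holding a true swap entry */
	TOY_PTE_NON_SWAP, /* non-present PTE that is not really swap */
};

struct toy_page {
	int sharers;      /* assumed count of owners of this page */
};

struct toy_pte {
	enum toy_pte_kind kind;
	struct toy_page *page; /* mapped or swap cache page; NULL on a miss */
};

/* Models the "is this a non-present, swap-style PTE?" check. */
static int toy_is_swap_pte(const struct toy_pte *pte)
{
	return pte->kind == TOY_PTE_SWAP || pte->kind == TOY_PTE_NON_SWAP;
}

/* Assumed policy: only pages with a single owner may be reclaimed. */
static struct toy_page *toy_exclusive_or_null(struct toy_page *page)
{
	if (!page || page->sharers != 1)
		return NULL;
	return page;
}

struct toy_page *toy_pte_to_reclaim_page(const struct toy_pte *pte)
{
	/* Totally empty PTE: nothing to reclaim. */
	if (pte->kind == TOY_PTE_NONE)
		return NULL;

	/* Present PTE: return the mapped page if exclusively owned. */
	if (pte->kind == TOY_PTE_PRESENT)
		return toy_exclusive_or_null(pte->page);

	/* Check the swap-style PTE first, before decoding the entry. */
	if (!toy_is_swap_pte(pte))
		return NULL;

	/* Swap-style encoding that is not real swap: skip it. */
	if (pte->kind == TOY_PTE_NON_SWAP)
		return NULL;

	/* True swap entry: the same exclusivity rule as for present pages. */
	return toy_exclusive_or_null(pte->page);
}

/* Small self-check exercising the swap cache cases. */
void toy_selftest(void)
{
	struct toy_page excl = { 1 }, shared = { 2 };
	struct toy_pte hit = { TOY_PTE_SWAP, &excl };
	struct toy_pte cow = { TOY_PTE_SWAP, &shared };

	assert(toy_pte_to_reclaim_page(&hit) == &excl);
	assert(toy_pte_to_reclaim_page(&cow) == NULL);
}
```

The point of the model is only the order of the checks and that the
shared (CoW) case is rejected on both paths; how exclusivity would really
be determined for a swap cache page is left open here.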