From: Daniel Colascione
Date: Tue, 10 Mar 2020 14:52:55 -0700
References: <20200310184814.GA8447@dhcp22.suse.cz>
Subject: Re: interaction of MADV_PAGEOUT with CoW
 anonymous mappings?
To: Jann Horn
Cc: Michal Hocko, Minchan Kim, Linux-MM, kernel list, Dave Hansen, "Joel Fernandes (Google)"

On Tue, Mar 10, 2020 at 2:40 PM Jann Horn wrote:
>
> On Tue, Mar 10, 2020 at 9:19 PM Daniel Colascione wrote:
> > On Tue, Mar 10, 2020 at 11:48 AM Michal Hocko wrote:
> > > On Tue 10-03-20 19:08:28, Jann Horn wrote:
> > > > From looking at the source code, it looks to me as if using
> > > > MADV_PAGEOUT on a CoW anonymous mapping will page out the page if
> > > > possible, even if other processes still have the same page mapped. Is
> > > > that correct?
> > > >
> > > > If so, that's probably bad in environments where many processes (with
> > > > different privileges) are forked from a single zygote process (like
> > > > Android and Chrome), I think? If you accidentally call it on a CoW
> > > > anonymous mapping with shared pages, you'll degrade the performance of
> > > > other processes. And if an attacker does it intentionally, they could
> > > > use that to aid with exploiting race conditions or weird
> > > > microarchitectural stuff (e.g. the new https://lviattack.eu/lvi.pdf
> > > > talks about "the assumption that attackers can provoke page faults or
> > > > microcode assists for (arbitrary) load operations in the victim
> > > > domain").
> > > >
> > > > Should madvise_cold_or_pageout_pte_range() maybe refuse to operate on
> > > > pages with mapcount>1, or something like that? Or does it already do
> > > > that, and I just missed the check?
> > >
> > > I have brought up side channel attacks earlier [1] but only in the
> > > context of shared page cache pages. I didn't really consider shared
> > > anonymous pages to be a real problem. I was under impression that CoW
> > > pages shouldn't be a real problem because any security sensible
> > > applications shouldn't allow untrusted code to be forked and CoW
> > > anything really important. I believe we have made this assumption
> > > in other places - IIRC on gup with FOLL_FORCE but I admit I have
> > > very happily forgot most details.
> >
> > I'm more worried about the performance implications. Consider
> > libc.so's data section: that's a COW mapping, and we COW it during
> > zygote initialization as we load and relocate libc.so. Child processes
> > shouldn't be dirtying and re-COWing those relocated pages. If I
> > understand Jann's message correctly, MADV_PAGEOUT would force the
> > pages corresponding to the libc.so data segment out to zram just
> > because we MADV_PAGEOUT-ed a single process that happened to use libc.
> > We should leave those pages in memory, IMHO.
>
> Actually, the libc.so data section is a file mapping, so I think
> can_do_pageout() would decide whether the caller is allowed to force
> pageout based on whether the caller is the owner of (or capable over)
> libc (in other words, root, basically). But I think the bss section,
> as well as heap memory, could have pageout forced by anyone.

lmkd would have that capability though, right? But the point stands
regardless. It sounds like both security and performance suggest a
behavior change here. Thanks for bringing it up!
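
As a rough sketch of the check Jann suggests in the quoted text (refuse to
operate on pages with mapcount>1), here is a toy userspace model in C. The
names toy_page, toy_may_pageout and toy_pageout_range are hypothetical, and
this is not the kernel's code or struct page; it only models the idea that a
page still mapped by another process is left in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-page state; NOT the kernel's struct page. */
struct toy_page {
    int mapcount;   /* how many page tables currently map this page */
};

/* Assumed policy: only allow pageout when the caller is the sole mapper. */
static bool toy_may_pageout(const struct toy_page *page)
{
    return page->mapcount == 1;
}

/* Walk a range and count the pages the policy would allow to be paged out. */
static size_t toy_pageout_range(const struct toy_page *pages, size_t n)
{
    size_t reclaimed = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (toy_may_pageout(&pages[i]))
            reclaimed++;   /* sole mapper: pageout would proceed */
        /* otherwise the page is shared (e.g. CoW from a zygote): skip it */
    }
    return reclaimed;
}
```

With a range whose mapcounts are 1, 3, 1 and 2, this model would page out
only the two pages with mapcount 1 and leave the shared ones alone.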