From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D074C433EF for ; Fri, 15 Jul 2022 16:46:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F2D17940202; Fri, 15 Jul 2022 12:46:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EDCEF9401FB; Fri, 15 Jul 2022 12:46:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DAEF4940202; Fri, 15 Jul 2022 12:46:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id CAC7F9401FB for ; Fri, 15 Jul 2022 12:46:15 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 9C51334863 for ; Fri, 15 Jul 2022 16:46:15 +0000 (UTC) X-FDA: 79689911910.09.104B5FE Received: from mail-il1-f179.google.com (mail-il1-f179.google.com [209.85.166.179]) by imf20.hostedemail.com (Postfix) with ESMTP id BEF6D1C00C2 for ; Fri, 15 Jul 2022 16:46:14 +0000 (UTC) Received: by mail-il1-f179.google.com with SMTP id w16so2333260ilh.0 for ; Fri, 15 Jul 2022 09:46:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=gal4RztNlbVNjn4UMNIaShjSds2B8jcZoxR2zDJB8aU=; b=g1zCnmSrM3xFDJ0jz0sR9lFC3UHnjwun8lH0tpI7jVbFpOJCkEy38WxdKjUGNIRLOG 8rophyE8tyoImBzCPSfWl6CR1OlPGCULZ2/RNHuX9DaQHW0XM+coVzKwrdKxO+jj31V0 3EoA2byOpdzs300aJS/wcOGBEzaPuzw6uKtkp8fw0edJKhnLNURfQIkhhHQ/m4N/BBdo SiNTo5muJ0Tj0PFqLmR7EuBTYPXvzsGSZBYTlG8ft64nOBHWnVkYbGCiJy+8x9CHu8kv XQzYXeN2nKMOlAo9qinO4LwB8LkNT/F0klBsLivanrRb9GMtqoEL+YfqzsNOg8YL4VQK CJcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=gal4RztNlbVNjn4UMNIaShjSds2B8jcZoxR2zDJB8aU=; b=FP047Krc4bEZR6HUXMegN2UlAsmYPGDTrtOroiNjXdZr2axj3gF/O3DWEsTx3MQ2zd cNSJDO3XEybWCNGPTRn2lZWUJtqDY1F1n+BKdMs1Cu+hBXSxS3CV3FzGiIMt5UqqOXdS M2cOTjj3WwclCNcGKavB5yDwK3zbVywEiZQsovxzNwlNuCW3zGL26NBsWWbGBtLA1Z+0 1LOenMw5hkcPosn1ETFzjlaApKQsXDk9dR5Bk9W6OimGgdrIkFkmkRdNncVFnh/T5X90 CZEM+PTcetTPbk/olKT33G7RVzLtWxB4zoxksalat/C9qv54t/ibLVbZRk2iRxfYImdm nQew== X-Gm-Message-State: AJIora/gkXEMfnb98kJmFUp1pvwIZiKJ96Jf0qReIYeolY8e3xQgEmks 88KAbG68YGFRqsm1X/lR7TuKSyNCvXeBTX0TjRV0Fg== X-Google-Smtp-Source: AGRyM1sO+gn0C6vsJEtwVVDsmQQfbgcG2KW9ATdqNT7ZjZZySMilR9adfiygZz1PQ9cn2ZqaW1mUCi9xmbAtr+iw/5k= X-Received: by 2002:a05:6e02:d0f:b0:2dc:11c8:9b9d with SMTP id g15-20020a056e020d0f00b002dc11c89b9dmr7226427ilj.192.1657903573876; Fri, 15 Jul 2022 09:46:13 -0700 (PDT) MIME-Version: 1.0 References: <20220712130542.18836-1-linmiaohe@huawei.com> <20220713102357.8328614813db01b569650ffd@linux-foundation.org> <402ae708-4c86-8feb-75c4-9339e1deac3b@huawei.com> In-Reply-To: From: Axel Rasmussen Date: Fri, 15 Jul 2022 09:45:37 -0700 Message-ID: Subject: Re: [PATCH] mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte To: Peter Xu Cc: Miaohe Lin , Andrew Morton , Mike Kravetz , Muchun Song , Linux MM , LKML Content-Type: multipart/alternative; boundary="00000000000098194705e3dac188" ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1657903574; a=rsa-sha256; cv=none; b=oMdejD0d8speSE5PL/g8u1YwJc9VZjoY9XjWwwrTrYi23rgXzy5SFQYUUxtnHL8u9f+DJR arlvByK3IAkRESVlJsX9N4CYx3FbiilZ/9/wXp6FrlGRwcrAzt2jBm9MK2NskmtyIcE5ER AuJ9eiSJtnFdh9IQtcmh0A0K5VnbokQ= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=g1zCnmSr; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of axelrasmussen@google.com designates 209.85.166.179 as permitted sender) smtp.mailfrom=axelrasmussen@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1657903574; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=gal4RztNlbVNjn4UMNIaShjSds2B8jcZoxR2zDJB8aU=; b=O1hkegRaNOsnsyI+d6SGXui2f98RVSy88X9obQ1960aFzjBFzu/u+wslNVBweVhO8+MuSs WUcnqbhKcEiVQCyM0DLn686AiiNxPDMZNqtgA5RxPXre78hBGDQqNyvZz5YxvADesgvzhy k7ZkYxa6uA3EI1X8KDSw0V0ANdX/tsY= X-Stat-Signature: eobkcicc61zcbcjyf3h81uca4qco9pi5 X-Rspamd-Queue-Id: BEF6D1C00C2 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=g1zCnmSr; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of axelrasmussen@google.com designates 209.85.166.179 as permitted sender) smtp.mailfrom=axelrasmussen@google.com X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1657903574-57759 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: --00000000000098194705e3dac188 Content-Type: text/plain; charset="UTF-8" On Fri, Jul 15, 2022 at 5:35 AM Peter Xu wrote: > On Fri, Jul 15, 2022 at 11:56:40AM +0800, Miaohe Lin wrote: > > On 2022/7/14 23:52, Peter Xu wrote: > > > On Thu, Jul 14, 2022 at 05:59:53PM +0800, Miaohe Lin wrote: > > >> On 2022/7/14 1:23, Andrew Morton wrote: > > >>> On Tue, 12 Jul 2022 21:05:42 +0800 Miaohe Lin > wrote: > > >>> > > >>>> In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the > page > > >>>> cache are installed in the ptes. But hugepage_add_new_anon_rmap is > called > > >>>> for them mistakenly because they're not vm_shared. This will > corrupt the > > >>>> page->mapping used by page cache code. > > >>> > > >>> Well that sounds bad. And theories on why this has gone unnoticed > for > > >>> over a year? I assume this doesn't have coverage in our selftests? > > >> > > >> As discussed in another thread, when minor fault handling is > proposed, only > > >> VM_SHARED vma is expected to be supported. And the test case is also > missing. > > > > > > Yes, after this patch applied it'll be great to have the test case > covering > > > private mappings too. > > > > > > It's just that it'll be a bit more than setting test_uffdio_minor=1 for > > > "hugetlb" test. In hugetlb_allocate_area() we'll need to setup the > alias > > > too for !shared case, it'll be a bit challenging since currently we're > > > using anonymous hugetlb mappings for private tests, and I'm not sure > > > whether we'll need the hugetlb path back just like what we have with > > > "hugetlb_shared" tests. > > > > I'm afraid not. When minor fault handling is proposed, only VM_SHARED > vma is > > expected to be supported. It seems it's hard to image how one might > benefit > > from using it with a private mapping. But I'm not sure as I'm still a > layman > > in userfaultfd now. Any further suggestions? > > IIUC so far we all think it's not required to limit it to shared mappings > only? The effort is mostly the same. > > My suggestion is above - we could enable the kselftest for it, but I don't > strongly ask for that too because I don't know any real use of it, it'll > still be good to have it though for completeness. It's just that we may > need to change some code back in 9ae8f2b849f79 on using fd-based memory, or > I don't know how to create the alias mapping properly. > I agree we should either: - Update the UFFD selftest to exercise this case - Or, don't allow it, update vma_can_userfault() to also require VM_SHARED for VM_UFFD_MINOR registration. The first one is unfortunately not completely straightforward as Peter described. I would say it's probably not worth holding up this fix while we wait for it to happen? I don't really have a strong preference between the two. The second option is what I originally proposed in the first version of the minor fault series, so going back to that isn't a problem at least from my perspective. If in the future we find a real use case for this, we could always easily re-enable it and add selftests for it at that point. > > Thanks, > > -- > Peter Xu > > --00000000000098194705e3dac188 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Fri, Jul 15, 2022 at 5:35 AM Peter= Xu <peterx@redhat.com> wrot= e:
On Fri, Jul 1= 5, 2022 at 11:56:40AM +0800, Miaohe Lin wrote:
> On 2022/7/14 23:52, Peter Xu wrote:
> > On Thu, Jul 14, 2022 at 05:59:53PM +0800, Miaohe Lin wrote:
> >> On 2022/7/14 1:23, Andrew Morton wrote:
> >>> On Tue, 12 Jul 2022 21:05:42 +0800 Miaohe Lin <linmiaohe@huawei.com= > wrote:
> >>>
> >>>> In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, = pages in the page
> >>>> cache are installed in the ptes. But hugepage_add_new= _anon_rmap is called
> >>>> for them mistakenly because they're not vm_shared= . This will corrupt the
> >>>> page->mapping used by page cache code.
> >>>
> >>> Well that sounds bad.=C2=A0 And theories on why this has = gone unnoticed for
> >>> over a year?=C2=A0 I assume this doesn't have coverag= e in our selftests?
> >>
> >> As discussed in another thread, when minor fault handling is = proposed, only
> >> VM_SHARED vma is expected to be supported. And the test case = is also missing.
> >
> > Yes, after this patch applied it'll be great to have the test= case covering
> > private mappings too.
> >
> > It's just that it'll be a bit more than setting test_uffd= io_minor=3D1 for
> > "hugetlb" test.=C2=A0 In hugetlb_allocate_area() we'= ;ll need to setup the alias
> > too for !shared case, it'll be a bit challenging since curren= tly we're
> > using anonymous hugetlb mappings for private tests, and I'm n= ot sure
> > whether we'll need the hugetlb path back just like what we ha= ve with
> > "hugetlb_shared" tests.
>
> I'm afraid not. When minor fault handling is proposed, only VM_SHA= RED vma is
> expected to be supported. It seems it's hard to image how one migh= t benefit
> from using it with a private mapping. But I'm not sure as I'm = still a layman
> in userfaultfd now. Any further suggestions?

IIUC so far we all think it's not required to limit it to shared mappin= gs
only?=C2=A0 The effort is mostly the same.

My suggestion is above - we could enable the kselftest for it, but I don= 9;t
strongly ask for that too because I don't know any real use of it, it&#= 39;ll
still be good to have it though for completeness.=C2=A0 It's just that = we may
need to change some code back in 9ae8f2b849f79 on using fd-based memory, or=
I don't know how to create the alias mapping properly.
=

I agree we should either:
- Update the UFFD s= elftest=C2=A0to exercise this case
- Or, don't allow it, upda= te vma_can_userfault() to also require VM_SHARED for VM_UFFD_MINOR registra= tion.

The first one is unfortunately not completel= y straightforward as Peter described. I would say it's probably not wor= th holding up this fix while we wait for it to happen?

=
I don't really have a strong preference between the two. The secon= d option is what I originally proposed in the first version of the minor fa= ult series, so going back to that isn't a problem at least from my pers= pective. If in the future we find a real use case for this, we could always= easily re-enable it and add selftests for it at that point.
=C2= =A0

Thanks,

--
Peter Xu

--00000000000098194705e3dac188--