From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Jason Gunthorpe, Andrew Morton, Jan Kara, Michal Hocko, Kirill Tkhai,
    Kirill Shutemov, Hugh Dickins, Peter Xu, Christoph Hellwig,
    Andrea Arcangeli, John Hubbard, Oleg Nesterov, Leon Romanovsky,
    Linus Torvalds, Jann Horn
Subject: [PATCH 0/5] mm: Break COW for pinned pages during fork()
Date: Mon, 21 Sep 2020 17:17:39 -0400
Message-Id: <20200921211744.24758-1-peterx@redhat.com>

I'm finally posting this as a formal patch series because the work keeps
growing. We have also discussed quite a few of the issues already, so it is
now clearer what we need to do, and how.

This series is largely inspired by the previous discussion on the list [1],
starting from Jason's report of the rdma test failure.
Linus proposed the solution, which seems to be a very nice approach to avoid
breaking userspace apps that didn't use MADV_DONTFORK properly before. More
information can be found in that thread too.

I believe the initial plan was to consider merging something like this for
rc7/rc8. However, I'm no longer sure, because the code change in
copy_pte_range() is probably larger than expected, so it carries some risk.
I'll leave this question to the reviewers...

I tested it myself with fork() after vfio pinning a bunch of device pages, and
I verified that the new copy pte logic worked as expected, at least in the most
general path. However, I haven't tested the thp case yet because, afaict, vfio
does not support thp-backed dma pages. Luckily, the pmd/pud thp patch is much
more straightforward than the pte one, so hopefully it can be verified directly
by code review plus some more heavy-weight rdma tests.

  Patch 1:      Introduce mm.has_pinned (as a single patch, as suggested by Jason)
  Patch 2-3:    Some slight rework of the copy_page_range() path as preparation
  Patch 4:      Early cow solution for pte copy for pinned pages
  Patch 5:      Same as above, but for thp (pmd/pud).

The hugetlbfs fix is still missing but, as planned, it is not urgent, so we can
work on it later. Comments are greatly welcome.

Thanks.

Peter Xu (5):
  mm: Introduce mm_struct.has_pinned
  mm/fork: Pass new vma pointer into copy_page_range()
  mm: Rework return value for copy_one_pte()
  mm: Do early cow for pinned pages during fork() for ptes
  mm/thp: Split huge pmds/puds if they're pinned when fork()

 include/linux/mm.h       |   2 +-
 include/linux/mm_types.h |  10 ++
 kernel/fork.c            |   3 +-
 mm/gup.c                 |   6 ++
 mm/huge_memory.c         |  26 +++++
 mm/memory.c              | 226 +++++++++++++++++++++++++++++++++++----
 6 files changed, 248 insertions(+), 25 deletions(-)

-- 
2.26.2
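
For readers who want the idea behind patches 1 and 4 in miniature, here is a
userspace sketch of the fork-time decision. It is only a toy model under stated
assumptions, not the code in this series: struct model_mm, struct model_page,
struct model_pte, the maybe_pinned flag and model_copy_one_pte() are all
hypothetical names invented for illustration, and the real kernel logic in
copy_pte_range() has to deal with locking, allocation and page state that the
model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-ins for kernel structures; illustration only. */
struct model_page {
    unsigned char data[MODEL_PAGE_SIZE];
    int refcount;       /* how many address spaces map this page */
    bool maybe_pinned;  /* assumption: a device may be DMA-ing into it */
};

struct model_mm {
    bool has_pinned;    /* models the mm.has_pinned hint from patch 1 */
};

struct model_pte {
    struct model_page *page;
    bool writable;
};

/*
 * Copy one parent mapping into the child at fork time.
 * Returns 0 on success, -1 if the early copy cannot be allocated.
 */
static int model_copy_one_pte(const struct model_mm *parent_mm,
                              struct model_pte *parent,
                              struct model_pte *child)
{
    assert(parent->page != NULL);

    if (parent_mm->has_pinned && parent->page->maybe_pinned) {
        /*
         * Early COW: the child gets a private copy right now, so the
         * parent keeps the original (pinned) page and its write access.
         */
        struct model_page *copy = malloc(sizeof(*copy));

        if (!copy)
            return -1;
        memcpy(copy->data, parent->page->data, MODEL_PAGE_SIZE);
        copy->refcount = 1;
        copy->maybe_pinned = false;
        child->page = copy;
        child->writable = parent->writable;
        return 0;
    }

    /* Ordinary lazy COW: share the page and write-protect both sides. */
    parent->page->refcount++;
    parent->writable = false;
    child->page = parent->page;
    child->writable = false;
    return 0;
}
```

In this model the parent of a pinned page never loses its writable mapping, so
a later write by the parent cannot move it onto a different page than the one
the device is using. That is the userspace-visible breakage the series is meant
to avoid.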