From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Gal Pressman, Matthew Wilcox, Wei Zhang, peterx@redhat.com,
    Mike Kravetz, Mike Rapoport, Christoph Hellwig, Andrew Morton,
    Linus Torvalds, David Gibson, Jason Gunthorpe, Jann Horn,
    Kirill Tkhai, Kirill Shutemov, Miaohe Lin, Andrea Arcangeli, Jan Kara
Subject: [PATCH v5 0/5] mm/hugetlb: Early cow on fork, and a few cleanups
Date: Wed, 17 Feb 2021 18:35:42 -0500
Message-Id: <20210217233547.93892-1-peterx@redhat.com>

v5:
- patch 4: change "int cow" into "bool cow"
- collect r-bs from Jason

v4:
- add r-b from Mike on the last patch; extend the commit message to explain
  why we don't need the wr-protect trick
- fix one warning of unused var in copy_present_page() [Gal]

v3:
- rebase to linux-next/akpm, switch to the new HPAGE helpers [MikeK]
- correct error check for alloc_huge_page(); test it this time to make sure
  fork() fails gracefully when overcommitted [MikeK]
- move page copy out of pgtable lock: this changed quite a bit of the logic in
  the last patch; prealloc is dropped since I found it easier to understand
  without looping at all [MikeK]

v2:
- pass in 1 as the last param of alloc_huge_page() [Mike]
- reduce comment, unify the comment in one place [Linus]
- add r-bs from Mike and Miaohe

---- original cover letter ----

As reported by Gal [1], we are still missing the code to handle early cow for
the hugetlb case, which is true.  Again, it still feels odd to me to fork()
after using a few huge pages, especially if they're privately mapped.  However
I do agree with Gal and Jason that we should still have it, since that will at
least complete the early cow on fork effort, and it will still fix issues where
buffers are not well under control and it is not easy to apply MADV_DONTFORK.

The first two patches (1-2) are some cleanups I noticed when reading into the
hugetlb reserve map code.  I think they're good to have but they're not
necessary for fixing the fork issue.

The remaining patches (3-5) are the real fix.

I tested this with a fork() after some vfio-pci assignment, so I'm pretty sure
the page copy path triggers properly (the page is accounted right after the
fork()), but I didn't do a data check since the card I assigned is some random
nic.  Gal, please feel free to try this if you have a better way to verify the
series.

  https://github.com/xzpeter/linux/tree/fork-cow-pin-huge

Please review, thanks!

[1] https://lore.kernel.org/lkml/27564187-4a08-f187-5a84-3df50009f6ca@amazon.com/

Peter Xu (5):
  hugetlb: Dedup the code to add a new file_region
  hugetlg: Break earlier in add_reservation_in_range() when we can
  mm: Introduce page_needs_cow_for_dma() for deciding whether cow
  mm: Use is_cow_mapping() across tree where proper
  hugetlb: Do early cow when page pinned on src mm

 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c   |   2 +-
 fs/proc/task_mmu.c                         |   2 -
 include/linux/mm.h                         |  21 ++++
 mm/huge_memory.c                           |   8 +-
 mm/hugetlb.c                               | 123 +++++++++++++++------
 mm/internal.h                              |   5 -
 mm/memory.c                                |   8 +-
 8 files changed, 117 insertions(+), 56 deletions(-)

-- 
2.26.2
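
For readers who want the fork-time decision spelled out, here is a rough
userspace model of the check that patches 3-5 revolve around.  It is only a
sketch and not the code from the series: the model_* structs and functions are
hypothetical stand-ins, the two flag values are assumed to match
include/linux/mm.h, and the three conditions (private writable mapping, mm has
ever pinned pages, page may be DMA-pinned) are my reading of what
page_needs_cow_for_dma() is meant to test.  Please check the patches themselves
for the real logic.

```c
/* Userspace model only; NOT kernel code and NOT taken from the series. */
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the vm_flags bits in include/linux/mm.h. */
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/* Hypothetical stand-ins for mm_struct, vm_area_struct and struct page. */
struct model_mm   { bool has_pinned; };        /* mm has pinned pages at some point */
struct model_vma  { unsigned long vm_flags; struct model_mm *mm; };
struct model_page { bool maybe_dma_pinned; };  /* page may be pinned for DMA */

/* A mapping is COW when it is private (not shared) and may become writable. */
static inline bool model_is_cow_mapping(unsigned long flags)
{
    return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Assumed decision at fork(): copy the page for the child right away
 * (early cow) only when all three conditions hold; otherwise the usual
 * share-and-write-protect path is enough.
 */
static inline bool model_page_needs_cow_for_dma(const struct model_vma *vma,
                                                const struct model_page *page)
{
    if (!model_is_cow_mapping(vma->vm_flags))
        return false;
    if (!vma->mm->has_pinned)
        return false;
    return page->maybe_dma_pinned;
}

/* Small self-check that exercises each branch of the model. */
static inline void model_self_check(void)
{
    struct model_mm pinned_mm = { .has_pinned = true };
    struct model_mm clean_mm  = { .has_pinned = false };
    struct model_vma priv   = { .vm_flags = VM_MAYWRITE, .mm = &pinned_mm };
    struct model_vma shared = { .vm_flags = VM_SHARED | VM_MAYWRITE, .mm = &pinned_mm };
    struct model_vma nopin  = { .vm_flags = VM_MAYWRITE, .mm = &clean_mm };
    struct model_page pinned   = { .maybe_dma_pinned = true };
    struct model_page unpinned = { .maybe_dma_pinned = false };

    assert(model_page_needs_cow_for_dma(&priv, &pinned));    /* early copy */
    assert(!model_page_needs_cow_for_dma(&priv, &unpinned)); /* page not pinned */
    assert(!model_page_needs_cow_for_dma(&shared, &pinned)); /* shared mapping */
    assert(!model_page_needs_cow_for_dma(&nopin, &pinned));  /* mm never pinned */
}
```

In this model only the first case ends up on the copy path, which is the
situation the vfio-pci test above is trying to hit: a privately mapped huge
page that is pinned in the parent when fork() runs.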