From: Shakeel Butt
Date: Mon, 22 Nov 2021 17:20:13 -0800
Subject: Re: [PATCH] mm: split thp synchronously on MADV_DONTNEED
To: David Hildenbrand
Cc: "Kirill A . Shutemov", Yang Shi, Zi Yan, Matthew Wilcox, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20211120201230.920082-1-shakeelb@google.com> <25b36a5c-5bbd-5423-0c67-05cd6c1432a7@redhat.com> <1b30d06d-f9c0-1737-13e6-2d1a7d7b8507@redhat.com>
In-Reply-To: <1b30d06d-f9c0-1737-13e6-2d1a7d7b8507@redhat.com>

On Mon, Nov 22, 2021 at 10:59 AM David Hildenbrand wrote:
> [...]
>
> Thanks for the details, that makes sense to me. It's essentially like
> another kernel buffer charged to the process, only reclaimed on memory
> reclaim.
>
> (can we add that to the patch description?)
>

Sure.

[...]
> >
> > I did a simple benchmark of madvise(MADV_DONTNEED) on 10000 THPs on
> > x86 for both settings you suggested. I don't see any statistically
> > significant difference with and without the patch. Let me know if you
> > want me to try something else.
>
> Awesome, thanks for benchmarking. I did not check, but I assume on
> re-access, we won't actually re-use pages from the underlying, partially
> unmapped, THP, correct?

Correct.

> So after MADV_DONTNEED, the zapped sub-pages are
> essentially lost until reclaimed by splitting the THP?

Yes.

> If they could get
> reused, there would be value in the deferred split when partially
> unmapping a THP.
>
>
> I do wonder which purpose the deferred split serves nowadays at all.
> Fortunately, there is documentation: Documentation/vm/transhuge.rst:
>
> "
> Unmapping part of THP (with munmap() or other way) is not going to free
> memory immediately.
> Instead, we detect that a subpage of THP is not in
> use in page_remove_rmap() and queue the THP for splitting if memory
> pressure comes. Splitting will free up unused subpages.
>
> Splitting the page right away is not an option due to locking context in
> the place where we can detect partial unmap. It also might be
> counterproductive since in many cases partial unmap happens during
> exit(2) if a THP crosses a VMA boundary.
>
> The function deferred_split_huge_page() is used to queue a page for
> splitting. The splitting itself will happen when we get memory pressure
> via shrinker interface.
> "
>
> I do wonder which these locking contexts are exactly, and if we could
> also do the same thing on ordinary munmap -- because I assume it can be
> similarly problematic for some applications.

This is a good question regarding munmap. One main difference is that
munmap takes mmap_lock in write mode, and performance-critical
applications usually avoid such operations.

> The "exit()" case might
> indeed be interesting, but I really do wonder if this is even observable
> in actual number: I'm not so sure about the "many cases" but I might be
> wrong, of course.

I am not worried about the exit(). The whole THP will get freed and be
removed from the deferred list as well. Note that the deferred list does
not hold a reference to the THP and has a hook in the THP destructor.

thanks,
Shakeel
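
For readers of the archive, the partial-zap pattern discussed in this thread
can be reproduced from userspace. What follows is a minimal sketch, not the
benchmark mentioned above and not code from the patch. The function name
zap_half_of_each_thp() and the 2 MiB THP_SIZE are assumptions made for
illustration (2 MiB is the usual PMD-sized THP on x86-64). The sketch does
not check whether the region is really backed by THPs; that depends on the
system's THP settings, and MADV_HUGEPAGE is only a hint.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: PMD-sized THP of 2 MiB, as on x86-64. */
#define THP_SIZE (2UL << 20)

/*
 * Hypothetical helper: map nr_thp THP-sized, THP-aligned chunks, touch them,
 * then zap the second half of each chunk with MADV_DONTNEED.
 * Returns 0 if the kept halves still hold their data and the zapped halves
 * read back as zero, -1 otherwise.
 */
int zap_half_of_each_thp(size_t nr_thp)
{
    size_t len = nr_thp * THP_SIZE;
    /* Over-allocate so a THP_SIZE-aligned window fits inside the mapping. */
    size_t map_len = len + THP_SIZE;
    int ret = -1;

    char *raw = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    char *buf = (char *)(((uintptr_t)raw + THP_SIZE - 1) &
                         ~((uintptr_t)THP_SIZE - 1));

#ifdef MADV_HUGEPAGE
    /* Hint only: may fail if THP is not configured, which is fine here. */
    (void)madvise(buf, len, MADV_HUGEPAGE);
#endif

    /* Fault in every page so there is something to zap. */
    memset(buf, 0xAB, len);

    /* Partially zap each chunk: drop the second half, keep the first. */
    for (size_t i = 0; i < nr_thp; i++) {
        char *chunk = buf + i * THP_SIZE;
        if (madvise(chunk + THP_SIZE / 2, THP_SIZE / 2, MADV_DONTNEED) != 0)
            goto out;
    }

    /* Re-access: kept half is intact, zapped half is zero-filled again. */
    for (size_t i = 0; i < nr_thp; i++) {
        char *chunk = buf + i * THP_SIZE;
        if ((unsigned char)chunk[0] != 0xAB)
            goto out;
        if (chunk[THP_SIZE / 2] != 0)
            goto out;
    }
    ret = 0;
out:
    munmap(raw, map_len);
    return ret;
}
```

Calling zap_half_of_each_thp(10000) would mirror the chunk count used in the
benchmark above, at the cost of roughly 20 GB of address space; a small
count such as 4 is enough to see the behaviour. The zero-filled read after
MADV_DONTNEED is the userspace-visible side of the "Correct." answer above:
on re-access the kernel hands out fresh pages and does not reuse the zapped
sub-pages of the original THP.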