From: David Hildenbrand
To: Linux Memory Management List
Cc: Minchan Kim, Matthew Wilcox, Rik van Riel, Michal Hocko, Andrea Arcangeli, Peter Xu, Vlastimil Babka, Yang Shi, Balbir Singh
Organization: Red Hat GmbH
Subject: Re: Page zapping and page table reclaim
Message-ID: <53e72516-2e38-f490-4d1f-709291140e2f@redhat.com>
Date: Wed, 24 Mar 2021 10:55:36 +0100

On 11.03.21 19:14, David Hildenbrand wrote:
> Hi folks,
>
> I was wondering, is there any mechanism that reclaims basically empty
> page tables in a running process?
>
> Like: When I MADV_DONTNEED a huge range, there could be plenty of
> basically empty (e.g., all entries invalid) page tables we could
> reclaim. As soon as we zap a complete PMD we could reclaim (depending on
> the architecture) a whole page.
>
> Zapping on the PMD level might make most impact I guess.
>
> For 1 GB, we need 262144 4k pages. If we assume each PTE is 8 bytes, we
> need a total of 2 MB for the lowest level page tables (PTE).
>
> OTOH, we would need 512 PMD entries - a single 4k page. Zapping 1 TB
> would mean we can free up another 4MB - rather a corner case and we can
> live with that.
>
>
> Of course, the same might apply to other cases where we can restore all
> page table content from the VMA again. One example would be after
> MADV_FREE zapped a whole range of entries we marked.
>
> Looks like if we happen to zap a THP, we should already get what we want
> (no page table, nothing to remove)
>
> I haven't immediately stumbled over anything, but could be I am missing
> the obvious.
> I guess what would need some thought is concurrent
> discards/pagefaults - but it feels like being similar to
> collapsing/splitting a THP while there is other system activity.
>
> Maybe there is already something and I am just not aware of it.
>
> Thanks!

Thanks for the feedback so far.

I just did a very simple experiment:

1. Start a VM (QEMU) with 60 GB and populate/preallocate all page tables.
2. Inflate the memory balloon (virtio-balloon) in the VM to 58 GB.
3. Wait until fully inflated.

Before inflating the balloon:
PageTables: 131760 kB

After inflating the balloon:
No real change

Shutting down the VM:
PageTables: 8064 kB

In comparison, starting a 2 GB VM and preallocating/populating all
memory:
PageTables: 12660 kB

So in this case, there is quite some room for improvement (> 100 MiB).

virtio-balloon will discard in 4k granularity, which means that we'll
never get to zap whole THPs (the first discard will break up the THP)
and therefore never remove any page tables.

I'll try identifying other workloads/cases where such an optimization
is applicable and work on asynchronous page table reclaim.

Thanks!

--
Thanks,

David / dhildenb
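
As a back-of-the-envelope check of the sizes discussed in this thread, here is a minimal sketch that estimates how much memory the two lowest page table levels need for a range that is fully populated with 4k pages. It assumes 4 KiB pages and 8-byte entries (so 512 entries per table), as in the example above; the function name `estimate_table_bytes` is made up for illustration and is not a kernel interface.

```python
PAGE_SIZE = 4096                              # bytes per base page (assumed: 4 KiB)
ENTRY_SIZE = 8                                # bytes per page table entry (assumed)
ENTRIES_PER_TABLE = PAGE_SIZE // ENTRY_SIZE   # 512 entries fit in one table page

GIB = 1 << 30
TIB = 1 << 40


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def estimate_table_bytes(mapped_bytes: int) -> dict:
    """Estimate PTE and PMD table memory for a range mapped with 4k pages.

    Hypothetical helper: it ignores higher levels (PUD/PGD), alignment of
    the range, and any architecture-specific details.
    """
    pages = ceil_div(mapped_bytes, PAGE_SIZE)
    pte_tables = ceil_div(pages, ENTRIES_PER_TABLE)       # one PTE table per 512 pages
    pmd_tables = ceil_div(pte_tables, ENTRIES_PER_TABLE)  # one PMD table per 512 PTE tables
    return {
        "pages": pages,
        "pte_bytes": pte_tables * PAGE_SIZE,
        "pmd_bytes": pmd_tables * PAGE_SIZE,
    }


if __name__ == "__main__":
    for label, size in (("1 GiB", GIB), ("60 GiB", 60 * GIB), ("1 TiB", TIB)):
        est = estimate_table_bytes(size)
        print(f"{label}: {est['pages']} pages, "
              f"PTE tables {est['pte_bytes'] // 1024} kB, "
              f"PMD tables {est['pmd_bytes'] // 1024} kB")
```

Under these assumptions, 1 GiB needs 262144 pages, 2 MiB of PTE tables and a single 4k PMD table, and 1 TiB needs 4 MiB of PMD tables. A fully populated 60 GiB range comes out at 120 MiB of PTE tables, which is in the same range as the 131760 kB reported for the 60 GB VM, while the PMD level adds only 240 KiB.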