Linux-Fsdevel Archive on lore.kernel.org
From: Waiman Long <longman@redhat.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD
Date: Wed, 11 Sep 2019 16:44:32 +0100
Message-ID: <19d9ea18-bd20-e02f-c1de-70e7322f5f22@redhat.com>
In-Reply-To: <20190911151451.GH29434@bombadil.infradead.org>

On 9/11/19 4:14 PM, Matthew Wilcox wrote:
> On Wed, Sep 11, 2019 at 04:05:37PM +0100, Waiman Long wrote:
>> When allocating a large amount of static hugepages (~500-1500GB) on a
>> system with large number of CPUs (4, 8 or even 16 sockets), performance
>> degradation (random multi-second delays) was observed when thousands
>> of processes are trying to fault in the data into the huge pages. The
>> likelihood of the delay increases with the number of sockets and hence
>> the CPUs a system has.  This only happens in the initial setup phase
>> and will be gone after all the necessary data are faulted in.
> Can't the application just specify MAP_POPULATE?

Originally, I thought that this happened in the startup phase when the
pages were faulted in. The problem persists after steady state has been
reached, though. Every time a new user process is created, it will
have its own page table. It is the sharing of the huge page shared
memory that is causing the problem. Of course, it depends on how the
application is written.

Anyway, MAP_POPULATE will not be useful in this case.

Thanks,
Longman


Thread overview: 28+ messages
2019-09-11 15:05 [PATCH 0/5] hugetlbfs: Disable PMD sharing for large systems Waiman Long
2019-09-11 15:05 ` [PATCH 1/5] locking/rwsem: Add down_write_timedlock() Waiman Long
2019-09-11 15:05 ` [PATCH 2/5] locking/rwsem: Enable timeout check when spinning on owner Waiman Long
2019-09-11 15:05 ` [PATCH 3/5] locking/osq: Allow early break from OSQ Waiman Long
2019-09-11 15:05 ` [PATCH 4/5] locking/rwsem: Enable timeout check when staying in the OSQ Waiman Long
2019-09-11 15:05 ` [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD Waiman Long
2019-09-11 15:14   ` Matthew Wilcox
2019-09-11 15:44     ` Waiman Long [this message]
2019-09-11 17:03       ` Mike Kravetz
2019-09-11 17:15         ` Waiman Long
2019-09-11 17:22           ` Qian Cai
2019-09-11 17:28           ` Waiman Long
2019-09-11 16:01   ` Qian Cai
2019-09-11 16:34     ` Waiman Long
2019-09-11 19:42       ` Qian Cai
2019-09-11 20:54         ` Waiman Long
2019-09-11 21:57           ` Qian Cai
2019-09-11 19:57   ` Matthew Wilcox
2019-09-11 20:51     ` Waiman Long
2019-09-12  3:26   ` Mike Kravetz
2019-09-12  3:41     ` Matthew Wilcox
2019-09-12  4:40       ` Davidlohr Bueso
2019-09-16 13:53         ` Waiman Long
2019-09-12  9:06     ` Waiman Long
2019-09-12 16:43       ` Mike Kravetz
2019-09-13 18:23         ` Waiman Long
2019-09-13  1:50 ` [PATCH 0/5] hugetlbfs: Disable PMD sharing for large systems Dave Chinner
2019-09-25  8:35   ` Peter Zijlstra
