From: Mike Kravetz <mike.kravetz@oracle.com>
To: riel@surriel.com
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	muchun.song@linux.dev, leit@meta.com, willy@infradead.org,
	Ray Fucillo <Ray.Fucillo@intersystems.com>,
	Jacklin Kotikian <Jacklin.Kotikian@intersystems.com>
Subject: Re: [PATCH 4/4] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock
Date: Mon, 16 Oct 2023 17:52:36 -0700	[thread overview]
Message-ID: <20231017005236.GA236970@monkey> (raw)
In-Reply-To: <20231006040020.3677377-5-riel@surriel.com>

On 10/05/23 23:59, riel@surriel.com wrote:
> From: Rik van Riel <riel@surriel.com>
> 
> Replace the custom hugetlbfs VMA locking code with the recently
> introduced invalidate_lock. This greatly simplifies things.
> 
> However, this is a large enough change that it should probably go in
> separately from the other changes.
> 
> Another question is whether this simplification hurts scalability
> for certain workloads.
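
(For context on what the locking change means in the fault path: conceptually it
swaps the per-VMA hugetlb lock for the mapping's invalidate_lock, roughly

	/* before: per-VMA rwsem taken for the faulting VMA */
	hugetlb_vma_lock_read(vma);
	...
	hugetlb_vma_unlock_read(vma);

	/* after: per-inode rwsem shared by all mappings of the file */
	filemap_invalidate_lock_shared(vma->vm_file->f_mapping);
	...
	filemap_invalidate_unlock_shared(vma->vm_file->f_mapping);

This is only a rough sketch of the idea, not the actual diff.)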

Finally got around to running some performance tests on this.

As a reminder, the hugetlb specific vma lock was introduced as a result
of this report:
https://lore.kernel.org/linux-mm/43faf292-245b-5db5-cce9-369d8fb6bd21@infradead.org/

I do not have access to the database or applications to recreate the issues
originally reported.  However, while working on the issue I did use a
simulated workload that showed the regression and the improvement from moving
to a vma specific lock.  Here is the part of the commit log describing the
testing done when the vma lock was introduced.


"The recent regression report [1] notes page fault and fork latency of
shared hugetlb mappings.  To measure this, I created two simple programs:
1) map a shared hugetlb area, write fault all pages, unmap area
   Do this in a continuous loop to measure faults per second
2) map a shared hugetlb area, write fault a few pages, fork and exit
   Do this in a continuous loop to measure forks per second
These programs were run on a 48 CPU VM with 320GB memory.  The shared
mapping size was 250GB.  For comparison, a single instance of the program
was run.  Then, multiple instances were run in parallel to introduce
lock contention.  Changing the locking scheme results in a significant
performance benefit.

test            instances       unmodified      revert          vma
--------------------------------------------------------------------------
faults per sec  1               393043          395680          389932
faults per sec  24               71405           81191           79048
forks per sec   1                 2802            2747            2725
forks per sec   24                 439             536             500
Combined faults 24                1621           68070           53662
Combined forks  24                 358              67             142

Combined test is when running both faulting program and forking program
simultaneously."
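
For reference, a rough sketch of what test program 1 above does (this is an
illustration, not the exact benchmark source; the mapping size and the 2MB
huge page size are assumptions):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define HPAGE_SIZE	(2UL << 20)		/* assumed 2MB huge pages */
#define MAP_SIZE	(128UL * HPAGE_SIZE)	/* assumed size for illustration */

int main(void)
{
	for (;;) {
		char *addr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
				  MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB,
				  -1, 0);

		if (addr == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		/* write fault every huge page in the shared mapping */
		for (unsigned long off = 0; off < MAP_SIZE; off += HPAGE_SIZE)
			addr[off] = 1;
		munmap(addr, MAP_SIZE);
		/* count iterations here to report faults per second */
	}
	return 0;
}

Test program 2 is the same idea except it only write faults a few pages, then
fork()s a child that exits immediately and wait()s for it, counting forks per
second.  Multiple instances are then run in parallel to create contention on
the lock.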

Ray Fucillo (on Cc) verified the performance regression was removed when
the vma lock was introduced.

I have run the same benchmark on this patch.

test            instances       before          after
--------------------------------------------------------------------------
faults per sec  1               385135          386253
faults per sec  24               95922           75665
forks per sec   1                 3392            3207
forks per sec   24                 683             704
Combined faults 24               76004           30407
Combined forks  24                 241             278

The Combined faults number drops by over 50%.  This is not nearly as dramatic
as the change originally seen, but I do expect it to be a noticeable
performance regression.  Ray may be able to help by running real workloads
on real applications and databases.

I suggest we hold off on adding this change until further, more real-world
analysis can be performed.  The simplification of the code is nice, but I
would hate to regress any workloads.
-- 
Mike Kravetz

Thread overview: 8+ messages
2023-10-06  3:59 [PATCH v7 0/4] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-10-06  3:59 ` [PATCH 1/4] hugetlbfs: clear resv_map pointer if mmap fails riel
2023-10-06  3:59 ` [PATCH 2/4] hugetlbfs: extend hugetlb_vma_lock to private VMAs riel
2023-10-06  3:59 ` [PATCH 3/4] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-10-06 17:57   ` Mike Kravetz
2023-10-06  3:59 ` [PATCH 4/4] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-10-17  0:52   ` Mike Kravetz [this message]
2023-10-17 13:47     ` Rik van Riel
