linux-kernel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Yang Shi <yang.shi@linux.alibaba.com>
Cc: mhocko@kernel.org, ldufour@linux.vnet.ibm.com, vbabka@suse.cz,
	kirill@shutemov.name, akpm@linux-foundation.org,
	dave.hansen@intel.com, oleg@redhat.com,
	srikar@linux.vnet.ibm.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC v10 PATCH 0/3] mm: zap pages with read mmap_sem in munmap for large mapping
Date: Sat, 15 Sep 2018 03:10:42 -0700
Message-ID: <20180915101042.GD31572@bombadil.infradead.org>
In-Reply-To: <1536957299-43536-1-git-send-email-yang.shi@linux.alibaba.com>

On Sat, Sep 15, 2018 at 04:34:56AM +0800, Yang Shi wrote:
> Regression and performance data:
> Did the below regression test with setting thresh to 4K manually in the code:
>   * Full LTP
>   * Trinity (munmap/all vm syscalls)
>   * Stress-ng: mmap/mmapfork/mmapfixed/mmapaddr/mmapmany/vm
>   * mm-tests: kernbench, phpbench, sysbench-mariadb, will-it-scale
>   * vm-scalability
> 
> With the patches, exclusive mmap_sem hold time when munmap a 80GB address
> space on a machine with 32 cores of E5-2680 @ 2.70GHz dropped to us level
> from second.
> 
> munmap_test-15002 [008]   594.380138: funcgraph_entry: |  __vm_munmap {
> munmap_test-15002 [008]   594.380146: funcgraph_entry:      !2485684 us |    unmap_region();
> munmap_test-15002 [008]   596.865836: funcgraph_exit:       !2485692 us |  }
> 
> Here the execution time of unmap_region() is used to evaluate the time of
> holding read mmap_sem, then the remaining time is used with holding
> exclusive lock.

Something I've been wondering about for a while is whether we should "sort"
the readers together, i.e. if the acquirers look like this:

A write
B read
C read
D write
E read
F read
G write

then we should grant the lock to A, BCEF, D, G rather than A, BC, D, EF, G.
A quick way to test this is to do something like this in
__rwsem_down_read_failed_common():

-	if (list_empty(&sem->wait_list))
+	if (list_empty(&sem->wait_list)) {
 		adjustment += RWSEM_WAITING_BIAS;
+		list_add(&waiter.list, &sem->wait_list);
+	} else {
+		struct rwsem_waiter *first = list_first_entry(&sem->wait_list,
+						struct rwsem_waiter, list);
+		if (first->type == RWSEM_WAITING_FOR_READ)
+			list_add(&waiter.list, &sem->wait_list);
+		else
+			list_add_tail(&waiter.list, &sem->wait_list);
+	}
-	list_add_tail(&waiter.list, &sem->wait_list);

It'd be interesting to know if this makes any difference with your tests.

(this isn't perfect, of course; it'll fail to sort readers together if there's
a writer at the head of the queue, e.g.:

A write
B write
C read
D write
E read
F write
G read

but it won't do any worse than we have at the moment).
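
For what it's worth, here's a rough userspace model of the two orderings
above.  It's only a sketch: the names (wait_queue, enqueue_fifo,
enqueue_sorted, grant_order) are made up for illustration and are not the
kernel's, it assumes the first acquirer is a writer that holds the lock while
everyone else queues up, and it ignores optimistic spinning and lock stealing.
Wakeup is modelled as "one writer, or the run of readers at the head of the
queue".

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the kernel's waiter types; not the real definitions. */
enum waiter_type { WAITING_FOR_WRITE, WAITING_FOR_READ };

struct waiter {
	char name;
	enum waiter_type type;
};

#define MAX_WAITERS 25

struct wait_queue {
	struct waiter w[MAX_WAITERS];
	size_t len;
};

/* Current behaviour in this model: every waiter goes to the tail. */
static void enqueue_fifo(struct wait_queue *q, struct waiter nw)
{
	if (q->len < MAX_WAITERS)
		q->w[q->len++] = nw;
}

/*
 * Proposed behaviour: a reader that finds a reader at the head of the
 * queue is inserted at the head; everything else goes to the tail.
 */
static void enqueue_sorted(struct wait_queue *q, struct waiter nw)
{
	if (q->len >= MAX_WAITERS)
		return;
	if (nw.type == WAITING_FOR_READ && q->len > 0 &&
	    q->w[0].type == WAITING_FOR_READ) {
		memmove(&q->w[1], &q->w[0], q->len * sizeof(q->w[0]));
		q->w[0] = nw;
		q->len++;
	} else {
		q->w[q->len++] = nw;
	}
}

/*
 * spec has one character per acquirer ('r' or 'w'), named 'A', 'B', ...
 * The first acquirer is assumed to be a writer that takes the lock
 * uncontended; the rest queue behind it.  The grant order is written to
 * out as space-separated batches.
 */
void grant_order(const char *spec, int sorted, char *out, size_t outlen)
{
	struct wait_queue q = { .len = 0 };
	size_t n = strlen(spec), pos = 0, i;

	if (outlen == 0)
		return;
	if (n > 0 && pos + 1 < outlen)
		out[pos++] = 'A';

	for (i = 1; i < n; i++) {
		struct waiter nw = {
			.name = (char)('A' + i),
			.type = spec[i] == 'r' ? WAITING_FOR_READ
					       : WAITING_FOR_WRITE,
		};
		if (sorted)
			enqueue_sorted(&q, nw);
		else
			enqueue_fifo(&q, nw);
	}

	/* Wake from the head: one writer, or a run of consecutive readers. */
	for (i = 0; i < q.len; i++) {
		int same_batch = i > 0 &&
			q.w[i].type == WAITING_FOR_READ &&
			q.w[i - 1].type == WAITING_FOR_READ;

		if (!same_batch && pos + 1 < outlen)
			out[pos++] = ' ';
		if (pos + 1 < outlen)
			out[pos++] = q.w[i].name;
	}
	out[pos] = '\0';
}
```

In this model, grant_order("wrrwrrw", 1, ...) produces "A FECB D G" where the
FIFO variant produces "A BC D EF G".  For "wwrwrwr" both variants produce
"A B C D E F G", which is the writer-at-the-head case described above.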


