From: Matthew Wilcox <willy@infradead.org>
To: Yang Shi <yang.shi@linux.alibaba.com>
Cc: mhocko@kernel.org, ldufour@linux.vnet.ibm.com, vbabka@suse.cz,
	kirill@shutemov.name, akpm@linux-foundation.org,
	dave.hansen@intel.com, oleg@redhat.com,
	srikar@linux.vnet.ibm.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC v10 PATCH 0/3] mm: zap pages with read mmap_sem in munmap for large mapping
Date: Sat, 15 Sep 2018 03:10:42 -0700
Message-ID: <20180915101042.GD31572@bombadil.infradead.org>
In-Reply-To: <1536957299-43536-1-git-send-email-yang.shi@linux.alibaba.com>

On Sat, Sep 15, 2018 at 04:34:56AM +0800, Yang Shi wrote:
> Regression and performance data:
> Ran the regression tests below with thresh manually set to 4K in the code:
>   * Full LTP
>   * Trinity (munmap/all vm syscalls)
>   * Stress-ng: mmap/mmapfork/mmapfixed/mmapaddr/mmapmany/vm
>   * mm-tests: kernbench, phpbench, sysbench-mariadb, will-it-scale
>   * vm-scalability
> 
> With the patches, the exclusive mmap_sem hold time when munmapping an 80GB
> address space on a machine with 32 cores of E5-2680 @ 2.70GHz dropped from
> seconds to microseconds.
> 
> munmap_test-15002 [008]   594.380138: funcgraph_entry: |  __vm_munmap {
> munmap_test-15002 [008]   594.380146: funcgraph_entry:      !2485684 us |    unmap_region();
> munmap_test-15002 [008]   596.865836: funcgraph_exit:       !2485692 us |  }
> 
> Here the execution time of unmap_region() approximates the time spent
> holding mmap_sem for read; the remaining time is spent holding the
> exclusive lock.

Something I've been wondering about for a while is whether we should "sort"
the readers together, i.e. if the acquirers look like this:

A write
B read
C read
D write
E read
F read
G write

then we should grant the lock to A, BCEF, D, G rather than A, BC, D, EF, G.
A quick way to test this is to do something like this in
__rwsem_down_read_failed_common:

-	if (list_empty(&sem->wait_list))
+	if (list_empty(&sem->wait_list)) {
 		adjustment += RWSEM_WAITING_BIAS;
+		list_add(&waiter.list, &sem->wait_list);
+	} else {
+		struct rwsem_waiter *first = list_first_entry(&sem->wait_list,
+						struct rwsem_waiter, list);
+		if (first->type == RWSEM_WAITING_FOR_READ)
+			list_add(&waiter.list, &sem->wait_list);
+		else
+			list_add_tail(&waiter.list, &sem->wait_list);
+	}
-	list_add_tail(&waiter.list, &sem->wait_list);

It'd be interesting to know if this makes any difference with your tests.

(this isn't perfect, of course; it'll fail to sort readers together if there's
a writer at the head of the queue, e.g.:

A write
B write
C read
D write
E read
F write
G read

but it won't do any worse than we have at the moment).
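
To make the two orderings concrete, here is a rough user-space model of the
queueing rule. It is only a sketch, not kernel code: the names (struct
waiter, enqueue, grant_order) are made up for illustration, and it assumes
that the first acquirer is a writer that gets the lock immediately, that
everyone else queues while it holds the lock, and that a wakeup grants the
lock either to the single writer at the head or to all consecutive readers
at the head.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WAITERS 16

/* Hypothetical stand-in for struct rwsem_waiter: a name and 'R' or 'W'. */
struct waiter { char name; char type; };
struct queue  { struct waiter w[MAX_WAITERS]; int len; };

static void add_head(struct queue *q, struct waiter x)   /* like list_add() */
{
	assert(q->len < MAX_WAITERS);
	memmove(&q->w[1], &q->w[0], q->len * sizeof(x));
	q->w[0] = x;
	q->len++;
}

static void add_tail(struct queue *q, struct waiter x)   /* like list_add_tail() */
{
	assert(q->len < MAX_WAITERS);
	q->w[q->len++] = x;
}

/*
 * With sorted != 0, a reader joins the head of the queue when the first
 * waiter is also a reader; everything else goes to the tail. With
 * sorted == 0, every waiter goes to the tail (plain FIFO).
 */
static void enqueue(struct queue *q, struct waiter x, int sorted)
{
	if (sorted && x.type == 'R' && q->len > 0 && q->w[0].type == 'R')
		add_head(q, x);
	else
		add_tail(q, x);
}

/*
 * types[i] is 'R' or 'W' for acquirer 'A' + i. Writes the grant order to
 * out, one space-separated group per grant. out needs 2 * strlen(types)
 * bytes. Assumes types[0] is 'W' and that acquirer holds the lock while
 * all the others arrive.
 */
static void grant_order(const char *types, int sorted, char *out)
{
	struct queue q = { .len = 0 };
	int i, n = (int)strlen(types);
	char *p = out;

	*p++ = 'A';
	for (i = 1; i < n; i++) {
		struct waiter x = { (char)('A' + i), types[i] };
		enqueue(&q, x, sorted);
	}

	for (i = 0; i < q.len; ) {
		*p++ = ' ';
		if (q.w[i].type == 'W') {
			*p++ = q.w[i++].name;          /* one writer alone */
		} else {
			while (i < q.len && q.w[i].type == 'R')
				*p++ = q.w[i++].name;  /* all readers at the head */
		}
	}
	*p = '\0';
}
```

Under those assumptions, grant_order("WRRWRRW", 0, buf) gives "A BC D EF G"
and grant_order("WRRWRRW", 1, buf) gives "A FECB D G" (the same reader set
as BCEF; head insertion reverses the arrival order inside the group, which
doesn't matter for readers running concurrently). For "WWRWRWR" both
settings give "A B C D E F G", because B sits at the head of the queue as a
writer the whole time.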


