linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Kan Liang <kan.liang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Jan Kara <jack@suse.cz>,
	linux-mm <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
Date: Tue, 15 Aug 2017 12:41:01 -0700	[thread overview]
Message-ID: <CA+55aFymeC-s6rkGk4==3RjZu6nyyj2R9c5TBzpwTwJd4yjf2A@mail.gmail.com> (raw)
In-Reply-To: <0b7b6132-a374-9636-53f9-c2e1dcec230f@linux.intel.com>

On Tue, Aug 15, 2017 at 12:05 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
> We have a test case but it is a customer workload.  We'll try to get
> a bit more info.

Ok. Being a customer workload is lovely in the sense that it is
actually a real load, not just a microbecnhmark.

But yeah, it makes it harder to describe and show what's going on.

But you do have access to that workload internally at Intel, and can
at least test things out that way, I assume?

> I agree that dynamic sizing makes a lot of sense.  We'll check to
> see if additional size to the hash table helps, assuming that the
> waiters are distributed among different pages for our test case.

One more thing: it turns out that there are two very different kinds
of users of the page waitqueue.

There's the "wait_on_page_bit*()" users - people waiting for a page to
unlock or stop being under writeback etc.

Those *should* generally be limited to just one wait-queue per waiting
thread, I think.

Then there is the "cachefiles" use, which ends up adding a lot of
waitqueues to a lot of paghes to monitor their state.

Honestly, I think that second use a horrible hack. It basically adds a
waitqueue to each page in order to get a callback when it is ready,
and then copies it.

And it does this for things like cachefiles_read_backing_file(), so
you might have a huge list of pages for copying a large file, and it
adds a callback for every single one of those all at once.

The fix for the cachefiles behavior might be very different from the
fix to the "normal" operations. But making the wait queue hash tables
bigger _should_ help both cases.

We might also want to hash based on the actual bit we're waiting for.
Right now we just do a

        wait_queue_head_t *q = page_waitqueue(page);

but I think the actual bit is always explicit (well, the cachefiles
interface doesn't have that, but looking at the callback for that, it
really only cares about PG_locked, so it *should* make the bit it is
waiting for explicit).

So if we have unnecessarily collisions because we have waiters looking
at different bits of the same page, we could just hash in the bit
number that we're waiting for too.

               Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-08-15 19:41 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-15  0:52 [PATCH 1/2] sched/wait: Break up long wake list walk Tim Chen
2017-08-15  0:52 ` [PATCH 2/2] sched/wait: Introduce lock breaker in wake_up_page_bit Tim Chen
2017-08-15  1:48 ` [PATCH 1/2] sched/wait: Break up long wake list walk Linus Torvalds
2017-08-15  2:27   ` Andi Kleen
2017-08-15  2:52     ` Linus Torvalds
2017-08-15  3:15       ` Andi Kleen
2017-08-15  3:28         ` Linus Torvalds
2017-08-15 19:05           ` Tim Chen
2017-08-15 19:41             ` Linus Torvalds [this message]
2017-08-15 19:47               ` Linus Torvalds
2017-08-15 22:47           ` Davidlohr Bueso
2017-08-15 22:56             ` Linus Torvalds
2017-08-15 22:57               ` Linus Torvalds
2017-08-15 23:50                 ` Linus Torvalds
2017-08-16 23:22                   ` Eric W. Biederman
2017-08-17 16:17   ` Liang, Kan
2017-08-17 16:25     ` Linus Torvalds
2017-08-17 20:18       ` Liang, Kan
2017-08-17 20:44         ` Linus Torvalds
2017-08-18 12:23           ` Mel Gorman
2017-08-18 14:20             ` Liang, Kan
2017-08-18 14:46               ` Mel Gorman
2017-08-18 16:36                 ` Tim Chen
2017-08-18 16:45                   ` Andi Kleen
2017-08-18 16:53                 ` Liang, Kan
2017-08-18 17:48                   ` Linus Torvalds
2017-08-18 18:54                     ` Mel Gorman
2017-08-18 19:14                       ` Linus Torvalds
2017-08-18 19:58                         ` Andi Kleen
2017-08-18 20:10                           ` Linus Torvalds
2017-08-21 18:32                         ` Mel Gorman
2017-08-21 18:56                           ` Liang, Kan
2017-08-22 17:23                             ` Liang, Kan
2017-08-22 18:19                               ` Linus Torvalds
2017-08-22 18:25                                 ` Linus Torvalds
2017-08-22 18:56                                 ` Peter Zijlstra
2017-08-22 19:15                                   ` Linus Torvalds
2017-08-22 19:08                                 ` Peter Zijlstra
2017-08-22 19:30                                   ` Linus Torvalds
2017-08-22 19:37                                     ` Andi Kleen
2017-08-22 21:08                                       ` Christopher Lameter
2017-08-22 21:24                                         ` Andi Kleen
2017-08-22 22:52                                           ` Linus Torvalds
2017-08-22 23:19                                             ` Linus Torvalds
2017-08-23 14:51                                             ` Liang, Kan
2017-08-22 19:55                                 ` Liang, Kan
2017-08-22 20:42                                   ` Linus Torvalds
2017-08-22 20:53                                     ` Peter Zijlstra
2017-08-22 20:58                                       ` Linus Torvalds
2017-08-23 14:49                                     ` Liang, Kan
2017-08-23 15:58                                       ` Tim Chen
2017-08-23 18:17                                         ` Linus Torvalds
2017-08-23 20:55                                           ` Liang, Kan
2017-08-23 23:30                                           ` Linus Torvalds
2017-08-24 17:49                                             ` Tim Chen
2017-08-24 18:16                                               ` Linus Torvalds
2017-08-24 20:44                                                 ` Mel Gorman
2017-08-25 16:44                                                   ` Tim Chen
2017-08-23 16:04                                 ` Mel Gorman
2017-08-18 20:05                     ` Andi Kleen
2017-08-18 20:29                       ` Linus Torvalds
2017-08-18 20:29                     ` Liang, Kan
2017-08-18 20:34                       ` Linus Torvalds
2017-08-18 16:55             ` Linus Torvalds
2017-08-18 13:06           ` Liang, Kan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFymeC-s6rkGk4==3RjZu6nyyj2R9c5TBzpwTwJd4yjf2A@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=tim.c.chen@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).