linux-kernel.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, LKML <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Matthew Wilcox <willy@infradead.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>, Ingo Molnar <mingo@kernel.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>
Subject: Re: [patch V4 4/8] sched: Make migrate_disable/enable() independent of RT
Date: Mon, 23 Nov 2020 22:15:37 +0100
Message-ID: <87mtz7n5ae.fsf@nanos.tec.linutronix.de>
In-Reply-To: <CALCETrWHzHXLKuD4JWxDyBULAuFivP55csFp=6feireZhianVw@mail.gmail.com>

On Sun, Nov 22 2020 at 15:16, Andy Lutomirski wrote:
> On Fri, Nov 20, 2020 at 1:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
>> Anyway, clearly I'm the only one that cares, so I'll just crawl back
>> under my rock...
>
> I'll poke my head out of the rock for a moment, though...
>
> Several years ago, we discussed (in person at some conference IIRC)
> having percpu pagetables to get sane kmaps, percpu memory, etc.

Yes, I remember. That was our initial reaction in Prague to the looming
PTI challenge 3 years ago.

> The conclusion was that Linus thought the performance would suck and
> we shouldn't do it.

Linus had opinions, but we all agreed that depending on the workload and
the CPU features (think !PCID) the copy/pagefault overhead could be
significant.

> Since then, though, we added really fancy infrastructure for keeping
> track of a per-CPU list of recently used mms and efficiently tracking
> when they need to be invalidated.  We called these "ASIDs".  It would
> be fairly straightforward to have an entire pgd for each (cpu, asid)
> pair.  Newly added second-level (p4d/pud/whatever -- have I ever
> mentioned how much I dislike the Linux pagetable naming conventions
> and folding tricks?) tables could be lazily faulted in, and copies of
> the full 2kB mess would only be needed when a new (cpu,asid) is
> allocated because either a flush happened while the mm was inactive on
> the CPU in question or because the mm fell off the percpu cache.
>
> The total overhead would be a bit more cache usage, 4kB * num cpus *
> num ASIDs per CPU (or 8k for PTI), and a few extra page faults (max
> num cpus * 256 per mm over the entire lifetime of that mm).

> The common case of a CPU switching back and forth between a small
> number of mms would have no significant overhead.

For CPUs which do not support PCID this sucks, and that is everything
pre-Westmere and all of 32bit. Yes, 32bit. If we go there then 32bit has
to bite the bullet and use the very same mechanism. Not that I care much
TBH.

Even for those CPUs which support it we'd need to increase the number of
ASIDs significantly.  Right now we use only 6 ASIDs, which is not a
lot. There are process-heavy workloads out there which do a lot of
context switching, so avoiding the copy matters. I'm not worried about
fork as the copy will probably be just noise.
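
To put rough numbers on the estimate quoted above, here is a
back-of-the-envelope sketch in C. The function names are made up for
illustration and are not kernel code; the 4kB (8kB with PTI) per
(cpu, asid) pair and the 256 faults per CPU are simply the figures from
the quoted mail, taken at face value.

```c
#include <assert.h>
#include <stdint.h>

/* One 4kB top-level table per (cpu, asid) pair, as in the quoted estimate. */
#define PGD_BYTES 4096u

/*
 * Memory used by per-(cpu, asid) pgds:
 * 4kB * num cpus * num ASIDs per CPU, or 8kB per pair with PTI.
 */
uint64_t percpu_pgd_bytes(unsigned int cpus, unsigned int asids, int pti)
{
	uint64_t per_pair = pti ? 2u * PGD_BYTES : PGD_BYTES;

	assert(cpus > 0 && asids > 0);
	return per_pair * cpus * asids;
}

/*
 * Upper bound on the extra page faults over the lifetime of one mm:
 * num cpus * 256, again as quoted.
 */
uint64_t max_lazy_faults(unsigned int cpus)
{
	return (uint64_t)cpus * 256u;
}
```

For a hypothetical 64 CPU machine with the current 6 ASIDs,
percpu_pgd_bytes(64, 6, 0) comes to 1572864 bytes (1.5MB), twice that
with PTI, and max_lazy_faults(64) to 16384. Raising the ASID count
scales the memory figure linearly.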

That said, I'm not saying it shouldn't be done, but there are quite a
few things which need to be looked at.

TBH, I really would love to see that just to make GS kernel usage and
the related mess in the ASM code go away completely.

For the task at hand, i.e. replacing kmap_atomic() by kmap_local(), this
is not really helpful because we'd need to make all highmem-using
architectures do the same thing. But if we can pull it off on x86 the
required changes for the kmap_local() code are not really significant.

> On an unrelated note, what happens if you migrate_disable(), sleep for
> a looooong time, and someone tries to offline your CPU?

The hotplug code will prevent the CPU from going offline in that case,
i.e. it waits until the last task has left its migrate disabled section.

But you are not supposed to invoke sleep($ETERNAL) in such a
context. Emphasis on 'not supposed' :)
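
As a toy illustration of that rule, the following single-threaded C
model tracks how many tasks are pinned to a CPU and refuses to offline
it while any remain. All names (toy_cpu, toy_task, toy_*) are
hypothetical and this is not the scheduler's implementation; the
nesting counter is an assumption of the model, and where the real
hotplug code would wait, the model just reports failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: number of tasks pinned here right now. */
struct toy_cpu {
	unsigned int nr_pinned;
	bool online;
};

/* Hypothetical per-task state: nesting depth of migrate-disabled sections. */
struct toy_task {
	unsigned int migration_disabled;
	struct toy_cpu *cpu;
};

/* Entering the outermost section pins the task to its current CPU. */
void toy_migrate_disable(struct toy_task *t)
{
	if (t->migration_disabled++ == 0)
		t->cpu->nr_pinned++;
}

/* Leaving the outermost section releases the pin. */
void toy_migrate_enable(struct toy_task *t)
{
	assert(t->migration_disabled > 0);
	if (--t->migration_disabled == 0)
		t->cpu->nr_pinned--;
}

/*
 * Offlining succeeds only once no task is pinned. The real code would
 * block here until the last task has left its section.
 */
bool toy_cpu_try_offline(struct toy_cpu *c)
{
	if (c->nr_pinned)
		return false;
	c->online = false;
	return true;
}
```

In this model a task which sleeps forever inside its section keeps
nr_pinned above zero, so toy_cpu_try_offline() never succeeds, which is
why sleeping for long in such a context is a bad idea.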

Thanks,

        tglx
