linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Paul McKenney <paulmck@kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Will Deacon <will@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Russell King <linux@armlinux.org.uk>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Chris Zankel <chris@zankel.net>,
	Max Filippov <jcmvbkbc@gmail.com>,
	linux-xtensa@linux-xtensa.org,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Ard Biesheuvel <ardb@kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Vineet Gupta <vgupta@synopsys.com>,
	"open list:SYNOPSYS ARC ARCHITECTURE" 
	<linux-snps-arc@lists.infradead.org>,
	Arnd Bergmann <arnd@arndb.de>, Guo Ren <guoren@kernel.org>,
	linux-csky@vger.kernel.org, Michal Simek <monstr@monstr.eu>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	linux-mips@vger.kernel.org, Nick Hu <nickhu@andestech.com>,
	Greentime Hu <green.hu@gmail.com>,
	Vincent Chen <deanbo422@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-sparc <sparclinux@vger.kernel.org>
Subject: Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
Date: Mon, 21 Sep 2020 12:58:25 -0700	[thread overview]
Message-ID: <20200921195824.GM2540965@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <20200919173906.GQ32101@casper.infradead.org>

On Sat, Sep 19, 2020 at 06:39:06PM +0100, Matthew Wilcox wrote:
> On Sat, Sep 19, 2020 at 10:18:54AM -0700, Linus Torvalds wrote:
> > On Sat, Sep 19, 2020 at 2:50 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > >
> > > this provides a preemptible variant of kmap_atomic & related
> > > interfaces. This is achieved by:
> > 
> > Ack. This looks really nice, even apart from the new capability.
> > 
> > The only thing I really reacted to is that the name doesn't make sense
> > to me: "kmap_temporary()" seems a bit odd.
> > 
> > Particularly for an interface that really is basically meant as a
> > better replacement of "kmap_atomic()" (but is perhaps also a better
> > replacement for "kmap()").
> > 
> > I think I understand how the name came about: I think the "temporary"
> > is there as a distinction from the "longterm" regular kmap(). So I
> > think it makes some sense from an internal implementation angle, but I
> > don't think it makes a lot of sense from an interface name.
> > 
> > I don't know what might be a better name, but if we want to emphasize
> > that it's thread-private and a one-off, maybe "local" would be a
> > better naming, and make it distinct from the "global" nature of the
> > old kmap() interface?
> > 
> > However, another solution might be to just use this new preemptible
> > "local" kmap(), and remove the old global one entirely. Yes, the old
> > global one caches the page table mapping and that sounds really
> > efficient and nice. But it's actually horribly horribly bad, because
> > it means that we need to use locking for them. Your new "temporary"
> > implementation seems to be fundamentally better locking-wise, and only
> > need preemption disabling as locking (and is equally fast for the
> > non-highmem case).
> > 
> > So I wonder if the single-page TLB flush isn't a better model, and
> > whether it wouldn't be a lot simpler to just get rid of the old
> > complex kmap() entirely, and replace it with this?
> > 
> > I agree we can't replace the kmap_atomic() version, because maybe
> > people depend on the preemption disabling it also implied. But what
> > about replacing the non-atomic kmap()?
> 
> My concern with that is people might use kmap() and then pass the address
> to a different task.  So we need to audit the current users of kmap()
> and convert any that do that into using vmap() instead.
> 

I've done some of this work.[3]  PKS and pmem stray write protection[2] depend
on kmap to enable the correct PKS settings.  After working through the
exception handling we realized that some users of kmap() seem to be doing just
this; passing the address to a different task.

From what I have found ~90% of kmap() callers are 'kmap_thread()' and the other
~10% are kmap().[3]  But of those 10% I'm not familiar with the code enough to
know if they really require a 'global' map.  What I do know is they save an
address which appears to be used in other threads.  But I could be wrong.

For PKS I added a 'global' implementation which could then be called by kmap()
and added a new kmap_thread() call which used the original 'local' version of
the PKS interface.  The PKS work is still being reviewed internally for the TIP
core code.  But I've pushed it all to git hub for purposes of this
discussion.[1]

> I like kmap_local().  Or kmap_thread().

I chose kmap_thread() so that makes sense to me.  I also thought about using
kmap_global() as an alternative interface which would change just ~10% of the
callers and make the series much smaller.  But internal discussions lead me to
chose kmap_thread() as the new interface so that we don't change the semantics
of kmap().

Ira


[1] https://github.com/weiny2/linux-kernel/tree/lm-pks-pmem-for-5.10-v3

[2] https://lore.kernel.org/lkml/20200717072056.73134-1-ira.weiny@intel.com/

[3]
12:42:06 > git grep ' kmap(' *.c | grep -v '* ' | wc -l
22

12:43:32 > git grep ' kmap_thread(' *.c | grep -v '* ' | wc -l
204

Here are the callers which hand an address to another thread.

12:45:25 > git grep ' kmap(' *.c | grep -v '* '
arch/x86/mm/dump_pagetables.c:  [PKMAP_BASE_NR]         = { 0UL, "Persistent kmap() Area" },
drivers/firewire/net.c:         ptr = kmap(dev->broadcast_rcv_buffer.pages[u]);
drivers/gpu/drm/i915/gem/i915_gem_pages.c:              return kmap(sg_page(sgt->sgl));
drivers/gpu/drm/i915/selftests/i915_perf.c:     scratch = kmap(ce->vm->scratch[0].base.page);
drivers/gpu/drm/ttm/ttm_bo_util.c:              map->virtual = kmap(map->page);
drivers/infiniband/hw/qib/qib_user_sdma.c:      mpage = kmap(page);
drivers/misc/vmw_vmci/vmci_host.c:      context->notify = kmap(context->notify_page) + (uva & (PAGE_SIZE - 1));
drivers/misc/xilinx_sdfec.c:            addr = kmap(pages[i]);
drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped         = kmap(host->pg.page);
drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped = kmap(host->pg.page);
drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped = kmap(host->pg.page);
drivers/nvme/target/tcp.c:              iov->iov_base = kmap(sg_page(sg)) + sg->offset + sg_offset;
drivers/scsi/libiscsi_tcp.c:            segment->sg_mapped = kmap(sg_page(sg));
drivers/target/iscsi/iscsi_target.c:            iov[i].iov_base = kmap(sg_page(sg)) + sg->offset + page_off;
drivers/target/target_core_transport.c:         return kmap(sg_page(sg)) + sg->offset;
fs/btrfs/check-integrity.c:             block_ctx->datav[i] = kmap(block_ctx->pagev[i]);
fs/ceph/dir.c:          cache_ctl->dentries = kmap(cache_ctl->page);
fs/ceph/inode.c:                ctl->dentries = kmap(ctl->page);
lib/scatterlist.c:              miter->addr = kmap(miter->page) + miter->__offset;
net/ceph/pagelist.c:    pl->mapped_tail = kmap(page);
net/ceph/pagelist.c:            pl->mapped_tail = kmap(page);
virt/kvm/kvm_main.c:                    hva = kmap(page);


  parent reply	other threads:[~2020-09-21 19:58 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-19  9:17 [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends Thomas Gleixner
2020-09-19  9:17 ` [patch RFC 01/15] mm/highmem: Un-EXPORT __kmap_atomic_idx() Thomas Gleixner
2020-09-21  6:23   ` Christoph Hellwig
2020-09-19  9:17 ` [patch RFC 02/15] highmem: Provide generic variant of kmap_atomic* Thomas Gleixner
2020-09-21  6:28   ` Christoph Hellwig
2020-09-19  9:17 ` [patch RFC 03/15] x86/mm/highmem: Use generic kmap atomic implementation Thomas Gleixner
2020-09-19  9:17 ` [patch RFC 04/15] arc/mm/highmem: " Thomas Gleixner
2020-09-19  9:17 ` [patch RFC 05/15] ARM: highmem: Switch to generic kmap atomic Thomas Gleixner
2020-09-19  9:17 ` [patch RFC 06/15] csky/mm/highmem: " Thomas Gleixner
2020-09-23  0:05   ` Guo Ren
2020-09-19  9:17 ` [patch RFC 07/15] microblaze/mm/highmem: " Thomas Gleixner
2020-09-19  9:17 ` [patch RFC 08/15] mips/mm/highmem: " Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 09/15] nds32/mm/highmem: " Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 10/15] powerpc/mm/highmem: " Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 11/15] sparc/mm/highmem: " Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 12/15] xtensa/mm/highmem: " Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 13/15] mm/highmem: Remove the old kmap_atomic cruft Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 14/15] sched: highmem: Store temporary kmaps in task struct Thomas Gleixner
2020-09-19  9:18 ` [patch RFC 15/15] mm/highmem: Provide kmap_temporary* Thomas Gleixner
2020-09-19 10:35 ` [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends Daniel Vetter
2020-09-19 10:37   ` Daniel Vetter
2020-09-20  6:23     ` Thomas Gleixner
2020-09-20  8:23       ` Daniel Vetter
2020-09-20 17:24         ` Thomas Gleixner
2020-09-19 17:18 ` Linus Torvalds
2020-09-19 17:39   ` Matthew Wilcox
2020-09-19 19:13     ` Linus Torvalds
2020-09-21 19:58     ` Ira Weiny [this message]
2020-09-20  6:41   ` Thomas Gleixner
2020-09-20  8:49     ` Thomas Gleixner
2020-09-20 16:57       ` Linus Torvalds
2020-09-20 17:40         ` Thomas Gleixner
2020-09-20 17:42           ` Linus Torvalds
2020-09-20 17:58             ` Linus Torvalds
2020-09-21  7:39             ` Thomas Gleixner
2020-09-21 16:24               ` Linus Torvalds
2020-09-21 19:27                 ` Thomas Gleixner
2020-09-23  8:40                   ` peterz
2020-09-23 13:35                     ` Thomas Gleixner
2020-09-23 15:52                     ` Steven Rostedt
2020-09-23 20:55                       ` Thomas Gleixner
2020-09-23 21:12                         ` Steven Rostedt
2020-09-24  6:57                           ` Thomas Gleixner
2020-09-24 12:32                             ` Steven Rostedt
2020-09-24 12:42                               ` Peter Zijlstra
2020-09-24 13:51                                 ` Steven Rostedt
2020-09-24 13:58                                   ` Peter Zijlstra
2020-09-24 17:55                               ` Thomas Gleixner
2020-09-24 18:58                                 ` Steven Rostedt
2020-09-24  8:27                       ` peterz
2020-09-24 19:36                         ` Daniel Bristot de Oliveira
2020-09-23 10:19                   ` peterz
2020-09-23 12:33                     ` Thomas Gleixner
2020-09-23 14:33                   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200921195824.GM2540965@iweiny-DESK2.sc.intel.com \
    --to=ira.weiny@intel.com \
    --cc=airlied@linux.ie \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=benh@kernel.crashing.org \
    --cc=bigeasy@linutronix.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=chris@zankel.net \
    --cc=daniel@ffwll.ch \
    --cc=davem@davemloft.net \
    --cc=deanbo422@gmail.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=green.hu@gmail.com \
    --cc=guoren@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=jcmvbkbc@gmail.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@suse.de \
    --cc=monstr@monstr.eu \
    --cc=mpe@ellerman.id.au \
    --cc=nickhu@andestech.com \
    --cc=paulmck@kernel.org \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=vgupta@synopsys.com \
    --cc=vincent.guittot@linaro.org \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).