All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Toshi Kani <toshi.kani@hp.com>
Cc: hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
	arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, Elliott@hp.com
Subject: Re: [PATCH v2 0/7] Kernel huge I/O mapping support
Date: Mon, 23 Feb 2015 12:22:24 -0800	[thread overview]
Message-ID: <20150223122224.c55554325cc4dadeca067234@linux-foundation.org> (raw)
In-Reply-To: <1423521935-17454-1-git-send-email-toshi.kani@hp.com>

On Mon,  9 Feb 2015 15:45:28 -0700 Toshi Kani <toshi.kani@hp.com> wrote:

> ioremap() and its related interfaces are used to create I/O
> mappings to memory-mapped I/O devices.  The mapping sizes of
> the traditional I/O devices are relatively small.  Non-volatile
> memory (NVM), however, has many GB and is going to have TB soon.
> It is not very efficient to create large I/O mappings with 4KB. 

The changelogging is very good - thanks for taking the time to do this.

> This patchset extends the ioremap() interfaces to transparently
> create I/O mappings with huge pages whenever possible.

I'm wondering if this is prudent.  Existing code which was tested with
4k mappings will magically start to use huge tlb mappings.  I don't
know what could go wrong, but I'd prefer not to find out!  Wouldn't it
be safer to make this an explicit opt-in?

What operations can presently be performed against an ioremapped area? 
Can kernel code perform change_page_attr() against individual pages? 
Can kernel code run iounmap() against just part of that region (I
forget).  There does seem to be potential for breakage if we start
using hugetlb mappings for such things?

>  ioremap()
> continues to use 4KB mappings when a huge page does not fit into
> a requested range.  There is no change necessary to the drivers
> using ioremap().  A requested physical address must be aligned by
> a huge page size (1GB or 2MB on x86) for using huge page mapping,
> though.  The kernel huge I/O mapping will improve performance of
> NVM and other devices with large memory, and reduce the time to
> create their mappings as well.
> 
> On x86, the huge I/O mapping may not be used when a target range is
> covered by multiple MTRRs with different memory types.  The caller
> must make a separate request for each MTRR range, or the huge I/O
> mapping can be disabled with the kernel boot option "nohugeiomap".
> The detail of this issue is described in the email below, and this
> patch takes option C) in favor of simplicity since MTRRs are legacy
> feature.
>  https://lkml.org/lkml/2015/2/5/638

How is this mtrr clash handled?

- The iomap call will fail if there are any MTRRs covering the region?

- The iomap call will fail if there are more than one MTRRs covering
  the region?

- If the ioremap will succeed if a single MTRR covers the region,
  must that MTRR cover the *entire* region?

- What happens if userspace tried fiddling the MTRRs after the region
  has been established?

<reads the code>

Oh.  We don't do any checking at all.  We're just telling userspace
programmers "don't do that".  hrm.  What are your thoughts on adding
the overlap checks to the kernel?

This adds more potential for breaking existing code, doesn't it?  If
there's code which is using 4k ioremap on regions which are covered by
mtrrs, the transparent switch to hugeptes will cause that code to enter
the "undefined behaviour" space?

> The patchset introduces the following configs:
>  HUGE_IOMAP - When selected (default Y), enable huge I/O mappings.
>               Require HAVE_ARCH_HUGE_VMAP set.
>  HAVE_ARCH_HUGE_VMAP - Indicate arch supports huge KVA mappings.
>                        Require X86_PAE set on X86_32.
> 
> Patch 1-4 changes common files to support huge I/O mappings.  There
> is no change in the functinalities until HUGE_IOMAP is set in patch 7.
> 
> Patch 5,6 implement HAVE_ARCH_HUGE_VMAP and HUGE_IOMAP funcs on x86,
> and set HAVE_ARCH_HUGE_VMAP on x86.
> 
> Patch 7 adds HUGE_IOMAP to Kconfig, which is set to Y by default on
> x86.

What do other architectures need to do to utilize this?



WARNING: multiple messages have this Message-ID (diff)
From: Andrew Morton <akpm@linux-foundation.org>
To: Toshi Kani <toshi.kani@hp.com>
Cc: hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
	arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, Elliott@hp.com
Subject: Re: [PATCH v2 0/7] Kernel huge I/O mapping support
Date: Mon, 23 Feb 2015 12:22:24 -0800	[thread overview]
Message-ID: <20150223122224.c55554325cc4dadeca067234@linux-foundation.org> (raw)
In-Reply-To: <1423521935-17454-1-git-send-email-toshi.kani@hp.com>

On Mon,  9 Feb 2015 15:45:28 -0700 Toshi Kani <toshi.kani@hp.com> wrote:

> ioremap() and its related interfaces are used to create I/O
> mappings to memory-mapped I/O devices.  The mapping sizes of
> the traditional I/O devices are relatively small.  Non-volatile
> memory (NVM), however, has many GB and is going to have TB soon.
> It is not very efficient to create large I/O mappings with 4KB. 

The changelogging is very good - thanks for taking the time to do this.

> This patchset extends the ioremap() interfaces to transparently
> create I/O mappings with huge pages whenever possible.

I'm wondering if this is prudent.  Existing code which was tested with
4k mappings will magically start to use huge tlb mappings.  I don't
know what could go wrong, but I'd prefer not to find out!  Wouldn't it
be safer to make this an explicit opt-in?

What operations can presently be performed against an ioremapped area? 
Can kernel code perform change_page_attr() against individual pages? 
Can kernel code run iounmap() against just part of that region (I
forget).  There does seem to be potential for breakage if we start
using hugetlb mappings for such things?

>  ioremap()
> continues to use 4KB mappings when a huge page does not fit into
> a requested range.  There is no change necessary to the drivers
> using ioremap().  A requested physical address must be aligned by
> a huge page size (1GB or 2MB on x86) for using huge page mapping,
> though.  The kernel huge I/O mapping will improve performance of
> NVM and other devices with large memory, and reduce the time to
> create their mappings as well.
> 
> On x86, the huge I/O mapping may not be used when a target range is
> covered by multiple MTRRs with different memory types.  The caller
> must make a separate request for each MTRR range, or the huge I/O
> mapping can be disabled with the kernel boot option "nohugeiomap".
> The detail of this issue is described in the email below, and this
> patch takes option C) in favor of simplicity since MTRRs are legacy
> feature.
>  https://lkml.org/lkml/2015/2/5/638

How is this mtrr clash handled?

- The iomap call will fail if there are any MTRRs covering the region?

- The iomap call will fail if there are more than one MTRRs covering
  the region?

- If the ioremap will succeed if a single MTRR covers the region,
  must that MTRR cover the *entire* region?

- What happens if userspace tried fiddling the MTRRs after the region
  has been established?

<reads the code>

Oh.  We don't do any checking at all.  We're just telling userspace
programmers "don't do that".  hrm.  What are your thoughts on adding
the overlap checks to the kernel?

This adds more potential for breaking existing code, doesn't it?  If
there's code which is using 4k ioremap on regions which are covered by
mtrrs, the transparent switch to hugeptes will cause that code to enter
the "undefined behaviour" space?

> The patchset introduces the following configs:
>  HUGE_IOMAP - When selected (default Y), enable huge I/O mappings.
>               Require HAVE_ARCH_HUGE_VMAP set.
>  HAVE_ARCH_HUGE_VMAP - Indicate arch supports huge KVA mappings.
>                        Require X86_PAE set on X86_32.
> 
> Patch 1-4 changes common files to support huge I/O mappings.  There
> is no change in the functinalities until HUGE_IOMAP is set in patch 7.
> 
> Patch 5,6 implement HAVE_ARCH_HUGE_VMAP and HUGE_IOMAP funcs on x86,
> and set HAVE_ARCH_HUGE_VMAP on x86.
> 
> Patch 7 adds HUGE_IOMAP to Kconfig, which is set to Y by default on
> x86.

What do other architectures need to do to utilize this?


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-02-23 20:22 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-09 22:45 [PATCH v2 0/7] Kernel huge I/O mapping support Toshi Kani
2015-02-09 22:45 ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 1/7] mm: Change __get_vm_area_node() to use fls_long() Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 2/7] lib: Add huge I/O map capability interfaces Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 3/7] mm: Change ioremap to set up huge I/O mappings Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 4/7] mm: Change vunmap to tear down huge KVA mappings Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 5/7] x86, mm: Support huge KVA mappings on x86 Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-10 18:59   ` Dave Hansen
2015-02-10 18:59     ` Dave Hansen
2015-02-10 20:42     ` Toshi Kani
2015-02-10 20:42       ` Toshi Kani
2015-02-10 20:51       ` Dave Hansen
2015-02-10 20:51         ` Dave Hansen
2015-02-10 22:13         ` Toshi Kani
2015-02-10 22:13           ` Toshi Kani
2015-02-10 22:20           ` Toshi Kani
2015-02-10 22:20             ` Toshi Kani
2015-02-10 23:10           ` Toshi Kani
2015-02-10 23:10             ` Toshi Kani
2015-03-03  0:37             ` Toshi Kani
2015-03-03  0:37               ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 6/7] x86, mm: Support huge I/O " Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-18 20:44   ` Ingo Molnar
2015-02-18 20:44     ` Ingo Molnar
2015-02-18 21:13     ` Toshi Kani
2015-02-18 21:13       ` Toshi Kani
2015-02-18 21:15       ` Ingo Molnar
2015-02-18 21:15         ` Ingo Molnar
2015-02-18 21:33         ` Toshi Kani
2015-02-18 21:33           ` Toshi Kani
2015-02-18 21:57           ` Ingo Molnar
2015-02-18 21:57             ` Ingo Molnar
2015-02-18 22:14             ` Toshi Kani
2015-02-18 22:14               ` Toshi Kani
2015-02-09 22:45 ` [PATCH v2 7/7] mm: Add config HUGE_IOMAP to enable huge I/O mappings Toshi Kani
2015-02-09 22:45   ` Toshi Kani
2015-02-23 20:22 ` Andrew Morton [this message]
2015-02-23 20:22   ` [PATCH v2 0/7] Kernel huge I/O mapping support Andrew Morton
2015-02-23 23:54   ` Toshi Kani
2015-02-23 23:54     ` Toshi Kani
2015-02-24  8:09   ` Ingo Molnar
2015-02-24  8:09     ` Ingo Molnar
2015-03-02 15:51     ` Toshi Kani
2015-03-02 15:51       ` Toshi Kani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150223122224.c55554325cc4dadeca067234@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=Elliott@hp.com \
    --cc=arnd@arndb.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=toshi.kani@hp.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.