From: Toshi Kani <toshi.kani@hp.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@alien8.de>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-mm@kvack.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	jgross@suse.com, Stefan Bader <stefan.bader@canonical.com>,
	Andy Lutomirski <luto@amacapital.net>,
	hmh@hmh.eng.br, yigal@plexistor.com,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	"Elliott, Robert (Server Storage)" <Elliott@hp.com>,
	mcgrof@suse.com, Christoph Hellwig <hch@lst.de>,
	Matthew Wilcox <willy@linux.intel.com>
Subject: Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
Date: Fri, 29 May 2015 12:32:20 -0600
Message-ID: <1432924340.23540.78.camel@misato.fc.hp.com>
In-Reply-To: <CAPcyv4g+zYFkEYpa0HCh0Q+2C3wWNr6v3ZU143h52OKf=U=Qvw@mail.gmail.com>

On Fri, 2015-05-29 at 11:19 -0700, Dan Williams wrote:
> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> > On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
> >> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
> >> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
> >> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
 :
> >> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
> >> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
> >> >>       if (!pmem->virt_addr)
> >> >>               goto out_release_region;
> >> >
> >> > Dan, Ross, what about this one?
> >> >
> >> > ACK to pick it up as a temporary solution?
> >>
> >> I see that is_new_memtype_allowed() is updated to disallow some
> >> combinations, but the manual seems to imply any mixing of memory types
> >> is unsupported.  Which worries me even in the current code where we
> >> have uncached mappings in the driver, and potentially cached DAX
> >> mappings handed out to userspace.
> >
> > is_new_memtype_allowed() is not to allow some combinations of mixing of
> > memory types.  When it is allowed, the requested type of ioremap_xxx()
> > is changed to match with the existing map type, so that mixing of memory
> > types does not happen.
> 
> Yes, but now if the caller was expecting one memory type and gets
> another one that is something I think the driver would want to know.
> At a minimum I don't think we want to get emails about pmem driver
> performance problems when someone's platform is silently degrading WB
> to UC for example.

The pmem driver creates its ioremap mapping of an NVDIMM range first.
So, there will be no conflict at this point, unless a conflicting
driver claims the same NVDIMM range.

DAX then uses the pmem driver (or another byte-addressable driver) to
mount a file system and creates a separate user-space mapping for
mmap().  So, a (silent) map-type conflict can happen at this point,
and it may not be caught by the ioremap checks themselves.
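
To make the two paths above concrete, here is a minimal user-space
sketch in C.  It is a toy model, not kernel code: the names
toy_reserve(), toy_insert_unchecked(), struct memtype_table and the
MT_* values are all hypothetical, and it assumes every conflict is
"allowed" (it does not model the is_new_memtype_allowed() rules).  The
first function mimics a request whose type is changed to match an
existing overlapping mapping; the second mimics a path that never
consults the existing mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache types for this toy model; not the kernel's enum. */
enum memtype { MT_UC, MT_WT, MT_WB };

/* One tracked physical range [start, end) and the type it was mapped with. */
struct reservation {
	uint64_t start;
	uint64_t end;
	enum memtype type;
};

#define MAX_RES 16

struct memtype_table {
	struct reservation res[MAX_RES];
	size_t count;
};

/* Return the first tracked range overlapping [start, end), or NULL. */
static const struct reservation *
find_overlap(const struct memtype_table *t, uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < t->count; i++)
		if (start < t->res[i].end && t->res[i].start < end)
			return &t->res[i];
	return NULL;
}

/*
 * Toy model of an ioremap-style request.  If the range overlaps an
 * existing mapping, the caller gets the existing type instead of the
 * requested one, so the two mappings never disagree.  The effective
 * type is stored in *got.  Returns 0 on success, -1 if the table is full.
 */
static int toy_reserve(struct memtype_table *t, uint64_t start, uint64_t end,
		       enum memtype want, enum memtype *got)
{
	const struct reservation *old;

	assert(start < end);

	old = find_overlap(t, start, end);
	if (old) {
		*got = old->type;	/* silently changed to match */
		return 0;
	}
	if (t->count == MAX_RES)
		return -1;

	t->res[t->count++] = (struct reservation){ start, end, want };
	*got = want;
	return 0;
}

/*
 * Toy model of a path that installs a mapping without looking at the
 * table at all: the caller always gets what it asked for, even when
 * that differs from an existing mapping of the same range.
 */
static enum memtype toy_insert_unchecked(enum memtype want)
{
	return want;
}
```

In this model a caller can only notice that its type was changed by
comparing *got with what it requested after toy_reserve() returns,
while toy_insert_unchecked() can leave two mappings of one range with
different types and nothing reports it.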

> > DAX uses vm_insert_mixed(), which does not even check the existing map
> > type to the physical address.
> 
> Right, I think that's a problem...
> 
> >> A general quibble separate from this patch is that we don't have a way
> >> of knowing if ioremap() will reject or change our requested memory
> >> type.  Shouldn't the driver be explicitly requesting a known valid
> >> type in advance?
> >
> > I agree we need a solution here.
> >
> >> Lastly we now have the PMEM API patches from Ross out for review where
> >> he is assuming cached mappings with non-temporal writes:
> >> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
> >> This gives us WC semantics on writes which I believe has the nice
> >> property of reducing the number of write transactions to memory.
> >> Also, the numbers in the paper seem to be assuming DAX operation, but
> >> this ioremap_wt() is in the driver and typically behind a file system.
> >> Are the numbers relevant to that usage mode?
> >
> > I have not looked into the Ross's changes yet, but they do not seem to
> > replace the use of ioremap_nocache().  If his changes can use WB type
> > reliably, yes, we do not need a temporary solution of using ioremap_wt()
> > in this driver.
> 
> Hmm, yes you're right, it seems those patches did not change the
> implementation to use ioremap_cache()... which happens to not be
> implemented on all architectures.  I'll take a look.

Thanks,
-Toshi



