* [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
@ 2015-07-04  8:17 Benjamin Herrenschmidt
  2015-07-04 14:12 ` Dan Williams
  2015-07-07  0:01 ` Luis R. Rodriguez
  0 siblings, 2 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-04  8:17 UTC (permalink / raw)
  To: ksummit-discuss

All right, it's that time of year ... So here's my attempt at getting
myself invited :-)

We've been talking about some of that on-list recently, and it might
well be that this will trigger a resolution before we even reach KS, but
I thought it might be worthwhile to gather enough people from various
archs together to hash things out:

We have a pile of mapping attributes (more showing up recently),
typically used for MMIO mappings (but not necessarily exclusively).
ioremap_cache and ioremap_nocache are the old/common ones, but we have
_wc (write combine), _wt (write through) and possibly more around the
corner.

What are their precise semantics across all architectures ? This is not
clear (not documented). For example, we define writel(), readl() and
friends as being fully ordered vs each other but also vs DMA etc... but
on what mapping types do they have this property ?

Will _wc() provide the write-combine ability for writel() on all archs ?
Or does it require writel_relaxed() on some ? Will _wc() bring other
side effects such as loss of read vs. write ordering ? (on some archs at
least...). Etc....

There is a growing matrix of MMIO accessors and mapping types whose
semantics are poorly (if at all) defined. We cannot define them all
exactly for all architectures as there are too many differences that
will impact them. But we should be able to guarantee at least *some*,
ie, whether a given type of ordering is guaranteed or not by a given
accessor on a given mapping type, whether write combine (if supported at
all) will happen with a given accessor or not etc...

As for who should participate, well, at least one rep from each major
arch who is familiar with the intricacies of the architecture memory
model I would say, possibly others who dabbled in that stuff recently
such as Luis R. Rodriguez <mcgrof@suse.com> who was proposing patch
series lately to consolidate the use of _wc.

Cheers,
Ben.


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-04  8:17 [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs Benjamin Herrenschmidt
@ 2015-07-04 14:12 ` Dan Williams
  2015-07-05  3:02   ` Benjamin Herrenschmidt
  2015-07-07  0:01 ` Luis R. Rodriguez
  1 sibling, 1 reply; 15+ messages in thread
From: Dan Williams @ 2015-07-04 14:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Sat, Jul 4, 2015 at 1:17 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> Allright, it's that time of year ... So here's my attempt at getting
> myself invited :-)
>
> We've been talking about some of that on-list recently, and it might
> well be that this will trigger a resolution before we even reach KS, but
> I though it might be worthwhile to gather enough people from various
> arch together to hash things out:
>
> We have a pile of mapping attributes (more showing up recently),
> typically used for MMIO mappings (but not necessarily exclusively).
> ioremap_cache and ioremap_nocache are the old/common ones, but we have
> _wc (write combine), _wt (write through) and possibly more around the
> corner.
>
> What are their precise semantics accross all architecture ? This is not
> clear (not documented). For example, we define writel(), readl() and
> friends as being fully ordered vs each other but also vs DMA etc... but
> on what mapping types do they have this property ?
>
> Will _wc() provide the write-combine ability for writel() on all archs ?
> Or does it require writel_relaxed() on some ? Will _wc() bring other
> side effects such as loss of read vs. write ordering ? (on some archs at
> least...). Etc....
>
> There is a growing matrix of MMIO accessors and mapping types whose
> semantics are poorly (if at all) defined. We cannot define them all
> exactly for all architectures as there are too many differences that
> will impact them. But we should be able to guarantee at least *some*,
> ie, whether a given type of ordering is guaranteed or not by a given
> accessor on a given mapping type, whether write combine (if supported at
> all) will happen with a given accessor or not etc...
>
> As for who should participate, well, at least one rep from each major
> arch who is familiar with the intricacies of the architecture memory
> model I would say, possibly others who dabbled in that stuff recently
> such as Luis R. Rodriguez <mcgrof@suse.com> who was proposing patch
> series lately to consolidate the use of _wc.
>

Another side topic that has come up in this space is the desire to
define a "memremap" api to clean up __iomem abuses for cases where
"memory-like" mappings are needed.

https://lkml.org/lkml/2015/6/22/100


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-04 14:12 ` Dan Williams
@ 2015-07-05  3:02   ` Benjamin Herrenschmidt
  2015-07-05 18:55     ` Andy Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-05  3:02 UTC (permalink / raw)
  To: Dan Williams; +Cc: ksummit-discuss

On Sat, 2015-07-04 at 07:12 -0700, Dan Williams wrote:

> Another side topic that has come up in this space is the desire to
> define a "memremap" api to clean up __iomem abuses for cases where
> "memory-like" mappings are needed.
> 
> https://lkml.org/lkml/2015/6/22/100

Interesting. I had missed this. There is a similar question about
semantics (ordering etc...), ie, are they the same as memory for
example ?

Another thing we might look into is to what extent should we provide
access to the "SAO" mapping attribute that POWER7 and later support
(strong ordering, pretty-much x86 like) and whether this can be used
on ppc to reduce the need for barriers (that attribute is only available
for fully cachable mappings, not generally applicable to IO mappings).

That translates to: should your new memremap() take some kind of flags
as an argument ? Though of course providing a cross-arch definition of
these flags would be tricky.
 
Ben.


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05  3:02   ` Benjamin Herrenschmidt
@ 2015-07-05 18:55     ` Andy Lutomirski
  2015-07-05 19:56       ` Benjamin Herrenschmidt
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Andy Lutomirski @ 2015-07-05 18:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Sat, Jul 4, 2015 at 8:02 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Sat, 2015-07-04 at 07:12 -0700, Dan Williams wrote:
>
>> Another side topic that has come up in this space is the desire to
>> define a "memremap" api to clean up __iomem abuses for cases where
>> "memory-like" mappings are needed.
>>
>> https://lkml.org/lkml/2015/6/22/100
>
> Interesting. I had missed this. There is a similar question about
> semantics (ordering etc...), ie, are they the same as memory for
> example ?
>
> Another thing we might look into is to what extent should we provide
> access to the "SAO" mapping attribute that POWER7 and later support
> (strong ordering, pretty-much x86 like) and whether this can be used
> on ppc to reduce the need for barriers (that attribute is only available
> for fully cachable mappings, not generally applicable to IO mappings).
>
> That translate to: should your new memremap() take some kinds of flags
> as an argument ? Though of course providing a cross-arch definition of
> these flags would be tricky.

At some point, it would also be nice if the various macros had
well-defined semantics.  For example, x86 has:

#define pgprot_noncached(prot)                                          \
        ((boot_cpu_data.x86 > 3)                                        \
         ? (__pgprot(pgprot_val(prot) |                                 \
                     cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
         : (prot))

Putting aside the pointless boot_cpu_data check (surely the recent PAT
rework completely obsoletes it), what is
pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
results in garbage.  Should it have well-defined behavior instead?

I suspect the other arches all have their own unique glitches here.

--Andy


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05 18:55     ` Andy Lutomirski
@ 2015-07-05 19:56       ` Benjamin Herrenschmidt
  2015-07-05 20:09         ` Andy Lutomirski
  2015-07-06  9:33         ` Will Deacon
  2015-07-06  9:52       ` Catalin Marinas
  2015-07-06 19:11       ` Luck, Tony
  2 siblings, 2 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-05 19:56 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss

On Sun, 2015-07-05 at 11:55 -0700, Andy Lutomirski wrote:
> 
> At some point, it would also be nice if the various macros has
> well-defined semantics.  For example, x86 has:
> 
> #define pgprot_noncached(prot)                                          \
>         ((boot_cpu_data.x86 > 3)                                        \
>          ? (__pgprot(pgprot_val(prot) |                                 \
>                      cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
>          : (prot))
> 
> Putting aside the pointless boot_cpu_data check (surely the recent PAT
> rework completely obsoletes it), what is
> pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
> results in garbage.  Should it have well-defined behavior instead?

Can it ? On powerpc it will just mean pgprot_noncached for example,
those macros manipulate the same bits and it's not a bitmask, it's
either uncached or uncached with write combining.

> I suspect the other arches all have their own unique glitches here.

Correct. I'm still trying to get feedback on ARM for example.

I don't think we can (or should try) to have completely identical semantics
for everything, but we should try to find the common set that are guaranteed
and, possibly, do a best effort for archs to individually document the
remaining.

Ben.


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05 19:56       ` Benjamin Herrenschmidt
@ 2015-07-05 20:09         ` Andy Lutomirski
  2015-07-06  9:33         ` Will Deacon
  1 sibling, 0 replies; 15+ messages in thread
From: Andy Lutomirski @ 2015-07-05 20:09 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Sun, Jul 5, 2015 at 12:56 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Sun, 2015-07-05 at 11:55 -0700, Andy Lutomirski wrote:
>>
>> At some point, it would also be nice if the various macros has
>> well-defined semantics.  For example, x86 has:
>>
>> #define pgprot_noncached(prot)                                          \
>>         ((boot_cpu_data.x86 > 3)                                        \
>>          ? (__pgprot(pgprot_val(prot) |                                 \
>>                      cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
>>          : (prot))
>>
>> Putting aside the pointless boot_cpu_data check (surely the recent PAT
>> rework completely obsoletes it), what is
>> pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
>> results in garbage.  Should it have well-defined behavior instead?
>
> Can it ? On powerpc it will just mean pgprot_noncached for example,
> those macros manipulate the same bits and it's not a bitmask, it's
> either unached or uncached with write combining.

I think it should mean pgprot_noncached on all arches.  On x86, if it
does, it's purely by luck, since it only sets bits and doesn't clear
them.

>
>> I suspect the other arches all have their own unique glitches here.
>
> Correct. I'm still trying to get feedback on ARM for example.
>
> I don't think we can (or should try) to have completely identical semantics
> for everything, but we should try to find the common set that are guaranteed
> and, possibly, do a best effort for archs to individually document the
> remaining.

Agreed.

--Andy


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05 19:56       ` Benjamin Herrenschmidt
  2015-07-05 20:09         ` Andy Lutomirski
@ 2015-07-06  9:33         ` Will Deacon
  2015-07-06 22:02           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 15+ messages in thread
From: Will Deacon @ 2015-07-06  9:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Sun, Jul 05, 2015 at 08:56:24PM +0100, Benjamin Herrenschmidt wrote:
> On Sun, 2015-07-05 at 11:55 -0700, Andy Lutomirski wrote:
> > 
> > At some point, it would also be nice if the various macros has
> > well-defined semantics.  For example, x86 has:
> > 
> > #define pgprot_noncached(prot)                                          \
> >         ((boot_cpu_data.x86 > 3)                                        \
> >          ? (__pgprot(pgprot_val(prot) |                                 \
> >                      cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
> >          : (prot))
> > 
> > Putting aside the pointless boot_cpu_data check (surely the recent PAT
> > rework completely obsoletes it), what is
> > pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
> > results in garbage.  Should it have well-defined behavior instead?
> 
> Can it ? On powerpc it will just mean pgprot_noncached for example,
> those macros manipulate the same bits and it's not a bitmask, it's
> either unached or uncached with write combining.
> 
> > I suspect the other arches all have their own unique glitches here.
> 
> Correct. I'm still trying to get feedback on ARM for example.

We've ended up doing whatever drivers start to rely on from running on
x86, which gives rise to some sort of de-facto semantics, but it's not
necessarily efficient or portable.

On arm64, ioremap == ioremap_nocache, which gives strong ordering
guarantees but forbids things like unaligned access. ioremap_wc gives a
more relaxed mapping, which is non-cached but allows re-ordering and
unaligned access.

ioremap_wt is new and strange, but rmk and I were going down the same
route as ioremap_wc for that, because people expect to be able to do
blind memcpy with those pointers.

As for ordering of writeX/readX wrt DMA, our IO accessors are so insanely
heavyweight that I don't think the ioremap flavour matters atm.

Will


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05 18:55     ` Andy Lutomirski
  2015-07-05 19:56       ` Benjamin Herrenschmidt
@ 2015-07-06  9:52       ` Catalin Marinas
  2015-07-06 17:14         ` Andy Lutomirski
  2015-07-06 19:11       ` Luck, Tony
  2 siblings, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2015-07-06  9:52 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss

On Sun, Jul 05, 2015 at 07:55:39PM +0100, Andy Lutomirski wrote:
> On Sat, Jul 4, 2015 at 8:02 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Sat, 2015-07-04 at 07:12 -0700, Dan Williams wrote:
> >
> >> Another side topic that has come up in this space is the desire to
> >> define a "memremap" api to clean up __iomem abuses for cases where
> >> "memory-like" mappings are needed.
> >>
> >> https://lkml.org/lkml/2015/6/22/100
> >
> > Interesting. I had missed this. There is a similar question about
> > semantics (ordering etc...), ie, are they the same as memory for
> > example ?
> >
> > Another thing we might look into is to what extent should we provide
> > access to the "SAO" mapping attribute that POWER7 and later support
> > (strong ordering, pretty-much x86 like) and whether this can be used
> > on ppc to reduce the need for barriers (that attribute is only available
> > for fully cachable mappings, not generally applicable to IO mappings).
> >
> > That translate to: should your new memremap() take some kinds of flags
> > as an argument ? Though of course providing a cross-arch definition of
> > these flags would be tricky.
> 
> At some point, it would also be nice if the various macros has
> well-defined semantics.  For example, x86 has:
> 
> #define pgprot_noncached(prot)                                          \
>         ((boot_cpu_data.x86 > 3)                                        \
>          ? (__pgprot(pgprot_val(prot) |                                 \
>                      cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
>          : (prot))
> 
> Putting aside the pointless boot_cpu_data check (surely the recent PAT
> rework completely obsoletes it), what is
> pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
> results in garbage.  Should it have well-defined behavior instead?

I never thought composing pgprot_* macros/functions is supposed to
return a combined attribute. On arm32/arm64, this construct is just
returning the outermost prot, i.e. noncached here. Even if we would want
to allow such combination, we don't have enough software PTE bits for
each prot type, so these macros simply generate the corresponding
hardware bits (on newer ARM cores, that's a 3-bit index).

FWIW, on ARM, pgprot_noncached has stronger ordering semantics than what
ioremap_nocache returns and in many (most) cases it's not the
appropriate type for ARM (e.g. video framebuffers should use
writecombine since noncached doesn't even allow unaligned accesses).

-- 
Catalin


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-06  9:52       ` Catalin Marinas
@ 2015-07-06 17:14         ` Andy Lutomirski
  2015-07-06 22:04           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Lutomirski @ 2015-07-06 17:14 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: ksummit-discuss

On Mon, Jul 6, 2015 at 2:52 AM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Sun, Jul 05, 2015 at 07:55:39PM +0100, Andy Lutomirski wrote:
>>
>> At some point, it would also be nice if the various macros has
>> well-defined semantics.  For example, x86 has:
>>
>> #define pgprot_noncached(prot)                                          \
>>         ((boot_cpu_data.x86 > 3)                                        \
>>          ? (__pgprot(pgprot_val(prot) |                                 \
>>                      cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)))     \
>>          : (prot))
>>
>> Putting aside the pointless boot_cpu_data check (surely the recent PAT
>> rework completely obsoletes it), what is
>> pgprot_noncached(pgprot_writecombine(x)) supposed to do?  Currently it
>> results in garbage.  Should it have well-defined behavior instead?
>
> I never thought composing pgprot_* macros/functions is supposed to
> return a combined attribute. On arm32/arm64, this construct is just
> returning the outermost prot, i.e. noncached here. Even if we would want
> to allow such combination, we don't have enough software PTE bits for
> each prot type, so these macros simply generate the corresponding
> hardware bits (on newer ARM cores, that's a 3-bit index).
>

I should have said it more clearly: I think that this construct
*should* result in the outermost prot.  That is: pgprot_xyz(p) should
have mode xyz regardless of p.

x86 doesn't work that way right now.  Instead pgprot_xyz(p) returns
garbage if p has prot bits set.
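
The difference between the two behaviours can be sketched in plain userspace C.
This is only an illustration under stated assumptions: the 2-bit cache-mode
field, its bit values, and the helper names prot_set_or_only() and
prot_set_masked() are hypothetical and are not the real x86 encoding or any
kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit cache-mode field; values are illustrative only. */
#define CACHE_MASK 0x3u
#define MODE_WB    0x0u
#define MODE_WC    0x1u
#define MODE_UC    0x2u

typedef uint32_t prot_t;

/* Extract the cache-mode field from a protection value. */
static inline prot_t cache_mode(prot_t p)
{
	return p & CACHE_MASK;
}

/* OR-only style: sets the new mode bits but never clears the old ones. */
static inline prot_t prot_set_or_only(prot_t p, prot_t mode)
{
	assert((mode & ~CACHE_MASK) == 0);
	return p | mode;
}

/* Clear-then-set style: the result always carries exactly the requested mode. */
static inline prot_t prot_set_masked(prot_t p, prot_t mode)
{
	assert((mode & ~CACHE_MASK) == 0);
	return (p & ~CACHE_MASK) | mode;
}
```

With these made-up values, applying MODE_UC on top of MODE_WC in the OR-only
variant leaves both bits set, which matches neither mode, while the masked
variant yields the outermost mode and leaves the non-cache bits alone.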


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-05 18:55     ` Andy Lutomirski
  2015-07-05 19:56       ` Benjamin Herrenschmidt
  2015-07-06  9:52       ` Catalin Marinas
@ 2015-07-06 19:11       ` Luck, Tony
  2 siblings, 0 replies; 15+ messages in thread
From: Luck, Tony @ 2015-07-06 19:11 UTC (permalink / raw)
  To: Andy Lutomirski, Benjamin Herrenschmidt; +Cc: ksummit-discuss

>> Another side topic that has come up in this space is the desire to
>> define a "memremap" api to clean up __iomem abuses for cases where
>> "memory-like" mappings are needed.
>>
>> https://lkml.org/lkml/2015/6/22/100
>
> Interesting. I had missed this. There is a similar question about
> semantics (ordering etc...), ie, are they the same as memory for
> example ?

The drivers/acpi/apei/* usages are just mapping bits of normal memory that
happens to be BIOS reserved.  ioremap*() got used because it was available
and did the right thing with the page tables even though it broke the __iomem
annotations.  memremap() sounds like a great idea for these.

-Tony


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-06  9:33         ` Will Deacon
@ 2015-07-06 22:02           ` Benjamin Herrenschmidt
  2015-07-07  9:56             ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-06 22:02 UTC (permalink / raw)
  To: Will Deacon; +Cc: ksummit-discuss

On Mon, 2015-07-06 at 10:33 +0100, Will Deacon wrote:
> 
> We've ended up doing whatever drivers start to rely on from running on
> x86, which gives rise to some sort of de-facto semantics, but it's not
> necessarily efficient or portable.

Correct, which is why I would like to start documenting what is
mandated/guaranteed and separately what is the expected behaviour for
non-guaranteed bits on each arch.

> On arm64, ioremap == ioremap_nocache, which gives strong ordering
> guarantees but forbids things like unaligned access.

Ok, same for us. Except ordering guarantees aren't even that strong ...

>  ioremap_wc gives a
> more relaxed mapping, which is non-cached but allows re-ordering and
> unaligned access.

Ok, our other mapping (G=0) weakens ordering even more but won't allow
unaligned either. We don't have a non-cachable mapping that allows
unaligned accesses at all in fact :-( I've been fighting with our HW
guys on that one, but they keep thinking it's not useful.
 
> ioremap_wt is new and strange, but rmk and I were going down the same
> route as ioremap_wc for that, because people expect to be able to do
> blind memcpy with those pointers.

Ok, powerpc architecturally supports WT but no recent implementation
does. I'm not sure what is the practical purpose.

> As for ordering of writeX/readX wrt DMA, our IO accessors are so
> insanely
> heavyweight that I don't think the ioremap flavour matters atm.

This is the same for us, but that also means in our case that writeX
will not combine on ioremap_wc(), only relaxed_writeX() might after we
change it to be something else than writeX(). What is the situation for
you ?

Ben.


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-06 17:14         ` Andy Lutomirski
@ 2015-07-06 22:04           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-06 22:04 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss

On Mon, 2015-07-06 at 10:14 -0700, Andy Lutomirski wrote:

> I should have said it more clearly: I think that this construct
> *should* result in the outermost prot.  That is: pgprot_xyz(p) should
> have mode xyz regardless of p.
> 
> x86 doesn't work that way right now.  Instead pgprot_xyz(p) returns
> garbage if p has prot bits set.

Agreed. It should.

Ben.


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-04  8:17 [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs Benjamin Herrenschmidt
  2015-07-04 14:12 ` Dan Williams
@ 2015-07-07  0:01 ` Luis R. Rodriguez
  1 sibling, 0 replies; 15+ messages in thread
From: Luis R. Rodriguez @ 2015-07-07  0:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Toshi Kani, ksummit-discuss

On Sat, Jul 04, 2015 at 06:17:17PM +1000, Benjamin Herrenschmidt wrote:
> Allright, it's that time of year ... So here's my attempt at getting
> myself invited :-)
> 
> We've been talking about some of that on-list recently, and it might
> well be that this will trigger a resolution before we even reach KS, but
> I though it might be worthwhile to gather enough people from various
> arch together to hash things out:
> 
> We have a pile of mapping attributes (more showing up recently),
> typically used for MMIO mappings (but not necessarily exclusively).
> ioremap_cache and ioremap_nocache are the old/common ones, but we have
> _wc (write combine), _wt (write through) and possibly more around the
> corner.

ioremap_uc() is now merged; its user, however, is not upstream and requires
addressing some general architecture definitions of ioremap_uc().

> What are their precise semantics accross all architecture ? This is not
> clear (not documented). For example, we define writel(), readl() and
> friends as being fully ordered vs each other but also vs DMA etc... but
> on what mapping types do they have this property ?
> 
> Will _wc() provide the write-combine ability for writel() on all archs ?
> Or does it require writel_relaxed() on some ? Will _wc() bring other
> side effects such as loss of read vs. write ordering ? (on some archs at
> least...). Etc....
> 
> There is a growing matrix of MMIO accessors and mapping types whose
> semantics are poorly (if at all) defined. We cannot define them all
> exactly for all architectures as there are too many differences that
> will impact them. But we should be able to guarantee at least *some*,
> ie, whether a given type of ordering is guaranteed or not by a given
> accessor on a given mapping type, whether write combine (if supported at
> all) will happen with a given accessor or not etc...

Other than this, one complexity I found no one too comfortable
considering is the effect of overlapping ioremap() calls and whether
or not they should be supported. There are possible aliasing issues here, but
again, what the hardware does if one goes through with it can vary.
Not only are there ioremap() implementations to consider but also further
mapping modifications (arch_phys / old MTRR calls, then set_memory_*() are
examples). For instance:

 a) an undocumented grammatical requirement here is
    that an arch_phys_wc_add() call requires a respective ioremap_wc() call.
    Such requirements can be both implemented and later ensured with Coccinelle
    SmPL; should we really wish to enforce a combinatorial set
    (make coccicheck M=drivers/path/) we can do so by adding this as a
    rule.

 b) Although you get no error from it, set_memory_wc() will not work
    on I/O memory because, as Toshi noted to me a while ago:

    1. __pa(addr) returns a fake address since there is no direct map.
    2. reserve_memtype() tracks regular memory and I/O memory differently.
       For regular memory, set_memory_*() can modify WB with a new type since
       reserve_memtype() does not track WB.  For I/O memory, reserve_memtype()
       detects a conflict when a given type is different from a tracked type.

If we are to grow ioremap calls, and other modifiers such as the above
we should not only consider documenting semantics but actually
defining a proper semantic set with something like coccinelle rules.
Then driver developers / maintainers can check / vet their use with
things like coccicheck.

> As for who should participate, well, at least one rep from each major
> arch who is familiar with the intricacies of the architecture memory
> model I would say, possibly others who dabbled in that stuff recently
> such as Luis R. Rodriguez <mcgrof@suse.com> who was proposing patch
> series lately to consolidate the use of _wc.

Dan Williams <dan.j.williams@intel.com> and Toshi Kani <toshi.kani@hp.com>
could obviously also contribute to this discussion. Julia for the semantic
side of things.

 Luis


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-06 22:02           ` Benjamin Herrenschmidt
@ 2015-07-07  9:56             ` Will Deacon
  2015-07-07 10:29               ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2015-07-07  9:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Mon, Jul 06, 2015 at 11:02:06PM +0100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-07-06 at 10:33 +0100, Will Deacon wrote:
> > On arm64, ioremap == ioremap_nocache, which gives strong ordering
> > guarantees but forbids things like unaligned access.
> 
> Ok, same for us. Except ordering guarantees aren't even that strong ...

By "strong", I mean "ordered with respect to each other" and not subject
to "gathering" (more on that below).

> >  ioremap_wc gives a
> > more relaxed mapping, which is non-cached but allows re-ordering and
> > unaligned access.
> 
> Ok, our other mapping (G=0) weakens ordering even more but won't allow
> unaligned either. We don't have a non-cachable mapping that allows
> unaligned accesses at all in fact :-( I've been fighting with our HW
> guys on that one, but they keep thinking it's not useful.

Yikes. How do you deal with that? I've seen GCC perform idiom recognition
on calls to memset/memcpy assuming unaligned access on arm64.

> > ioremap_wt is new and strange, but rmk and I were going down the same
> > route as ioremap_wc for that, because people expect to be able to do
> > blind memcpy with those pointers.
> 
> Ok, powerpc architecturally supports WT but no recent implementation
> does. I'm not sure what is the practical purpose.

That's a similar story for us. In terms of normal memory, we basically
have writeback-cacheable and non-cacheable.

> > As for ordering of writeX/readX wrt DMA, our IO accessors are so
> > insanely
> > heavyweight that I don't think the ioremap flavour matters atm.
> 
> This is the same for us, but that also means in our case that writeX
> will not combine on ioremap_wc(), only relaxed_writeX() might after we
> change it to be something else than writeX(). What is the situation for
> you ?

The barriers between the writes will forbid any combining. In fact, it
would make the mapping look an awful lot like a plain old ioremap except
that a readX could be speculated and unaligned access is permitted.

As long as the accessors are required to enforce ordering that the
underlying memory type is incapable of providing, I don't see what we
could do to solve this (somehow make readX/writeX behaviour dependent on
the pointer?).
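
To make the combining point concrete, here is a toy userspace model, not a
description of any real CPU or bus. It assumes that consecutive writes land in
a single combining buffer and that any barrier closes the current burst; the
names mmio_op and count_bursts() are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* A trace of what an accessor sequence emits: stores and barriers. */
enum mmio_op { MMIO_WRITE, MMIO_BARRIER };

/*
 * Count how many bursts reach the device in this toy model. A burst
 * starts at the first write after a barrier (or at the start of the
 * trace) and absorbs every following write until the next barrier.
 */
static inline size_t count_bursts(const enum mmio_op *ops, size_t n)
{
	size_t bursts = 0;
	int open = 0;	/* non-zero while a burst is accumulating writes */
	size_t i;

	assert(ops != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (ops[i] == MMIO_WRITE) {
			if (!open) {
				bursts++;
				open = 1;
			}
		} else {
			open = 0;	/* a barrier ends the current burst */
		}
	}
	return bursts;
}
```

In this model a writeX-style trace (barrier after every write) produces one
burst per write, while a relaxed trace with one trailing barrier produces a
single burst, which is the behaviour being contrasted above.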

Will


* Re: [Ksummit-discuss] [TECH TOPIC] Semantics of MMIO mapping attributes accross archs
  2015-07-07  9:56             ` Will Deacon
@ 2015-07-07 10:29               ` Will Deacon
  0 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2015-07-07 10:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ksummit-discuss

On Tue, Jul 07, 2015 at 10:56:15AM +0100, Will Deacon wrote:
> As long as the accessors are required to enforce ordering that the
> underlying memory type is incapable of providing, I don't see what we
> could do to solve this (somehow make readX/writeX behaviour dependent on
> the pointer?).

... which we could do using pointer tags, but christ is that going to be
confusing to use (casting an io-relaxed pointer to order against DMA!).
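
For illustration only, a userspace sketch of what a tag-dependent accessor
could look like. Everything here is an assumption made for the example: the
tag lives in the low pointer bit (a real implementation might use something
else entirely, such as high address bits), the barrier is replaced by a
counter, and io_tag_relaxed(), io_untag() and tagged_writel() are hypothetical
names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag: the low bit, which is free for 4-byte-aligned registers. */
#define IO_RELAXED_TAG ((uintptr_t)1)

/* Mark a pointer as referring to a relaxed (e.g. write-combining) mapping. */
static inline void *io_tag_relaxed(void *p)
{
	assert(((uintptr_t)p & IO_RELAXED_TAG) == 0);	/* needs an aligned pointer */
	return (void *)((uintptr_t)p | IO_RELAXED_TAG);
}

static inline int io_is_relaxed(const void *p)
{
	return ((uintptr_t)p & IO_RELAXED_TAG) != 0;
}

static inline void *io_untag(void *p)
{
	return (void *)((uintptr_t)p & ~IO_RELAXED_TAG);
}

/*
 * Store val through addr. For untagged pointers, bump *barriers first;
 * the counter stands in for a real ordering barrier in this sketch.
 */
static inline void tagged_writel(uint32_t val, void *addr, unsigned int *barriers)
{
	volatile uint32_t *reg = (volatile uint32_t *)io_untag(addr);

	if (!io_is_relaxed(addr))
		(*barriers)++;
	*reg = val;
}
```

The awkwardness mentioned above shows up immediately: a caller that wants
ordering against DMA on a relaxed mapping has to strip the tag (or cast) before
calling the accessor, which is easy to get wrong.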

Will
