linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] ioremap_wc on arm64
@ 2017-05-22  7:01 Jayachandran C
  2017-05-22  7:01 ` [PATCH 1/2] arm64: add PROT_DEVICE_GRE for Device GRE mapping Jayachandran C
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jayachandran C @ 2017-05-22  7:01 UTC (permalink / raw)
  To: linux-arm-kernel

>From its definition, the device "gather" attribute seems to be a better
fit for implementing write combining mapping in ioremap_wc().  And on
ThunderX2, Device GRE mapping has optimizations that makes it much faster
than normal uncached mapping.

I am not sure of the reasoning behind the original decision to make
ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
of ioremap use device attributes, and ioremap_wc looks like an exception.

Comments are very welcome.

Thanks,
JC.

Jayachandran C (2):
  arm64: add PROT_DEVICE_GRE for Device GRE mapping
  arm64: switch ioremap_wc to use Device GRE

 arch/arm64/include/asm/io.h           | 2 +-
 arch/arm64/include/asm/pgtable-prot.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] arm64: add PROT_DEVICE_GRE for Device GRE mapping
  2017-05-22  7:01 [PATCH 0/2] ioremap_wc on arm64 Jayachandran C
@ 2017-05-22  7:01 ` Jayachandran C
  2017-05-22  7:01 ` [PATCH 2/2] arm64: switch ioremap_wc to use Device GRE Jayachandran C
  2017-05-22  8:56 ` [PATCH 0/2] ioremap_wc on arm64 Catalin Marinas
  2 siblings, 0 replies; 12+ messages in thread
From: Jayachandran C @ 2017-05-22  7:01 UTC (permalink / raw)
  To: linux-arm-kernel

The MAIR is programmed with attribute "Device GRE" (0x0C) at index 3,
but there is no corresponding page protection value defined. Make the
page attribute available for use by defining PROT_DEVICE_GRE.

Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
---
 arch/arm64/include/asm/pgtable-prot.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 2142c77..216f9f9 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -39,6 +39,7 @@
 
 #define PROT_DEVICE_nGnRnE	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE))
 #define PROT_DEVICE_nGnRE	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE))
+#define PROT_DEVICE_GRE		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_GRE))
 #define PROT_NORMAL_NC		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC))
 #define PROT_NORMAL_WT		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT))
 #define PROT_NORMAL		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] arm64: switch ioremap_wc to use Device GRE
  2017-05-22  7:01 [PATCH 0/2] ioremap_wc on arm64 Jayachandran C
  2017-05-22  7:01 ` [PATCH 1/2] arm64: add PROT_DEVICE_GRE for Device GRE mapping Jayachandran C
@ 2017-05-22  7:01 ` Jayachandran C
  2017-05-22  8:56 ` [PATCH 0/2] ioremap_wc on arm64 Catalin Marinas
  2 siblings, 0 replies; 12+ messages in thread
From: Jayachandran C @ 2017-05-22  7:01 UTC (permalink / raw)
  To: linux-arm-kernel

The device "Gather" attribute is defined in the ARMv8 ARM as:
"The Gathering attribute determines whether it is permissible for either:
* Multiple memory accesses of the same type, read or write, to the same
  memory location to be merged into a single transaction.
* Multiple memory accesses of the same type, read or write, to different
  memory locations to be merged into a single memory transaction on an
  interconnect."

This attribute is more appropriate for the write combining ioremap_wc()
than the non-cached memory attribute used now, so update the ioremap_wc
implementation to use it.

Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
---
 arch/arm64/include/asm/io.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 35b2e50..b1c92e3 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -168,7 +168,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 
 #define ioremap(addr, size)		__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_nocache(addr, size)	__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
-#define ioremap_wc(addr, size)		__ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
+#define ioremap_wc(addr, size)		__ioremap((addr), (size), __pgprot(PROT_DEVICE_GRE))
 #define ioremap_wt(addr, size)		__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define iounmap				__iounmap
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22  7:01 [PATCH 0/2] ioremap_wc on arm64 Jayachandran C
  2017-05-22  7:01 ` [PATCH 1/2] arm64: add PROT_DEVICE_GRE for Device GRE mapping Jayachandran C
  2017-05-22  7:01 ` [PATCH 2/2] arm64: switch ioremap_wc to use Device GRE Jayachandran C
@ 2017-05-22  8:56 ` Catalin Marinas
  2017-05-22 11:53   ` Jayachandran C
  2 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2017-05-22  8:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> From its definition, the device "gather" attribute seems to be a better
> fit for implementing write combining mapping in ioremap_wc().  And on
> ThunderX2, Device GRE mapping has optimizations that makes it much faster
> than normal uncached mapping.
> 
> I am not sure of the reasoning behind the original decision to make
> ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> of ioremap use device attributes, and ioremap_wc looks like an exception.

The reason we kept it as Normal NC is that Device_GRE does not allow
unaligned accesses.

-- 
Catalin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22  8:56 ` [PATCH 0/2] ioremap_wc on arm64 Catalin Marinas
@ 2017-05-22 11:53   ` Jayachandran C
  2017-05-22 11:58     ` Alexander Graf
  2017-05-22 12:22     ` Will Deacon
  0 siblings, 2 replies; 12+ messages in thread
From: Jayachandran C @ 2017-05-22 11:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > From its definition, the device "gather" attribute seems to be a better
> > fit for implementing write combining mapping in ioremap_wc().  And on
> > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > than normal uncached mapping.
> > 
> > I am not sure of the reasoning behind the original decision to make
> > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > of ioremap use device attributes, and ioremap_wc looks like an exception.
> 
> The reason we kept it as Normal NC is that Device_GRE does not allow
> unaligned accesses.

There does not to be an expectation to have unaligned access on __iomem
pointers. I also see that memremap can call ioremap_wc (which is a Normal
mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
as well.

Was this added for a specific use case? Also, do you think this patchset
is acceptable?

Thanks,
JC.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22 11:53   ` Jayachandran C
@ 2017-05-22 11:58     ` Alexander Graf
  2017-05-22 12:22     ` Will Deacon
  1 sibling, 0 replies; 12+ messages in thread
From: Alexander Graf @ 2017-05-22 11:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/22/2017 01:53 PM, Jayachandran C wrote:
> On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
>> On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
>>>  From its definition, the device "gather" attribute seems to be a better
>>> fit for implementing write combining mapping in ioremap_wc().  And on
>>> ThunderX2, Device GRE mapping has optimizations that makes it much faster
>>> than normal uncached mapping.
>>>
>>> I am not sure of the reasoning behind the original decision to make
>>> ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
>>> of ioremap use device attributes, and ioremap_wc looks like an exception.
>> The reason we kept it as Normal NC is that Device_GRE does not allow
>> unaligned accesses.
> There does not to be an expectation to have unaligned access on __iomem
> pointers. I also see that memremap can call ioremap_wc (which is a Normal
> mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> as well.
>
> Was this added for a specific use case? Also, do you think this patchset
> is acceptable?

I think the patch description in patch 2/2 needs to be more explicit as 
to why mapping device memory as normal is a problem.


Alex

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22 11:53   ` Jayachandran C
  2017-05-22 11:58     ` Alexander Graf
@ 2017-05-22 12:22     ` Will Deacon
  2017-05-22 14:21       ` Jayachandran C
  1 sibling, 1 reply; 12+ messages in thread
From: Will Deacon @ 2017-05-22 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 11:53:50AM +0000, Jayachandran C wrote:
> On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> > On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > > From its definition, the device "gather" attribute seems to be a better
> > > fit for implementing write combining mapping in ioremap_wc().  And on
> > > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > > than normal uncached mapping.
> > > 
> > > I am not sure of the reasoning behind the original decision to make
> > > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > > of ioremap use device attributes, and ioremap_wc looks like an exception.
> > 
> > The reason we kept it as Normal NC is that Device_GRE does not allow
> > unaligned accesses.
> 
> There does not to be an expectation to have unaligned access on __iomem
> pointers. I also see that memremap can call ioremap_wc (which is a Normal
> mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> as well.
> 
> Was this added for a specific use case? Also, do you think this patchset
> is acceptable?

I think it was used for framebuffers. Note that using normal-NC also aligns
with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
than Device-GRE, I'm really not keen making on this change.

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22 12:22     ` Will Deacon
@ 2017-05-22 14:21       ` Jayachandran C
  2017-05-22 15:49         ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: Jayachandran C @ 2017-05-22 14:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> On Mon, May 22, 2017 at 11:53:50AM +0000, Jayachandran C wrote:
> > On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> > > On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > > > From its definition, the device "gather" attribute seems to be a better
> > > > fit for implementing write combining mapping in ioremap_wc().  And on
> > > > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > > > than normal uncached mapping.
> > > > 
> > > > I am not sure of the reasoning behind the original decision to make
> > > > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > > > of ioremap use device attributes, and ioremap_wc looks like an exception.
> > > 
> > > The reason we kept it as Normal NC is that Device_GRE does not allow
> > > unaligned accesses.
> > 
> > There does not to be an expectation to have unaligned access on __iomem
> > pointers. I also see that memremap can call ioremap_wc (which is a Normal
> > mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> > as well.
> > 
> > Was this added for a specific use case? Also, do you think this patchset
> > is acceptable?
> 
> I think it was used for framebuffers. Note that using normal-NC also aligns
> with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> than Device-GRE, I'm really not keen making on this change.

If there is a hardware requirement that Normal-NC has to be implemented
as more relaxed attribute than Device GRE, I can go back to the hardware
team on this. I did not see any text in the ARM ARM requiring this.

Even if that is the case, having Normal attribute for ioremap just for
ioremap_wc is inconsistent. The Device Gathering attribute by its definition
is exactly what write combining requires. The memremap() API looks like a
better way to expose Normal-NC mapping (with additional features like
speculation etc.) if implemented correctly.

Also, patch 1/2 should be useful anyway - can that be picked up?

Thanks,
JC.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22 14:21       ` Jayachandran C
@ 2017-05-22 15:49         ` Catalin Marinas
  2017-05-23 10:09           ` Jayachandran C
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2017-05-22 15:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 02:21:00PM +0000, Jayachandran C wrote:
> On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> > On Mon, May 22, 2017 at 11:53:50AM +0000, Jayachandran C wrote:
> > > On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> > > > On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > > > > From its definition, the device "gather" attribute seems to be a better
> > > > > fit for implementing write combining mapping in ioremap_wc().  And on
> > > > > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > > > > than normal uncached mapping.
> > > > > 
> > > > > I am not sure of the reasoning behind the original decision to make
> > > > > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > > > > of ioremap use device attributes, and ioremap_wc looks like an exception.
> > > > 
> > > > The reason we kept it as Normal NC is that Device_GRE does not allow
> > > > unaligned accesses.
> > > 
> > > There does not to be an expectation to have unaligned access on __iomem
> > > pointers. I also see that memremap can call ioremap_wc (which is a Normal
> > > mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> > > as well.
> > > 
> > > Was this added for a specific use case? Also, do you think this patchset
> > > is acceptable?
> > 
> > I think it was used for framebuffers. Note that using normal-NC also aligns
> > with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> > than Device-GRE, I'm really not keen making on this change.
> 
> If there is a hardware requirement that Normal-NC has to be implemented
> as more relaxed attribute than Device GRE, I can go back to the hardware
> team on this. I did not see any text in the ARM ARM requiring this.

It depends what you mean by "relaxed". Normal NC allows unaligned
accesses and also allows speculative loads (you don't get any ordering
guarantees of the Device memory). If you mean skipping some transparent
cache snooping, it may break other use-cases.

> Even if that is the case, having Normal attribute for ioremap just for
> ioremap_wc is inconsistent. The Device Gathering attribute by its definition
> is exactly what write combining requires. The memremap() API looks like a
> better way to expose Normal-NC mapping (with additional features like
> speculation etc.) if implemented correctly.

I agree, memremap() is a better way but this was merged in 4.3 while
ioremap_wc() had been around for a long time. If you want to change its
semantics, you'd need to go through all its uses in the kernel and make
sure they *only* use I/O accessors with such memory. At a quick grep,
there are several places where the __iomem pointer attribute is dropped
shortly after ioremap() and the pointer is accessed directly. That's
where things will break with Device GRE memory since the compiler
doesn't guarantee aligned accesses.

> Also, patch 1/2 should be useful anyway - can that be picked up?

The patch is harmless but are we going to have a user of Device_GRE?

-- 
Catalin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-22 15:49         ` Catalin Marinas
@ 2017-05-23 10:09           ` Jayachandran C
  2017-05-23 14:08             ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: Jayachandran C @ 2017-05-23 10:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 04:49:47PM +0100, Catalin Marinas wrote:
> On Mon, May 22, 2017 at 02:21:00PM +0000, Jayachandran C wrote:
> > On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> > > On Mon, May 22, 2017 at 11:53:50AM +0000, Jayachandran C wrote:
> > > > On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> > > > > On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > > > > > From its definition, the device "gather" attribute seems to be a better
> > > > > > fit for implementing write combining mapping in ioremap_wc().  And on
> > > > > > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > > > > > than normal uncached mapping.
> > > > > > 
> > > > > > I am not sure of the reasoning behind the original decision to make
> > > > > > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > > > > > of ioremap use device attributes, and ioremap_wc looks like an exception.
> > > > > 
> > > > > The reason we kept it as Normal NC is that Device_GRE does not allow
> > > > > unaligned accesses.
> > > > 
> > > > There does not to be an expectation to have unaligned access on __iomem
> > > > pointers. I also see that memremap can call ioremap_wc (which is a Normal
> > > > mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> > > > as well.
> > > > 
> > > > Was this added for a specific use case? Also, do you think this patchset
> > > > is acceptable?
> > > 
> > > I think it was used for framebuffers. Note that using normal-NC also aligns
> > > with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> > > than Device-GRE, I'm really not keen making on this change.
> > 
> > If there is a hardware requirement that Normal-NC has to be implemented
> > as more relaxed attribute than Device GRE, I can go back to the hardware
> > team on this. I did not see any text in the ARM ARM requiring this.
> 
> It depends what you mean by "relaxed". Normal NC allows unaligned
> accesses and also allows speculative loads (you don't get any ordering
> guarantees of the Device memory). If you mean skipping some transparent
> cache snooping, it may break other use-cases.

Ok. I was using Will's terminology here. This is not the main issue,
but worth understanding. If there is a requirement in ARM spec that
"Normal NC" must be "strictly more releaxed" than "Device GRE", I will
have to point that out to our hardware team.

> > Even if that is the case, having Normal attribute for ioremap just for
> > ioremap_wc is inconsistent. The Device Gathering attribute by its definition
> > is exactly what write combining requires. The memremap() API looks like a
> > better way to expose Normal-NC mapping (with additional features like
> > speculation etc.) if implemented correctly.
> 
> I agree, memremap() is a better way but this was merged in 4.3 while
> ioremap_wc() had been around for a long time. If you want to change its
> semantics, you'd need to go through all its uses in the kernel and make
> sure they *only* use I/O accessors with such memory. At a quick grep,
> there are several places where the __iomem pointer attribute is dropped
> shortly after ioremap() and the pointer is accessed directly. That's
> where things will break with Device GRE memory since the compiler
> doesn't guarantee aligned accesses.

There is a difference in behavior with regard to unaligned access between
ioremap_wc and ioremap already. I don't follow why unaligned access has to
supported for ioremap_wc but not for ioremap - it would be much better to
be consistent here.

> > Also, patch 1/2 should be useful anyway - can that be picked up?
> 
> The patch is harmless but are we going to have a user of Device_GRE?

This patch would make the attribute in MAIR and the index MT_DEVICE_GRE
useful for __ioremap. Otherwise I don't see the use in having the
attribute at  all. I can at least give this to people who want Device GRE
mapping, which is the fastest way to access device memory on ThunderX2.

JC.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-23 10:09           ` Jayachandran C
@ 2017-05-23 14:08             ` Catalin Marinas
  2017-05-24 13:33               ` Jayachandran C
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2017-05-23 14:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 23, 2017 at 10:09:03AM +0000, Jayachandran C wrote:
> On Mon, May 22, 2017 at 04:49:47PM +0100, Catalin Marinas wrote:
> > On Mon, May 22, 2017 at 02:21:00PM +0000, Jayachandran C wrote:
> > > On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> > > > I think it was used for framebuffers. Note that using normal-NC also aligns
> > > > with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> > > > than Device-GRE, I'm really not keen making on this change.
> > > 
> > > Even if that is the case, having Normal attribute for ioremap just for
> > > ioremap_wc is inconsistent. The Device Gathering attribute by its definition
> > > is exactly what write combining requires. The memremap() API looks like a
> > > better way to expose Normal-NC mapping (with additional features like
> > > speculation etc.) if implemented correctly.
> > 
> > I agree, memremap() is a better way but this was merged in 4.3 while
> > ioremap_wc() had been around for a long time. If you want to change its
> > semantics, you'd need to go through all its uses in the kernel and make
> > sure they *only* use I/O accessors with such memory. At a quick grep,
> > there are several places where the __iomem pointer attribute is dropped
> > shortly after ioremap() and the pointer is accessed directly. That's
> > where things will break with Device GRE memory since the compiler
> > doesn't guarantee aligned accesses.
> 
> There is a difference in behavior with regard to unaligned access between
> ioremap_wc and ioremap already. I don't follow why unaligned access has to
> supported for ioremap_wc but not for ioremap - it would be much better to
> be consistent here.

Please re-read my paragraph above. It's nice if both ioremap_wc() and
ioremap() returned Device memory as long as you fix the ioremap_wc()
misuses throughout the kernel.

> > > Also, patch 1/2 should be useful anyway - can that be picked up?
> > 
> > The patch is harmless but are we going to have a user of Device_GRE?
> 
> This patch would make the attribute in MAIR and the index MT_DEVICE_GRE
> useful for __ioremap. Otherwise I don't see the use in having the
> attribute at  all. I can at least give this to people who want Device GRE
> mapping, which is the fastest way to access device memory on ThunderX2.

Is this a driver that's going to be used on arm64 kernels only? I'm not
that keen on __ioremap() since it's not a standard driver API, so you
end up with #ifdef CONFIG_ARM64 in the driver code.

-- 
Catalin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] ioremap_wc on arm64
  2017-05-23 14:08             ` Catalin Marinas
@ 2017-05-24 13:33               ` Jayachandran C
  0 siblings, 0 replies; 12+ messages in thread
From: Jayachandran C @ 2017-05-24 13:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 23, 2017 at 03:08:24PM +0100, Catalin Marinas wrote:
> On Tue, May 23, 2017 at 10:09:03AM +0000, Jayachandran C wrote:
> > On Mon, May 22, 2017 at 04:49:47PM +0100, Catalin Marinas wrote:
> > > On Mon, May 22, 2017 at 02:21:00PM +0000, Jayachandran C wrote:
> > > > On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> > > > > I think it was used for framebuffers. Note that using normal-NC also aligns
> > > > > with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> > > > > than Device-GRE, I'm really not keen making on this change.
> > > > 
> > > > Even if that is the case, having Normal attribute for ioremap just for
> > > > ioremap_wc is inconsistent. The Device Gathering attribute by its definition
> > > > is exactly what write combining requires. The memremap() API looks like a
> > > > better way to expose Normal-NC mapping (with additional features like
> > > > speculation etc.) if implemented correctly.
> > > 
> > > I agree, memremap() is a better way but this was merged in 4.3 while
> > > ioremap_wc() had been around for a long time. If you want to change its
> > > semantics, you'd need to go through all its uses in the kernel and make
> > > sure they *only* use I/O accessors with such memory. At a quick grep,
> > > there are several places where the __iomem pointer attribute is dropped
> > > shortly after ioremap() and the pointer is accessed directly. That's
> > > where things will break with Device GRE memory since the compiler
> > > doesn't guarantee aligned accesses.
> > 
> > There is a difference in behavior with regard to unaligned access between
> > ioremap_wc and ioremap already. I don't follow why unaligned access has to
> > supported for ioremap_wc but not for ioremap - it would be much better to
> > be consistent here.
> 
> Please re-read my paragraph above.

I had the same feeling after seeing your reply :)

> It's nice if both ioremap_wc() and
> ioremap() returned Device memory as long as you fix the ioremap_wc()
> misuses throughout the kernel.

ioremap and ioremap_wc is been used and abused in multiple places after
dropping the __iomem annotation, and it is not an easy task to figure
out where unaligned access is done on the pointer returned. I am told
specifically that in ioremap_wc needs to support unaligned access as
opposed to ioremap.

I am trying to figure out if there are specific cases you expect it to
fail. Since arm64 is a fairly new architecture and it will be easy
to get this correct now, rather than to let the status quo continue.

There is no "Gather" attribute on arm, so using "Normal uncached" for
ioremap_wc is understandable. On arm64, given that there is an attribute
specifically for write combining - you have decided to go with Normal
uncached. I had hoped for a proper reason for that and not a vague
"maybe framebuffer, go grep the kernel and figure out" response.

> > > > Also, patch 1/2 should be useful anyway - can that be picked up?
> > > 
> > > The patch is harmless but are we going to have a user of Device_GRE?
> > 
> > This patch would make the attribute in MAIR and the index MT_DEVICE_GRE
> > useful for __ioremap. Otherwise I don't see the use in having the
> > attribute at  all. I can at least give this to people who want Device GRE
> > mapping, which is the fastest way to access device memory on ThunderX2.
> 
> Is this a driver that's going to be used on arm64 kernels only? I'm not
> that keen on __ioremap() since it's not a standard driver API, so you
> end up with #ifdef CONFIG_ARM64 in the driver code.

FWIW, __ioremap() is EXPORT_SYMBOL. Having the attribute will be useful.

JC.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-05-24 13:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-22  7:01 [PATCH 0/2] ioremap_wc on arm64 Jayachandran C
2017-05-22  7:01 ` [PATCH 1/2] arm64: add PROT_DEVICE_GRE for Device GRE mapping Jayachandran C
2017-05-22  7:01 ` [PATCH 2/2] arm64: switch ioremap_wc to use Device GRE Jayachandran C
2017-05-22  8:56 ` [PATCH 0/2] ioremap_wc on arm64 Catalin Marinas
2017-05-22 11:53   ` Jayachandran C
2017-05-22 11:58     ` Alexander Graf
2017-05-22 12:22     ` Will Deacon
2017-05-22 14:21       ` Jayachandran C
2017-05-22 15:49         ` Catalin Marinas
2017-05-23 10:09           ` Jayachandran C
2017-05-23 14:08             ` Catalin Marinas
2017-05-24 13:33               ` Jayachandran C

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).