linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] uio: Fix bus error that access memory mapped by physical
@ 2021-06-23  6:52 wubian
  2021-06-23  7:14 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: wubian @ 2021-06-23  6:52 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, wubian

On the arm64, register the uio driver and map a physical space
on the pci device to user space, then use memset write data to
the address space, a bus error will occur. This error is due to
the dc instruction(cache operation) used in the assembly of memset,
uio mapping physical memory will call pgprot_noncached() to set
non-cached and non-buffered, while pgprot_writecombine() has fewer
restrictions. It does not prohibit write buffer, so replacing
pgprot_noncached() with pgprot_writecombine() can solve this problem.

Signed-off-by: wubian <wubian@uniontech.com>
---
 drivers/uio/uio.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index ea96e319c8a0..09b04b20fa30 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -739,7 +739,11 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
 
 	vma->vm_ops = &uio_physical_vm_ops;
 	if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
+#if defined(CONFIG_ARM64)
+		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+#else
 		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+#endif
 
 	/*
 	 * We cannot use the vm_iomap_memory() helper here,
-- 
2.20.1




^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-23  6:52 [PATCH] uio: Fix bus error that access memory mapped by physical wubian
@ 2021-06-23  7:14 ` Greg KH
  2021-06-23  8:49   ` wubian
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2021-06-23  7:14 UTC (permalink / raw)
  To: wubian; +Cc: linux-kernel

On Wed, Jun 23, 2021 at 02:52:14PM +0800, wubian wrote:
> On the arm64, register the uio driver and map a physical space
> on the pci device to user space, then use memset write data to
> the address space, a bus error will occur. This error is due to
> the dc instruction(cache operation) used in the assembly of memset,
> uio mapping physical memory will call pgprot_noncached() to set
> non-cached and non-buffered, while pgprot_writecombine() has fewer
> restrictions. It does not prohibit write buffer, so replacing
> pgprot_noncached() with pgprot_writecombine() can solve this problem.
> 
> Signed-off-by: wubian <wubian@uniontech.com>
> ---
>  drivers/uio/uio.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
> index ea96e319c8a0..09b04b20fa30 100644
> --- a/drivers/uio/uio.c
> +++ b/drivers/uio/uio.c
> @@ -739,7 +739,11 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
>  
>  	vma->vm_ops = &uio_physical_vm_ops;
>  	if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
> +#if defined(CONFIG_ARM64)
> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +#else
>  		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +#endif

This feels really wrong, shouldn't stuff like this be handled in the
platform itself and not in the driver?

And why is ARM64 special here?  Why not other arches?  What is odd about
this platform?  We almost never want to use #if in .c files, why is it
ok to do that here?

And is this a bugfix?  If so, what commit does it fix?  Should it go to
stable kernels, and if so, how far back?

I need more information here :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-23  7:14 ` Greg KH
@ 2021-06-23  8:49   ` wubian
  2021-06-23  9:25     ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: wubian @ 2021-06-23  8:49 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

First,thanks for your reply

I haven’t found this problem on the x86 platform. I only found it on 
arm64. I used gdb to track memset and found that the bus error in 
glibc/sysdeps/aarch64/memset.S: dc zva, dst; reference "Architecture 
Reference ManualArmv8, for Armv8-A architecture profile" manual found 
that the dc assembly instruction(performance optimization) is related to 
the operation of the cache, that is to say, there is a bus error in the 
operation of the cache, and then check the "ARM Cortex-A Series 
Programmer's Guide for ARMv8-A " manual, found that the armv8 
architecture cache has data cache and write buffer, and there are two 
operation modes write-back and write-through, write operations in these 
two modes, the data flow will pass through the write buffer, and 
pgprot_noncached will prohibit data Cache and write buffer, this causes 
the dc command in memset to fail (bus error), and pgprot_writecombine 
does not prohibit write buffer, so the dc command in memset is 
successfully operated when pgprot_writecombine is used.

Regarding the bugfix issue, this is my first kernel patch submission, I 
may not be too clear about the boundary of the kernel bugfix. I think 
this issue is a bug, so I classify this submission as a bugfix.  At last 
hope to get your suggestion.

thanks,

wubian

On 2021/6/23 下午3:14, Greg KH wrote:
> On Wed, Jun 23, 2021 at 02:52:14PM +0800, wubian wrote:
>> On the arm64, register the uio driver and map a physical space
>> on the pci device to user space, then use memset write data to
>> the address space, a bus error will occur. This error is due to
>> the dc instruction(cache operation) used in the assembly of memset,
>> uio mapping physical memory will call pgprot_noncached() to set
>> non-cached and non-buffered, while pgprot_writecombine() has fewer
>> restrictions. It does not prohibit write buffer, so replacing
>> pgprot_noncached() with pgprot_writecombine() can solve this problem.
>>
>> Signed-off-by: wubian <wubian@uniontech.com>
>> ---
>>   drivers/uio/uio.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
>> index ea96e319c8a0..09b04b20fa30 100644
>> --- a/drivers/uio/uio.c
>> +++ b/drivers/uio/uio.c
>> @@ -739,7 +739,11 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
>>   
>>   	vma->vm_ops = &uio_physical_vm_ops;
>>   	if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
>> +#if defined(CONFIG_ARM64)
>> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
>> +#else
>>   		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>> +#endif
> This feels really wrong, shouldn't stuff like this be handled in the
> platform itself and not in the driver?
>
> And why is ARM64 special here?  Why not other arches?  What is odd about
> this platform?  We almost never want to use #if in .c files, why is it
> ok to do that here?
>
> And is this a bugfix?  If so, what commit does it fix?  Should it go to
> stable kernels, and if so, how far back?
>
> I need more information here :)
>
> thanks,
>
> greg k-h
>
>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-23  8:49   ` wubian
@ 2021-06-23  9:25     ` Greg KH
  2021-06-23 12:32       ` wubian
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2021-06-23  9:25 UTC (permalink / raw)
  To: wubian; +Cc: linux-kernel

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, Jun 23, 2021 at 04:49:16PM +0800, wubian wrote:
> First,thanks for your reply
> 
> I haven’t found this problem on the x86 platform. I only found it on arm64.
> I used gdb to track memset and found that the bus error in
> glibc/sysdeps/aarch64/memset.S: dc zva, dst; reference "Architecture
> Reference ManualArmv8, for Armv8-A architecture profile" manual found that
> the dc assembly instruction(performance optimization) is related to the
> operation of the cache, that is to say, there is a bus error in the
> operation of the cache, and then check the "ARM Cortex-A Series Programmer's
> Guide for ARMv8-A " manual, found that the armv8 architecture cache has data
> cache and write buffer, and there are two operation modes write-back and
> write-through, write operations in these two modes, the data flow will pass
> through the write buffer, and pgprot_noncached will prohibit data Cache and
> write buffer, this causes the dc command in memset to fail (bus error), and
> pgprot_writecombine does not prohibit write buffer, so the dc command in
> memset is successfully operated when pgprot_writecombine is used.

Are you sure this is not just a specific hardware platform issue?  Are
you sure this is going to be correct for _ALL_ arm64 systems?

Perhaps get the arm64 developers to agree with what is happening here as
this is the first time anyone has reported this problem.

What specific platform are you using that this issue happens on?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-23  9:25     ` Greg KH
@ 2021-06-23 12:32       ` wubian
  2021-06-24  8:26         ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: wubian @ 2021-06-23 12:32 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, catalin.marinas, will.deacon



On 2021/6/23 下午5:25, Greg KH wrote:
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top
> 
> On Wed, Jun 23, 2021 at 04:49:16PM +0800, wubian wrote:
>> First,thanks for your reply
>>
>> I haven’t found this problem on the x86 platform. I only found it on arm64.
>> I used gdb to track memset and found that the bus error in
>> glibc/sysdeps/aarch64/memset.S: dc zva, dst; reference "Architecture
>> Reference ManualArmv8, for Armv8-A architecture profile" manual found that
>> the dc assembly instruction(performance optimization) is related to the
>> operation of the cache, that is to say, there is a bus error in the
>> operation of the cache, and then check the "ARM Cortex-A Series Programmer's
>> Guide for ARMv8-A " manual, found that the armv8 architecture cache has data
>> cache and write buffer, and there are two operation modes write-back and
>> write-through, write operations in these two modes, the data flow will pass
>> through the write buffer, and pgprot_noncached will prohibit data Cache and
>> write buffer, this causes the dc command in memset to fail (bus error), and
>> pgprot_writecombine does not prohibit write buffer, so the dc command in
>> memset is successfully operated when pgprot_writecombine is used.
> 
> Are you sure this is not just a specific hardware platform issue?  Are
> you sure this is going to be correct for _ALL_ arm64 systems?
> 
> Perhaps get the arm64 developers to agree with what is happening here as
> this is the first time anyone has reported this problem.
> 
> What specific platform are you using that this issue happens on?
> 
> thanks,
> 
> greg k-h
> 
> 

I apologize for the kernel mail reply format problem, I will pay 
attention to it in the future.

I only found this problem on Huawei Kunpeng 920 cpu at present, and 
found that some people
have raised similar problems on the Internet.
link:https://github.com/ikwzm/udmabuf/issues/31

@Catalin Marinas @Will Deacon Has anyone reported a similar problem on 
the arm64 platform?



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-23 12:32       ` wubian
@ 2021-06-24  8:26         ` Will Deacon
  2021-06-24  9:48           ` wubian
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2021-06-24  8:26 UTC (permalink / raw)
  To: wubian
  Cc: Greg KH, linux-kernel, catalin.marinas, mark.rutland, linux-arm-kernel

[FYI: this landed in my spam]

On Wed, Jun 23, 2021 at 08:32:55PM +0800, wubian wrote:
> On 2021/6/23 下午5:25, Greg KH wrote:
> > On Wed, Jun 23, 2021 at 04:49:16PM +0800, wubian wrote:
> > > I haven’t found this problem on the x86 platform. I only found it on arm64.
> > > I used gdb to track memset and found that the bus error in
> > > glibc/sysdeps/aarch64/memset.S: dc zva, dst; reference "Architecture
> > > Reference ManualArmv8, for Armv8-A architecture profile" manual found that
> > > the dc assembly instruction(performance optimization) is related to the
> > > operation of the cache, that is to say, there is a bus error in the
> > > operation of the cache, and then check the "ARM Cortex-A Series Programmer's
> > > Guide for ARMv8-A " manual, found that the armv8 architecture cache has data
> > > cache and write buffer, and there are two operation modes write-back and
> > > write-through, write operations in these two modes, the data flow will pass
> > > through the write buffer, and pgprot_noncached will prohibit data Cache and
> > > write buffer, this causes the dc command in memset to fail (bus error), and
> > > pgprot_writecombine does not prohibit write buffer, so the dc command in
> > > memset is successfully operated when pgprot_writecombine is used.
> > 
> > Are you sure this is not just a specific hardware platform issue?  Are
> > you sure this is going to be correct for _ALL_ arm64 systems?
> > 
> > Perhaps get the arm64 developers to agree with what is happening here as
> > this is the first time anyone has reported this problem.
> > 
> > What specific platform are you using that this issue happens on?
> > 
> I apologize for the kernel mail reply format problem, I will pay attention
> to it in the future.
> 
> I only found this problem on Huawei Kunpeng 920 cpu at present, and found
> that some people
> have raised similar problems on the Internet.
> link:https://github.com/ikwzm/udmabuf/issues/31
> 
> @Catalin Marinas @Will Deacon Has anyone reported a similar problem on the
> arm64 platform?

The fundamental issue here (which seems related to [1]) is that you're
mapping MMIO into userspace and then expecting to be able to use standard
string routines such as memset and memcpy on them. This won't work, as the
architecture (and likely the MMIO endpoint) has restrictions on things like
unaligned accesses, access size and atomicity for MMIO that do not apply to
normal memory.

Returning non-cacheable rather than device mappings from UIO is likely to
cause more problems than it solves, as it permits the CPU to make
speculative accesses to the region. If the mapped device has side-effects,
then this will end in disaster.

So I don't think this patch is correct, and I think that if you're exposing
MMIO to userspace then you need to be very careful about what you do with
it rather than blindly pass MMIO pointers into standard routines that expect
to operate on normal memory. There's a reason drivers usually live in the
kernel :)

Thanks,

Will

[1] https://lore.kernel.org/r/da9c2fa9-a545-0c48-4490-d6134cc31425@huawei.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uio: Fix bus error that access memory mapped by physical
  2021-06-24  8:26         ` Will Deacon
@ 2021-06-24  9:48           ` wubian
  0 siblings, 0 replies; 7+ messages in thread
From: wubian @ 2021-06-24  9:48 UTC (permalink / raw)
  To: Will Deacon
  Cc: Greg KH, linux-kernel, catalin.marinas, mark.rutland, linux-arm-kernel



On 2021/6/24 下午4:26, Will Deacon wrote:
> [FYI: this landed in my spam]
> 
> On Wed, Jun 23, 2021 at 08:32:55PM +0800, wubian wrote:
>> On 2021/6/23 下午5:25, Greg KH wrote:
>>> On Wed, Jun 23, 2021 at 04:49:16PM +0800, wubian wrote:
>>>> I haven’t found this problem on the x86 platform. I only found it on arm64.
>>>> I used gdb to track memset and found that the bus error in
>>>> glibc/sysdeps/aarch64/memset.S: dc zva, dst; reference "Architecture
>>>> Reference ManualArmv8, for Armv8-A architecture profile" manual found that
>>>> the dc assembly instruction(performance optimization) is related to the
>>>> operation of the cache, that is to say, there is a bus error in the
>>>> operation of the cache, and then check the "ARM Cortex-A Series Programmer's
>>>> Guide for ARMv8-A " manual, found that the armv8 architecture cache has data
>>>> cache and write buffer, and there are two operation modes write-back and
>>>> write-through, write operations in these two modes, the data flow will pass
>>>> through the write buffer, and pgprot_noncached will prohibit data Cache and
>>>> write buffer, this causes the dc command in memset to fail (bus error), and
>>>> pgprot_writecombine does not prohibit write buffer, so the dc command in
>>>> memset is successfully operated when pgprot_writecombine is used.
>>>
>>> Are you sure this is not just a specific hardware platform issue?  Are
>>> you sure this is going to be correct for _ALL_ arm64 systems?
>>>
>>> Perhaps get the arm64 developers to agree with what is happening here as
>>> this is the first time anyone has reported this problem.
>>>
>>> What specific platform are you using that this issue happens on?
>>>
>> I apologize for the kernel mail reply format problem, I will pay attention
>> to it in the future.
>>
>> I only found this problem on Huawei Kunpeng 920 cpu at present, and found
>> that some people
>> have raised similar problems on the Internet.
>> link:https://github.com/ikwzm/udmabuf/issues/31
>>
>> @Catalin Marinas @Will Deacon Has anyone reported a similar problem on the
>> arm64 platform?
> 
> The fundamental issue here (which seems related to [1]) is that you're
> mapping MMIO into userspace and then expecting to be able to use standard
> string routines such as memset and memcpy on them. This won't work, as the
> architecture (and likely the MMIO endpoint) has restrictions on things like
> unaligned accesses, access size and atomicity for MMIO that do not apply to
> normal memory.
> 
> Returning non-cacheable rather than device mappings from UIO is likely to
> cause more problems than it solves, as it permits the CPU to make
> speculative accesses to the region. If the mapped device has side-effects,
> then this will end in disaster.
> 
> So I don't think this patch is correct, and I think that if you're exposing
> MMIO to userspace then you need to be very careful about what you do with
> it rather than blindly pass MMIO pointers into standard routines that expect
> to operate on normal memory. There's a reason drivers usually live in the
> kernel :)
> 
> Thanks,
> 
> Will
> 
> [1] https://lore.kernel.org/r/da9c2fa9-a545-0c48-4490-d6134cc31425@huawei.com
> 
> 

Thank you very much for your reply,It gave me a deeper understanding of 
this problem.

-- 
Best regards,

wubian



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-06-24  9:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-23  6:52 [PATCH] uio: Fix bus error that access memory mapped by physical wubian
2021-06-23  7:14 ` Greg KH
2021-06-23  8:49   ` wubian
2021-06-23  9:25     ` Greg KH
2021-06-23 12:32       ` wubian
2021-06-24  8:26         ` Will Deacon
2021-06-24  9:48           ` wubian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).