All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-02  6:23 ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-02  6:23 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel, Robert Jarzmik

On old ARM chips, unaligned accesses to memory are not trapped and
fixed.  On module load, symbols are relocated, and the relocation of
__bug_table symbols is done on a u32 basis. Yet the section is not
aligned to a multiple of 4 address, but to a multiple of 2.

This triggers an Oops on pxa architecture, where address 0xbf0021ea
is the first relocation in the __bug_table section :
  apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym ''
  Unable to handle kernel paging request at virtual address bf0021ea
  pgd = e1cd0000
  [bf0021ea] *pgd=c1cce851, *pte=c1cde04f, *ppte=c1cde01f
  Internal error: Oops: 23 [#1] ARM
  Modules linked in:
  CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887
  Hardware name: CM-X300 module
  task: e1c68700 ti: e1c3e000 task.ti: e1c3e000
  PC is at apply_relocate+0x2f4/0x3d4
  LR is at 0xbf0021ea
  pc : [<c000e7c8>]    lr : [<bf0021ea>]    psr: 80000013
  sp : e1c3fe30  ip : 60000013  fp : e49e8c60
  r10: e49e8fa8  r9 : 00000000  r8 : e49e7c58
  r7 : e49e8c38  r6 : e49e8a58  r5 : e49e8920  r4 : e49e8918
  r3 : bf0021ea  r2 : bf007034  r1 : 00000000  r0 : bf000000
  Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 0000397f  Table: c1cd0018  DAC: 00000051
  Process insmod (pid: 606, stack limit = 0xe1c3e198)
  [<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c)
  [<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170)
  [<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38)

Fix this by ensuring entries in __bug_table are all aligned to at least
of multiple of 4. This transforms a module section  __bug_table as :
-   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  1
+   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  4

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 arch/arm/include/asm/bug.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/include/asm/bug.h b/arch/arm/include/asm/bug.h
index b274bde24905..e7335a92144e 100644
--- a/arch/arm/include/asm/bug.h
+++ b/arch/arm/include/asm/bug.h
@@ -40,6 +40,7 @@ do {								\
 		"2:\t.asciz " #__file "\n" 			\
 		".popsection\n" 				\
 		".pushsection __bug_table,\"a\"\n"		\
+		".align 2\n"					\
 		"3:\t.word 1b, 2b\n"				\
 		"\t.hword " #__line ", 0\n"			\
 		".popsection");					\
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-02  6:23 ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-02  6:23 UTC (permalink / raw)
  To: linux-arm-kernel

On old ARM chips, unaligned accesses to memory are not trapped and
fixed.  On module load, symbols are relocated, and the relocation of
__bug_table symbols is done on a u32 basis. Yet the section is not
aligned to a multiple of 4 address, but to a multiple of 2.

This triggers an Oops on pxa architecture, where address 0xbf0021ea
is the first relocation in the __bug_table section :
  apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym ''
  Unable to handle kernel paging request at virtual address bf0021ea
  pgd = e1cd0000
  [bf0021ea] *pgd=c1cce851, *pte=c1cde04f, *ppte=c1cde01f
  Internal error: Oops: 23 [#1] ARM
  Modules linked in:
  CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887
  Hardware name: CM-X300 module
  task: e1c68700 ti: e1c3e000 task.ti: e1c3e000
  PC is at apply_relocate+0x2f4/0x3d4
  LR is at 0xbf0021ea
  pc : [<c000e7c8>]    lr : [<bf0021ea>]    psr: 80000013
  sp : e1c3fe30  ip : 60000013  fp : e49e8c60
  r10: e49e8fa8  r9 : 00000000  r8 : e49e7c58
  r7 : e49e8c38  r6 : e49e8a58  r5 : e49e8920  r4 : e49e8918
  r3 : bf0021ea  r2 : bf007034  r1 : 00000000  r0 : bf000000
  Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 0000397f  Table: c1cd0018  DAC: 00000051
  Process insmod (pid: 606, stack limit = 0xe1c3e198)
  [<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c)
  [<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170)
  [<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38)

Fix this by ensuring entries in __bug_table are all aligned to at least
of multiple of 4. This transforms a module section  __bug_table as :
-   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  1
+   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  4

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 arch/arm/include/asm/bug.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/include/asm/bug.h b/arch/arm/include/asm/bug.h
index b274bde24905..e7335a92144e 100644
--- a/arch/arm/include/asm/bug.h
+++ b/arch/arm/include/asm/bug.h
@@ -40,6 +40,7 @@ do {								\
 		"2:\t.asciz " #__file "\n" 			\
 		".popsection\n" 				\
 		".pushsection __bug_table,\"a\"\n"		\
+		".align 2\n"					\
 		"3:\t.word 1b, 2b\n"				\
 		"\t.hword " #__line ", 0\n"			\
 		".popsection");					\
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-02  6:23 ` Robert Jarzmik
@ 2015-09-02 10:39   ` Dave Martin
  -1 siblings, 0 replies; 48+ messages in thread
From: Dave Martin @ 2015-09-02 10:39 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Russell King - ARM Linux, linux-kernel, linux-arm-kernel

On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
> On old ARM chips, unaligned accesses to memory are not trapped and
> fixed.  On module load, symbols are relocated, and the relocation of
> __bug_table symbols is done on a u32 basis. Yet the section is not
> aligned to a multiple of 4 address, but to a multiple of 2.
> 
> This triggers an Oops on pxa architecture, where address 0xbf0021ea
> is the first relocation in the __bug_table section :
>   apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym ''
>   Unable to handle kernel paging request at virtual address bf0021ea
>   pgd = e1cd0000
>   [bf0021ea] *pgd=c1cce851, *pte=c1cde04f, *ppte=c1cde01f
>   Internal error: Oops: 23 [#1] ARM
>   Modules linked in:
>   CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887
>   Hardware name: CM-X300 module
>   task: e1c68700 ti: e1c3e000 task.ti: e1c3e000
>   PC is at apply_relocate+0x2f4/0x3d4
>   LR is at 0xbf0021ea
>   pc : [<c000e7c8>]    lr : [<bf0021ea>]    psr: 80000013
>   sp : e1c3fe30  ip : 60000013  fp : e49e8c60
>   r10: e49e8fa8  r9 : 00000000  r8 : e49e7c58
>   r7 : e49e8c38  r6 : e49e8a58  r5 : e49e8920  r4 : e49e8918
>   r3 : bf0021ea  r2 : bf007034  r1 : 00000000  r0 : bf000000
>   Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>   Control: 0000397f  Table: c1cd0018  DAC: 00000051
>   Process insmod (pid: 606, stack limit = 0xe1c3e198)
>   [<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c)
>   [<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170)
>   [<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38)
> 
> Fix this by ensuring entries in __bug_table are all aligned to at least
> of multiple of 4. This transforms a module section  __bug_table as :
> -   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  1
> +   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  4
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  arch/arm/include/asm/bug.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/include/asm/bug.h b/arch/arm/include/asm/bug.h
> index b274bde24905..e7335a92144e 100644
> --- a/arch/arm/include/asm/bug.h
> +++ b/arch/arm/include/asm/bug.h
> @@ -40,6 +40,7 @@ do {								\
>  		"2:\t.asciz " #__file "\n" 			\
>  		".popsection\n" 				\
>  		".pushsection __bug_table,\"a\"\n"		\
> +		".align 2\n"					\
>  		"3:\t.word 1b, 2b\n"				\
>  		"\t.hword " #__line ", 0\n"			\
>  		".popsection");					\

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

I added the .align in my recent patches implementing BUG for arm64,
but didn't touch arch/arm.

When referring to the arm code I did notice that there was no .align.
I'd concluded that the linker script layout and lack of bug reports
meant the arm code was alignment-safe in practice, but I guess I was
mistaken...

Cheers
---Dave


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-02 10:39   ` Dave Martin
  0 siblings, 0 replies; 48+ messages in thread
From: Dave Martin @ 2015-09-02 10:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
> On old ARM chips, unaligned accesses to memory are not trapped and
> fixed.  On module load, symbols are relocated, and the relocation of
> __bug_table symbols is done on a u32 basis. Yet the section is not
> aligned to a multiple of 4 address, but to a multiple of 2.
> 
> This triggers an Oops on pxa architecture, where address 0xbf0021ea
> is the first relocation in the __bug_table section :
>   apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym ''
>   Unable to handle kernel paging request at virtual address bf0021ea
>   pgd = e1cd0000
>   [bf0021ea] *pgd=c1cce851, *pte=c1cde04f, *ppte=c1cde01f
>   Internal error: Oops: 23 [#1] ARM
>   Modules linked in:
>   CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887
>   Hardware name: CM-X300 module
>   task: e1c68700 ti: e1c3e000 task.ti: e1c3e000
>   PC is at apply_relocate+0x2f4/0x3d4
>   LR is at 0xbf0021ea
>   pc : [<c000e7c8>]    lr : [<bf0021ea>]    psr: 80000013
>   sp : e1c3fe30  ip : 60000013  fp : e49e8c60
>   r10: e49e8fa8  r9 : 00000000  r8 : e49e7c58
>   r7 : e49e8c38  r6 : e49e8a58  r5 : e49e8920  r4 : e49e8918
>   r3 : bf0021ea  r2 : bf007034  r1 : 00000000  r0 : bf000000
>   Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>   Control: 0000397f  Table: c1cd0018  DAC: 00000051
>   Process insmod (pid: 606, stack limit = 0xe1c3e198)
>   [<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c)
>   [<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170)
>   [<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38)
> 
> Fix this by ensuring entries in __bug_table are all aligned to at least
> of multiple of 4. This transforms a module section  __bug_table as :
> -   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  1
> +   [12] __bug_table       PROGBITS        00000000 002232 000018 00   A  0   0  4
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  arch/arm/include/asm/bug.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/include/asm/bug.h b/arch/arm/include/asm/bug.h
> index b274bde24905..e7335a92144e 100644
> --- a/arch/arm/include/asm/bug.h
> +++ b/arch/arm/include/asm/bug.h
> @@ -40,6 +40,7 @@ do {								\
>  		"2:\t.asciz " #__file "\n" 			\
>  		".popsection\n" 				\
>  		".pushsection __bug_table,\"a\"\n"		\
> +		".align 2\n"					\
>  		"3:\t.word 1b, 2b\n"				\
>  		"\t.hword " #__line ", 0\n"			\
>  		".popsection");					\

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

I added the .align in my recent patches implementing BUG for arm64,
but didn't touch arch/arm.

When referring to the arm code I did notice that there was no .align.
I'd concluded that the linker script layout and lack of bug reports
meant the arm code was alignment-safe in practice, but I guess I was
mistaken...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-02 10:39   ` Dave Martin
@ 2015-09-05 13:48     ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 13:48 UTC (permalink / raw)
  To: Russell King - ARM Linux, Dave Martin; +Cc: linux-kernel, linux-arm-kernel

Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
>> On old ARM chips, unaligned accesses to memory are not trapped and
>> fixed.  On module load, symbols are relocated, and the relocation of
>> __bug_table symbols is done on a u32 basis. Yet the section is not
>> aligned to a multiple of 4 address, but to a multiple of 2.

Hi Russell,

I dug deeper, and got another stack, unrelated to modules insertion but related
to alignement fault (see [1] for reference). As before, this didn't happen on
v4.1, but happens on linux-next.
I'm wondering if alignement fixup does work in my case and if I understand it.

This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
where I added the small chunk in [2], which gave in my case :
    RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000

The instruction (as seen with vmlinux disassembly or JTAG memory dump) is  :
    0xc02b37c8 <+372>:	b2 50 c6 10	strhne	r5, [r6], #2

I must admit I fail to see how the following "fixup:" label can be reached with
a "missed" copy_from_user() (fault == 4).

This is probably what happened to me with the modules __bug_table section, and
it will continue to happen until I understand why this copy fails. It's really
odd nobody but me faces this issue.

Cheers.

-- 
Robert

[1] New stack
=============
  #0: pxa2xx-ac97 (Wolfson WM9713,WM9714)
RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
&Unable to handle kernel paging request at virtual address c3057661
&pgd = c0004000
"[c3057661] *pgd=a300040e(bad)
Internal error: Oops: 803 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc8-next-20150828+ #900
Hardware name: MIO A701
task: c3858bc0 ti: c385c000 task.ti: c385c000
PC is at doc_read_data_area+0x174/0x370
LR is at doc_read_page_getbytes+0x58/0x78
pc : [<c02b37c8>]    lr : [<c02b3a1c>]    psr: a8000013
sp : c385d8e0  ip : c07142b8  fp : c385d91c
r10: c3aac540  r9 : c070ffc0  r8 : c385c000
@QGi
r7 : 00000002  r6 : c3057661  r5 : 0000c1e5  r4 : 0000000b
r3 : 0000000a  r2 : 0000000b  r1 : c3057661  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: a0004000  DAC: 00000053
Process swapper (pid: 1, stack limit = 0xc385c198)
Stack: (0xc385d8e0 to 0xc385e000)
d8e0: 00000000 c02b3a34 00000001 0000000a c385d914 c3057660 0000000c c3aac540
... chop chop ...
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 f3ff8cdd 4cf3d76f
[<c02b37c8>] (doc_read_data_area) from [<c02b3a1c>] (doc_read_page_getbytes+0x58/0x78)
[<c02b3a1c>] (doc_read_page_getbytes) from [<c02b5f20>] (doc_read_oob+0x22c/0x75c)
[<c02b5f20>] (doc_read_oob) from [<c02b64b4>] (doc_read+0x64/0x74)
[<c02b64b4>] (doc_read) from [<c02ade08>] (part_read+0x58/0x9c)
[<c02ade08>] (part_read) from [<c02aa8d8>] (mtd_read+0x88/0xbc)
[<c02aa8d8>] (mtd_read) from [<c02c35ec>] (ubi_io_read+0x16c/0x358)
[<c02c35ec>] (ubi_io_read) from [<c02c0450>] (ubi_eba_read_leb+0x384/0x4d4)
[<c02c0450>] (ubi_eba_read_leb) from [<c02bf944>] (ubi_leb_read+0xd4/0x134)
[<c02bf944>] (ubi_leb_read) from [<c0185e9c>] (ubifs_leb_read+0x3c/0xa8)s
[<c0185e9c>] (ubifs_leb_read) from [<c019fd94>] (ubifs_read_nnode+0xec/0x200)
[<c019fd94>] (ubifs_read_nnode) from [<c01a03a0>] (ubifs_lpt_lookup_dirty+0x38/0x330)
[<c01a03a0>] (ubifs_lpt_lookup_dirty) from [<c0190448>] (ubifs_replay_journal+0x3c/0x1b38)
[<c0190448>] (ubifs_replay_journal) from [<c0182c38>] (ubifs_mount+0x1444/0x2410)
[<c0182c38>] (ubifs_mount) from [<c00eb604>] (mount_fs+0x24/0xb0)
[<c00eb604>] (mount_fs) from [<c01081d0>] (vfs_kern_mount+0x58/0x124)
[<c01081d0>] (vfs_kern_mount) from [<c010b384>] (do_mount+0xb40/0xd10)
[<c010b384>] (do_mount) from [<c010b8a8>] (SyS_mount+0x84/0xb0)
[<c010b8a8>] (SyS_mount) from [<c068319c>] (mount_block_root+0x12c/0x2dc)
[<c068319c>] (mount_block_root) from [<c0683564>] (prepare_namespace+0x98/0x1bc)
[<c0683564>] (prepare_namespace) from [<c0682eb4>] (kernel_init_freeable+0x188/0x1d4)
[<c0682eb4>] (kernel_init_freeable) from [<c04a2610>] (kernel_init+0x18/0xfc)
[<c04a2610>] (kernel_init) from [<c000a5ec>] (ret_from_fork+0x14/0x28)
Code: 1a000060 e51b3030 e3560000 e2877002 (10c650b2) 
---[ end trace a0bcca195299a22d ]---

[2] Debug patch
===============
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 2c0c541c60ca..b0897da5456c 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -787,6 +787,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
                instr = __mem_to_opcode_arm(instr);
        }
 
+       pr_info("RJK: fault=%d instr=0x%08lx instrptr=0x%08lx thumb_mode=%lu tinstr=0x%04x\n",
+               fault, instr, instrptr, thumb_mode(regs), tinstr);
        if (fault) {
                type = TYPE_FAULT;
                goto bad_or_fault;

[3] Abort stack
===============
#0  panic (fmt=0xc385d6c4 <incomplete sequence \344>) at kernel/panic.c:72
#1  0xc000e788 in oops_end (regs=<optimized out>, signr=<optimized out>, flags=<optimized out>) at arch/arm/kernel/traps.c:311
#2  die (str=0x68000013 "", regs=<optimized out>, err=-1014638958) at arch/arm/kernel/traps.c:333
#3  0xc00137c0 in __do_kernel_fault (mm=0xc06e8c04 <init_mm>, addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:150
#4  0xc0013b10 in do_bad_area (addr=<optimized out>, fsr=<optimized out>, regs=<optimized out>) at arch/arm/mm/fault.c:198
#5  0xc00166c8 in do_alignment (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/alignment.c:900
#6  0xc0009264 in do_DataAbort (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:550
#7  0xc000f024 in __dabt_svc () at arch/arm/kernel/entry-armv.S:204

[4] frame #6 regs dump
======================
$1 = {uregs = {0, 3271915105, 11, 10, 11, 49637, 3271915105, 2, 3280322560, 3228630976, 3282748736, 3280328988, 3228648120, 3280328928, 3224058396, 3224057800, 2818572307, 
    4294967295}}

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-05 13:48     ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
>> On old ARM chips, unaligned accesses to memory are not trapped and
>> fixed.  On module load, symbols are relocated, and the relocation of
>> __bug_table symbols is done on a u32 basis. Yet the section is not
>> aligned to a multiple of 4 address, but to a multiple of 2.

Hi Russell,

I dug deeper, and got another stack, unrelated to modules insertion but related
to alignement fault (see [1] for reference). As before, this didn't happen on
v4.1, but happens on linux-next.
I'm wondering if alignement fixup does work in my case and if I understand it.

This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
where I added the small chunk in [2], which gave in my case :
    RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000

The instruction (as seen with vmlinux disassembly or JTAG memory dump) is  :
    0xc02b37c8 <+372>:	b2 50 c6 10	strhne	r5, [r6], #2

I must admit I fail to see how the following "fixup:" label can be reached with
a "missed" copy_from_user() (fault == 4).

This is probably what happened to me with the modules __bug_table section, and
it will continue to happen until I understand why this copy fails. It's really
odd nobody but me faces this issue.

Cheers.

-- 
Robert

[1] New stack
=============
  #0: pxa2xx-ac97 (Wolfson WM9713,WM9714)
RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
&Unable to handle kernel paging request at virtual address c3057661
&pgd = c0004000
"[c3057661] *pgd=a300040e(bad)
Internal error: Oops: 803 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc8-next-20150828+ #900
Hardware name: MIO A701
task: c3858bc0 ti: c385c000 task.ti: c385c000
PC is at doc_read_data_area+0x174/0x370
LR is at doc_read_page_getbytes+0x58/0x78
pc : [<c02b37c8>]    lr : [<c02b3a1c>]    psr: a8000013
sp : c385d8e0  ip : c07142b8  fp : c385d91c
r10: c3aac540  r9 : c070ffc0  r8 : c385c000
@QGi
r7 : 00000002  r6 : c3057661  r5 : 0000c1e5  r4 : 0000000b
r3 : 0000000a  r2 : 0000000b  r1 : c3057661  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: a0004000  DAC: 00000053
Process swapper (pid: 1, stack limit = 0xc385c198)
Stack: (0xc385d8e0 to 0xc385e000)
d8e0: 00000000 c02b3a34 00000001 0000000a c385d914 c3057660 0000000c c3aac540
... chop chop ...
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 f3ff8cdd 4cf3d76f
[<c02b37c8>] (doc_read_data_area) from [<c02b3a1c>] (doc_read_page_getbytes+0x58/0x78)
[<c02b3a1c>] (doc_read_page_getbytes) from [<c02b5f20>] (doc_read_oob+0x22c/0x75c)
[<c02b5f20>] (doc_read_oob) from [<c02b64b4>] (doc_read+0x64/0x74)
[<c02b64b4>] (doc_read) from [<c02ade08>] (part_read+0x58/0x9c)
[<c02ade08>] (part_read) from [<c02aa8d8>] (mtd_read+0x88/0xbc)
[<c02aa8d8>] (mtd_read) from [<c02c35ec>] (ubi_io_read+0x16c/0x358)
[<c02c35ec>] (ubi_io_read) from [<c02c0450>] (ubi_eba_read_leb+0x384/0x4d4)
[<c02c0450>] (ubi_eba_read_leb) from [<c02bf944>] (ubi_leb_read+0xd4/0x134)
[<c02bf944>] (ubi_leb_read) from [<c0185e9c>] (ubifs_leb_read+0x3c/0xa8)s
[<c0185e9c>] (ubifs_leb_read) from [<c019fd94>] (ubifs_read_nnode+0xec/0x200)
[<c019fd94>] (ubifs_read_nnode) from [<c01a03a0>] (ubifs_lpt_lookup_dirty+0x38/0x330)
[<c01a03a0>] (ubifs_lpt_lookup_dirty) from [<c0190448>] (ubifs_replay_journal+0x3c/0x1b38)
[<c0190448>] (ubifs_replay_journal) from [<c0182c38>] (ubifs_mount+0x1444/0x2410)
[<c0182c38>] (ubifs_mount) from [<c00eb604>] (mount_fs+0x24/0xb0)
[<c00eb604>] (mount_fs) from [<c01081d0>] (vfs_kern_mount+0x58/0x124)
[<c01081d0>] (vfs_kern_mount) from [<c010b384>] (do_mount+0xb40/0xd10)
[<c010b384>] (do_mount) from [<c010b8a8>] (SyS_mount+0x84/0xb0)
[<c010b8a8>] (SyS_mount) from [<c068319c>] (mount_block_root+0x12c/0x2dc)
[<c068319c>] (mount_block_root) from [<c0683564>] (prepare_namespace+0x98/0x1bc)
[<c0683564>] (prepare_namespace) from [<c0682eb4>] (kernel_init_freeable+0x188/0x1d4)
[<c0682eb4>] (kernel_init_freeable) from [<c04a2610>] (kernel_init+0x18/0xfc)
[<c04a2610>] (kernel_init) from [<c000a5ec>] (ret_from_fork+0x14/0x28)
Code: 1a000060 e51b3030 e3560000 e2877002 (10c650b2) 
---[ end trace a0bcca195299a22d ]---

[2] Debug patch
===============
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 2c0c541c60ca..b0897da5456c 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -787,6 +787,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
                instr = __mem_to_opcode_arm(instr);
        }
 
+       pr_info("RJK: fault=%d instr=0x%08lx instrptr=0x%08lx thumb_mode=%lu tinstr=0x%04x\n",
+               fault, instr, instrptr, thumb_mode(regs), tinstr);
        if (fault) {
                type = TYPE_FAULT;
                goto bad_or_fault;

[3] Abort stack
===============
#0  panic (fmt=0xc385d6c4 <incomplete sequence \344>) at kernel/panic.c:72
#1  0xc000e788 in oops_end (regs=<optimized out>, signr=<optimized out>, flags=<optimized out>) at arch/arm/kernel/traps.c:311
#2  die (str=0x68000013 "", regs=<optimized out>, err=-1014638958) at arch/arm/kernel/traps.c:333
#3  0xc00137c0 in __do_kernel_fault (mm=0xc06e8c04 <init_mm>, addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:150
#4  0xc0013b10 in do_bad_area (addr=<optimized out>, fsr=<optimized out>, regs=<optimized out>) at arch/arm/mm/fault.c:198
#5  0xc00166c8 in do_alignment (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/alignment.c:900
#6  0xc0009264 in do_DataAbort (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:550
#7  0xc000f024 in __dabt_svc () at arch/arm/kernel/entry-armv.S:204

[4] frame #6 regs dump
======================
$1 = {uregs = {0, 3271915105, 11, 10, 11, 49637, 3271915105, 2, 3280322560, 3228630976, 3282748736, 3280328988, 3228648120, 3280328928, 3224058396, 3224057800, 2818572307, 
    4294967295}}

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-05 13:48     ` Robert Jarzmik
@ 2015-09-05 14:25       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-05 14:25 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
> >> On old ARM chips, unaligned accesses to memory are not trapped and
> >> fixed.  On module load, symbols are relocated, and the relocation of
> >> __bug_table symbols is done on a u32 basis. Yet the section is not
> >> aligned to a multiple of 4 address, but to a multiple of 2.
> 
> Hi Russell,
> 
> I dug deeper, and got another stack, unrelated to modules insertion but related
> to alignement fault (see [1] for reference). As before, this didn't happen on
> v4.1, but happens on linux-next.
> I'm wondering if alignement fixup does work in my case and if I understand it.
> 
> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
> where I added the small chunk in [2], which gave in my case :
>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000

Right, so as fault is nonzero, this means that we were unable to read the
instruction.  That seems mad though - the instruction pointer is certainly
valid, and as we're using probe_kernel_address(), that switches to the
kernel "segment" before trying to read kernel addresses.  That should
mean that __copy_from_user_inatomic() is able to read the instruction.

I think this is the root cause of the issue.

> The instruction (as seen with vmlinux disassembly or JTAG memory dump) is  :
>     0xc02b37c8 <+372>:	b2 50 c6 10	strhne	r5, [r6], #2
> 
> I must admit I fail to see how the following "fixup:" label can be reached with
> a "missed" copy_from_user() (fault == 4).

We can't fix up the fault if we failed to read the instruction causing the
fault - because we've no idea what register(s) will need updating.

> [1] New stack
> =============
>   #0: pxa2xx-ac97 (Wolfson WM9713,WM9714)
> RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
> &Unable to handle kernel paging request at virtual address c3057661
> &pgd = c0004000
> "[c3057661] *pgd=a300040e(bad)
> Internal error: Oops: 803 [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc8-next-20150828+ #900
> Hardware name: MIO A701
> task: c3858bc0 ti: c385c000 task.ti: c385c000
> PC is at doc_read_data_area+0x174/0x370
> LR is at doc_read_page_getbytes+0x58/0x78
> pc : [<c02b37c8>]    lr : [<c02b3a1c>]    psr: a8000013
> sp : c385d8e0  ip : c07142b8  fp : c385d91c
> r10: c3aac540  r9 : c070ffc0  r8 : c385c000
> @QGi
> r7 : 00000002  r6 : c3057661  r5 : 0000c1e5  r4 : 0000000b
> r3 : 0000000a  r2 : 0000000b  r1 : c3057661  r0 : 00000000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none

It seems you have SW_DOMAIN_PAN enabled.

> Control: 0000397f  Table: a0004000  DAC: 00000053

And the DACR for the parent context shows that user no-access, kernel
manager-access (iow, in the doc_read_data_area() function).  I have to
wonder why that would be the case - I can't find anything that would
set the kernel to manager-access.  There's no get_ds() or KERNEL_DS
reference in fs/ubifs or drivers/mtd.

> [3] Abort stack
> ===============
> #0  panic (fmt=0xc385d6c4 <incomplete sequence \344>) at kernel/panic.c:72
> #1  0xc000e788 in oops_end (regs=<optimized out>, signr=<optimized out>, flags=<optimized out>) at arch/arm/kernel/traps.c:311
> #2  die (str=0x68000013 "", regs=<optimized out>, err=-1014638958) at arch/arm/kernel/traps.c:333
> #3  0xc00137c0 in __do_kernel_fault (mm=0xc06e8c04 <init_mm>, addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:150
> #4  0xc0013b10 in do_bad_area (addr=<optimized out>, fsr=<optimized out>, regs=<optimized out>) at arch/arm/mm/fault.c:198
> #5  0xc00166c8 in do_alignment (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/alignment.c:900
> #6  0xc0009264 in do_DataAbort (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:550
> #7  0xc000f024 in __dabt_svc () at arch/arm/kernel/entry-armv.S:204

I'm afraid this isn't useful, and I think (as seems to be typical with gdb)
some of those values are completely wrong.  There's no way "str" would
be 0x68000013 in frame 2 for example.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-05 14:25       ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-05 14:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Wed, Sep 02, 2015 at 08:23:29AM +0200, Robert Jarzmik wrote:
> >> On old ARM chips, unaligned accesses to memory are not trapped and
> >> fixed.  On module load, symbols are relocated, and the relocation of
> >> __bug_table symbols is done on a u32 basis. Yet the section is not
> >> aligned to a multiple of 4 address, but to a multiple of 2.
> 
> Hi Russell,
> 
> I dug deeper, and got another stack, unrelated to modules insertion but related
> to alignement fault (see [1] for reference). As before, this didn't happen on
> v4.1, but happens on linux-next.
> I'm wondering if alignement fixup does work in my case and if I understand it.
> 
> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
> where I added the small chunk in [2], which gave in my case :
>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000

Right, so as fault is nonzero, this means that we were unable to read the
instruction.  That seems mad though - the instruction pointer is certainly
valid, and as we're using probe_kernel_address(), that switches to the
kernel "segment" before trying to read kernel addresses.  That should
mean that __copy_from_user_inatomic() is able to read the instruction.

I think this is the root cause of the issue.

> The instruction (as seen with vmlinux disassembly or JTAG memory dump) is  :
>     0xc02b37c8 <+372>:	b2 50 c6 10	strhne	r5, [r6], #2
> 
> I must admit I fail to see how the following "fixup:" label can be reached with
> a "missed" copy_from_user() (fault == 4).

We can't fix up the fault if we failed to read the instruction causing the
fault - because we've no idea what register(s) will need updating.

> [1] New stack
> =============
>   #0: pxa2xx-ac97 (Wolfson WM9713,WM9714)
> RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
> &Unable to handle kernel paging request at virtual address c3057661
> &pgd = c0004000
> "[c3057661] *pgd=a300040e(bad)
> Internal error: Oops: 803 [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc8-next-20150828+ #900
> Hardware name: MIO A701
> task: c3858bc0 ti: c385c000 task.ti: c385c000
> PC is at doc_read_data_area+0x174/0x370
> LR is at doc_read_page_getbytes+0x58/0x78
> pc : [<c02b37c8>]    lr : [<c02b3a1c>]    psr: a8000013
> sp : c385d8e0  ip : c07142b8  fp : c385d91c
> r10: c3aac540  r9 : c070ffc0  r8 : c385c000
> @QGi
> r7 : 00000002  r6 : c3057661  r5 : 0000c1e5  r4 : 0000000b
> r3 : 0000000a  r2 : 0000000b  r1 : c3057661  r0 : 00000000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none

It seems you have SW_DOMAIN_PAN enabled.

> Control: 0000397f  Table: a0004000  DAC: 00000053

And the DACR for the parent context shows that user no-access, kernel
manager-access (iow, in the doc_read_data_area() function).  I have to
wonder why that would be the case - I can't find anything that would
set the kernel to manager-access.  There's no get_ds() or KERNEL_DS
reference in fs/ubifs or drivers/mtd.

> [3] Abort stack
> ===============
> #0  panic (fmt=0xc385d6c4 <incomplete sequence \344>) at kernel/panic.c:72
> #1  0xc000e788 in oops_end (regs=<optimized out>, signr=<optimized out>, flags=<optimized out>) at arch/arm/kernel/traps.c:311
> #2  die (str=0x68000013 "", regs=<optimized out>, err=-1014638958) at arch/arm/kernel/traps.c:333
> #3  0xc00137c0 in __do_kernel_fault (mm=0xc06e8c04 <init_mm>, addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:150
> #4  0xc0013b10 in do_bad_area (addr=<optimized out>, fsr=<optimized out>, regs=<optimized out>) at arch/arm/mm/fault.c:198
> #5  0xc00166c8 in do_alignment (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/alignment.c:900
> #6  0xc0009264 in do_DataAbort (addr=3271915105, fsr=2051, regs=0xc385d890) at arch/arm/mm/fault.c:550
> #7  0xc000f024 in __dabt_svc () at arch/arm/kernel/entry-armv.S:204

I'm afraid this isn't useful, and I think (as seems to be typical with gdb)
some of those values are completely wrong.  There's no way "str" would
be 0x68000013 in frame 2 for example.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-05 14:25       ` Russell King - ARM Linux
@ 2015-09-05 17:10         ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 17:10 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
>> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
>> where I added the small chunk in [2], which gave in my case :
>>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
>
> Right, so as fault is nonzero, this means that we were unable to read the
> instruction.  That seems mad though - the instruction pointer is certainly
> valid, and as we're using probe_kernel_address(), that switches to the
> kernel "segment" before trying to read kernel addresses.  That should
> mean that __copy_from_user_inatomic() is able to read the instruction.
>
> I think this is the root cause of the issue.

And there is more madness to come : I tried to "reread" the instruction [1] a
second time if the first result was 4 :
RJK: fault=4 instr=0x00000000(@c385d72c) instrptr=0xc02b39e8 thumb_mode=0 tinstr=0x0000
RJK: reread instruction: [0xc02b39e8] = 0x10c650b2: 0

Guess what, the second probe_kernel_address() with the same parameters returns
0, and everything works. It's insane.

>> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> It seems you have SW_DOMAIN_PAN enabled.
That's the default arch/arm/Kconfig implies.
And ... this is what also _is_ the cause of this behavior : removing
SW_DOMAIN_PAN makes all my pxa boards work again !!!

Moreover, this is consistent with the fact that this commit is in linux-next but
not in v4.1 :
    a5e090acbf54 ("ARM: software-based priviledged-no-access support")

So the issue is around this SW_DOMAIN_PAN, at least on PXA.

--
Robert

[1]
@@ -787,6 +798,15 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
                instr = __mem_to_opcode_arm(instr);
        }
 
+       pr_info("RJK: fault=%d instr=0x%08lx(@%p) instrptr=0x%08lx thumb_mode=%lu tinstr=0x%04x\n",
+               fault, instr, &instr, instrptr, thumb_mode(regs), tinstr);
+       if (fault == 4 && !thumb_mode(regs)) {
+               fault = probe_kernel_address(instrptr, instr);
+               pr_info("RJK: reread instruction: [0x%08lx] = 0x%08lx: %u\n",
+                       instrptr, instr, fault);
+               rjk_debug_point(instrptr);
+       }
+
        if (fault) {
                type = TYPE_FAULT;
                goto bad_or_fault;


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-05 17:10         ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
>> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
>> where I added the small chunk in [2], which gave in my case :
>>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
>
> Right, so as fault is nonzero, this means that we were unable to read the
> instruction.  That seems mad though - the instruction pointer is certainly
> valid, and as we're using probe_kernel_address(), that switches to the
> kernel "segment" before trying to read kernel addresses.  That should
> mean that __copy_from_user_inatomic() is able to read the instruction.
>
> I think this is the root cause of the issue.

And there is more madness to come : I tried to "reread" the instruction [1] a
second time if the first result was 4 :
RJK: fault=4 instr=0x00000000(@c385d72c) instrptr=0xc02b39e8 thumb_mode=0 tinstr=0x0000
RJK: reread instruction: [0xc02b39e8] = 0x10c650b2: 0

Guess what, the second probe_kernel_address() with the same parameters returns
0, and everything works. It's insane.

>> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> It seems you have SW_DOMAIN_PAN enabled.
That's the default arch/arm/Kconfig implies.
And ... this is what also _is_ the cause of this behavior : removing
SW_DOMAIN_PAN makes all my pxa boards work again !!!

Moreover, this is consistent with the fact that this commit is in linux-next but
not in v4.1 :
    a5e090acbf54 ("ARM: software-based priviledged-no-access support")

So the issue is around this SW_DOMAIN_PAN, at least on PXA.

--
Robert

[1]
@@ -787,6 +798,15 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
                instr = __mem_to_opcode_arm(instr);
        }
 
+       pr_info("RJK: fault=%d instr=0x%08lx(@%p) instrptr=0x%08lx thumb_mode=%lu tinstr=0x%04x\n",
+               fault, instr, &instr, instrptr, thumb_mode(regs), tinstr);
+       if (fault == 4 && !thumb_mode(regs)) {
+               fault = probe_kernel_address(instrptr, instr);
+               pr_info("RJK: reread instruction: [0x%08lx] = 0x%08lx: %u\n",
+                       instrptr, instr, fault);
+               rjk_debug_point(instrptr);
+       }
+
        if (fault) {
                type = TYPE_FAULT;
                goto bad_or_fault;

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-05 17:10         ` Robert Jarzmik
@ 2015-09-05 20:38           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-05 20:38 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
> >> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
> >> where I added the small chunk in [2], which gave in my case :
> >>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
> >
> > Right, so as fault is nonzero, this means that we were unable to read the
> > instruction.  That seems mad though - the instruction pointer is certainly
> > valid, and as we're using probe_kernel_address(), that switches to the
> > kernel "segment" before trying to read kernel addresses.  That should
> > mean that __copy_from_user_inatomic() is able to read the instruction.
> >
> > I think this is the root cause of the issue.
> 
> And there is more madness to come : I tried to "reread" the instruction [1] a
> second time if the first result was 4 :
> RJK: fault=4 instr=0x00000000(@c385d72c) instrptr=0xc02b39e8 thumb_mode=0 tinstr=0x0000
> RJK: reread instruction: [0xc02b39e8] = 0x10c650b2: 0
> 
> Guess what, the second probe_kernel_address() with the same parameters returns
> 0, and everything works. It's insane.
> 
> >> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > It seems you have SW_DOMAIN_PAN enabled.
> That's the default arch/arm/Kconfig implies.
> And ... this is what also _is_ the cause of this behavior : removing
> SW_DOMAIN_PAN makes all my pxa boards work again !!!
> 
> Moreover, this is consistent with the fact that this commit is in linux-next but
> not in v4.1 :
>     a5e090acbf54 ("ARM: software-based priviledged-no-access support")
> 
> So the issue is around this SW_DOMAIN_PAN, at least on PXA.

Is it only PXA which seems to be affected?

If so, you may need to add:

	mrc p15, 0, \rd, c2, c0, 0
	mov \rd, \rd
	sub pc, pc, #4

to the places we update the domain access register to ensure that the
Xscale pipeline stalls to allow the CP15 DACR update to hit.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-05 20:38           ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-05 20:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Sat, Sep 05, 2015 at 03:48:38PM +0200, Robert Jarzmik wrote:
> >> This time I took my JTAG to have a look at the flow, in arch/arm/mm/alignment.c,
> >> where I added the small chunk in [2], which gave in my case :
> >>     RJK: fault=4 instr=0x00000000 instrptr=0xc02b37c8 thumb_mode=0 tinstr=0x0000
> >
> > Right, so as fault is nonzero, this means that we were unable to read the
> > instruction.  That seems mad though - the instruction pointer is certainly
> > valid, and as we're using probe_kernel_address(), that switches to the
> > kernel "segment" before trying to read kernel addresses.  That should
> > mean that __copy_from_user_inatomic() is able to read the instruction.
> >
> > I think this is the root cause of the issue.
> 
> And there is more madness to come : I tried to "reread" the instruction [1] a
> second time if the first result was 4 :
> RJK: fault=4 instr=0x00000000(@c385d72c) instrptr=0xc02b39e8 thumb_mode=0 tinstr=0x0000
> RJK: reread instruction: [0xc02b39e8] = 0x10c650b2: 0
> 
> Guess what, the second probe_kernel_address() with the same parameters returns
> 0, and everything works. It's insane.
> 
> >> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > It seems you have SW_DOMAIN_PAN enabled.
> That's the default arch/arm/Kconfig implies.
> And ... this is what also _is_ the cause of this behavior : removing
> SW_DOMAIN_PAN makes all my pxa boards work again !!!
> 
> Moreover, this is consistent with the fact that this commit is in linux-next but
> not in v4.1 :
>     a5e090acbf54 ("ARM: software-based priviledged-no-access support")
> 
> So the issue is around this SW_DOMAIN_PAN, at least on PXA.

Is it only PXA which seems to be affected?

If so, you may need to add:

	mrc p15, 0, \rd, c2, c0, 0
	mov \rd, \rd
	sub pc, pc, #4

to the places we update the domain access register to ensure that the
Xscale pipeline stalls to allow the CP15 DACR update to hit.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-05 20:38           ` Russell King - ARM Linux
@ 2015-09-05 22:12             ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 22:12 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> Moreover, this is consistent with the fact that this commit is in linux-next but
>> not in v4.1 :
>>     a5e090acbf54 ("ARM: software-based priviledged-no-access support")
>> 
>> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
>
> Is it only PXA which seems to be affected?
Sorry I don't know, I only own pxa platforms.

> If so, you may need to add:
>
> 	mrc p15, 0, \rd, c2, c0, 0
> 	mov \rd, \rd
> 	sub pc, pc, #4
>
> to the places we update the domain access register to ensure that the
> Xscale pipeline stalls to allow the CP15 DACR update to hit.
Okay, I'll try that.

By the way, the ARMv5 manual states in chapter "B4.5.1 MMU Fault" that for a
DACR update, a "PrefetchFlush" operation has to be done (chapter B2.6.3
PrefetchFlush CP15 register 7), quoting :
    Changes to the Domain Access Control register are synchronized by performing
    a PrefetchFlush operation (or as result of an exception or exception
    return). See Changes to CP15 registers and the memory order model on page
    B2-24 for details.

Cheers.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-05 22:12             ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-05 22:12 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> Moreover, this is consistent with the fact that this commit is in linux-next but
>> not in v4.1 :
>>     a5e090acbf54 ("ARM: software-based priviledged-no-access support")
>> 
>> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
>
> Is it only PXA which seems to be affected?
Sorry I don't know, I only own pxa platforms.

> If so, you may need to add:
>
> 	mrc p15, 0, \rd, c2, c0, 0
> 	mov \rd, \rd
> 	sub pc, pc, #4
>
> to the places we update the domain access register to ensure that the
> Xscale pipeline stalls to allow the CP15 DACR update to hit.
Okay, I'll try that.

By the way, the ARMv5 manual states in chapter "B4.5.1 MMU Fault" that for a
DACR update, a "PrefetchFlush" operation has to be done (chapter B2.6.3
PrefetchFlush CP15 register 7), quoting :
    Changes to the Domain Access Control register are synchronized by performing
    a PrefetchFlush operation (or as result of an exception or exception
    return). See Changes to CP15 registers and the memory order model on page
    B2-24 for details.

Cheers.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-05 20:38           ` Russell King - ARM Linux
@ 2015-09-06 17:25             ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-06 17:25 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
>
> If so, you may need to add:
>
> 	mrc p15, 0, \rd, c2, c0, 0
> 	mov \rd, \rd
> 	sub pc, pc, #4
>
> to the places we update the domain access register to ensure that the
> Xscale pipeline stalls to allow the CP15 DACR update to hit.
Nope, that didn't work.
I have tried 2 different patches :
 - in [1], your proposed solution
 - in [2], a PrefetchFlush as adviced by ARM Architecture Reference Manual

None of them worked. I confirmed by disassembling __dabt_svc my changes hit the
abort routine, and they did. I'll continue next week by trying to have a closer
look at the SW_DOMAIN_PAN commit (a5e090acbf545), and take time to walk through
the whole Oops for information I have missed.

Cheers.

-- 
Robert

[1] Approach 1 : translation table sync
=======================================
diff --cc arch/arm/include/asm/assembler.h
index 7bbf325a4f31,7bbf325a4f31..6bb46198fd08
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@@ -449,6 -449,6 +449,12 @@@ THUMB(    orr     \reg , \reg , #PSR_T_BIT        
  #endif
        .endm
  
++      .macro  dacr_sync, rd
++      mrc     p15, 0, \rd, c2, c0, 0
++      mov     \rd, \rd
++      sub     pc, pc, #4
++      .endm
++
        .macro  uaccess_disable, tmp, isb=1
  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        /*
@@@ -457,6 -457,6 +463,7 @@@
         */
        mov     \tmp, #DACR_UACCESS_DISABLE
        mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
++      dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@@ -471,6 -471,6 +478,7 @@@
         */
        mov     \tmp, #DACR_UACCESS_ENABLE
        mcr     p15, 0, \tmp, c3, c0, 0
++      dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@@ -488,6 -488,6 +496,7 @@@
  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        ldr     r0, [sp, #S_FRAME_SIZE]
        mcr     p15, 0, r0, c3, c0, 0
++      dacr_sync r0
  #endif
        .endm

[2] Approach 2 : PrefetchFlush
==============================
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 7bbf325a4f31..2152d43d7ede 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -449,6 +449,10 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
 #endif
        .endm
 
+       .macro  dacr_sync, rd
+       mcr     p15, 0, \rd, c7, c5, 4
+       .endm
+
        .macro  uaccess_disable, tmp, isb=1
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        /*
@@ -457,6 +461,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
         */
        mov     \tmp, #DACR_UACCESS_DISABLE
        mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
+       dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@ -471,6 +476,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
         */
        mov     \tmp, #DACR_UACCESS_ENABLE
        mcr     p15, 0, \tmp, c3, c0, 0
+       dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@ -488,6 +494,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        ldr     r0, [sp, #S_FRAME_SIZE]
        mcr     p15, 0, r0, c3, c0, 0
+       dacr_sync r0
 #endif
        .endm

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-06 17:25             ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-06 17:25 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
>
> If so, you may need to add:
>
> 	mrc p15, 0, \rd, c2, c0, 0
> 	mov \rd, \rd
> 	sub pc, pc, #4
>
> to the places we update the domain access register to ensure that the
> Xscale pipeline stalls to allow the CP15 DACR update to hit.
Nope, that didn't work.
I have tried 2 different patches :
 - in [1], your proposed solution
 - in [2], a PrefetchFlush as adviced by ARM Architecture Reference Manual

None of them worked. I confirmed by disassembling __dabt_svc my changes hit the
abort routine, and they did. I'll continue next week by trying to have a closer
look at the SW_DOMAIN_PAN commit (a5e090acbf545), and take time to walk through
the whole Oops for information I have missed.

Cheers.

-- 
Robert

[1] Approach 1 : translation table sync
=======================================
diff --cc arch/arm/include/asm/assembler.h
index 7bbf325a4f31,7bbf325a4f31..6bb46198fd08
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@@ -449,6 -449,6 +449,12 @@@ THUMB(    orr     \reg , \reg , #PSR_T_BIT        
  #endif
        .endm
  
++      .macro  dacr_sync, rd
++      mrc     p15, 0, \rd, c2, c0, 0
++      mov     \rd, \rd
++      sub     pc, pc, #4
++      .endm
++
        .macro  uaccess_disable, tmp, isb=1
  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        /*
@@@ -457,6 -457,6 +463,7 @@@
         */
        mov     \tmp, #DACR_UACCESS_DISABLE
        mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
++      dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@@ -471,6 -471,6 +478,7 @@@
         */
        mov     \tmp, #DACR_UACCESS_ENABLE
        mcr     p15, 0, \tmp, c3, c0, 0
++      dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@@ -488,6 -488,6 +496,7 @@@
  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        ldr     r0, [sp, #S_FRAME_SIZE]
        mcr     p15, 0, r0, c3, c0, 0
++      dacr_sync r0
  #endif
        .endm

[2] Approach 2 : PrefetchFlush
==============================
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 7bbf325a4f31..2152d43d7ede 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -449,6 +449,10 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
 #endif
        .endm
 
+       .macro  dacr_sync, rd
+       mcr     p15, 0, \rd, c7, c5, 4
+       .endm
+
        .macro  uaccess_disable, tmp, isb=1
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        /*
@@ -457,6 +461,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
         */
        mov     \tmp, #DACR_UACCESS_DISABLE
        mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
+       dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@ -471,6 +476,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
         */
        mov     \tmp, #DACR_UACCESS_ENABLE
        mcr     p15, 0, \tmp, c3, c0, 0
+       dacr_sync \tmp
        .if     \isb
        instr_sync
        .endif
@@ -488,6 +494,7 @@ THUMB(      orr     \reg , \reg , #PSR_T_BIT        )
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        ldr     r0, [sp, #S_FRAME_SIZE]
        mcr     p15, 0, r0, c3, c0, 0
+       dacr_sync r0
 #endif
        .endm

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-06 17:25             ` Robert Jarzmik
@ 2015-09-06 19:48               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-06 19:48 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Sun, Sep 06, 2015 at 07:25:01PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
> >> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> >> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
> >
> > If so, you may need to add:
> >
> > 	mrc p15, 0, \rd, c2, c0, 0
> > 	mov \rd, \rd
> > 	sub pc, pc, #4
> >
> > to the places we update the domain access register to ensure that the
> > Xscale pipeline stalls to allow the CP15 DACR update to hit.
> Nope, that didn't work.
> I have tried 2 different patches :
>  - in [1], your proposed solution
>  - in [2], a PrefetchFlush as adviced by ARM Architecture Reference Manual
> 
> None of them worked. I confirmed by disassembling __dabt_svc my changes hit the
> abort routine, and they did. I'll continue next week by trying to have a closer
> look at the SW_DOMAIN_PAN commit (a5e090acbf545), and take time to walk through
> the whole Oops for information I have missed.
> 
> Cheers.
> 
> -- 
> Robert
> 
> [1] Approach 1 : translation table sync
> =======================================
> diff --cc arch/arm/include/asm/assembler.h
> index 7bbf325a4f31,7bbf325a4f31..6bb46198fd08
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@@ -449,6 -449,6 +449,12 @@@ THUMB(    orr     \reg , \reg , #PSR_T_BIT        
>   #endif
>         .endm
>   
> ++      .macro  dacr_sync, rd
> ++      mrc     p15, 0, \rd, c2, c0, 0
> ++      mov     \rd, \rd
> ++      sub     pc, pc, #4
> ++      .endm
> ++
>         .macro  uaccess_disable, tmp, isb=1
>   #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         /*
> @@@ -457,6 -457,6 +463,7 @@@
>          */
>         mov     \tmp, #DACR_UACCESS_DISABLE
>         mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
> ++      dacr_sync \tmp
>         .if     \isb
>         instr_sync
>         .endif
> @@@ -471,6 -471,6 +478,7 @@@
>          */
>         mov     \tmp, #DACR_UACCESS_ENABLE
>         mcr     p15, 0, \tmp, c3, c0, 0
> ++      dacr_sync \tmp
>         .if     \isb
>         instr_sync
>         .endif
> @@@ -488,6 -488,6 +496,7 @@@
>   #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         ldr     r0, [sp, #S_FRAME_SIZE]
>         mcr     p15, 0, r0, c3, c0, 0
> ++      dacr_sync r0
>   #endif
>         .endm

The important place is in arch/arm/include/asm/domain.h, which is where
we manipulate the DACR within probe_kernel_address().

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-06 19:48               ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-06 19:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Sep 06, 2015 at 07:25:01PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Sat, Sep 05, 2015 at 07:10:49PM +0200, Robert Jarzmik wrote:
> >> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> >> So the issue is around this SW_DOMAIN_PAN, at least on PXA.
> >
> > If so, you may need to add:
> >
> > 	mrc p15, 0, \rd, c2, c0, 0
> > 	mov \rd, \rd
> > 	sub pc, pc, #4
> >
> > to the places we update the domain access register to ensure that the
> > Xscale pipeline stalls to allow the CP15 DACR update to hit.
> Nope, that didn't work.
> I have tried 2 different patches :
>  - in [1], your proposed solution
>  - in [2], a PrefetchFlush as adviced by ARM Architecture Reference Manual
> 
> None of them worked. I confirmed by disassembling __dabt_svc my changes hit the
> abort routine, and they did. I'll continue next week by trying to have a closer
> look at the SW_DOMAIN_PAN commit (a5e090acbf545), and take time to walk through
> the whole Oops for information I have missed.
> 
> Cheers.
> 
> -- 
> Robert
> 
> [1] Approach 1 : translation table sync
> =======================================
> diff --cc arch/arm/include/asm/assembler.h
> index 7bbf325a4f31,7bbf325a4f31..6bb46198fd08
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@@ -449,6 -449,6 +449,12 @@@ THUMB(    orr     \reg , \reg , #PSR_T_BIT        
>   #endif
>         .endm
>   
> ++      .macro  dacr_sync, rd
> ++      mrc     p15, 0, \rd, c2, c0, 0
> ++      mov     \rd, \rd
> ++      sub     pc, pc, #4
> ++      .endm
> ++
>         .macro  uaccess_disable, tmp, isb=1
>   #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         /*
> @@@ -457,6 -457,6 +463,7 @@@
>          */
>         mov     \tmp, #DACR_UACCESS_DISABLE
>         mcr     p15, 0, \tmp, c3, c0, 0         @ Set domain register
> ++      dacr_sync \tmp
>         .if     \isb
>         instr_sync
>         .endif
> @@@ -471,6 -471,6 +478,7 @@@
>          */
>         mov     \tmp, #DACR_UACCESS_ENABLE
>         mcr     p15, 0, \tmp, c3, c0, 0
> ++      dacr_sync \tmp
>         .if     \isb
>         instr_sync
>         .endif
> @@@ -488,6 -488,6 +496,7 @@@
>   #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         ldr     r0, [sp, #S_FRAME_SIZE]
>         mcr     p15, 0, r0, c3, c0, 0
> ++      dacr_sync r0
>   #endif
>         .endm

The important place is in arch/arm/include/asm/domain.h, which is where
we manipulate the DACR within probe_kernel_address().

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-06 19:48               ` Russell King - ARM Linux
@ 2015-09-06 21:31                 ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-06 21:31 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> [1] Approach 1 : translation table sync
>> =======================================
...
> The important place is in arch/arm/include/asm/domain.h, which is where
> we manipulate the DACR within probe_kernel_address().

Gah, silly me. But even with [1], I still get an error [2]. I have a
confirmation that I have a "Page Permission" fault on the
probe_kernel_address().

Next thing I'll check is if I can read the TLB cache for the code entry. It's a
very instructive bug for me :)

Cheers.

-- 
Robert

[1] Approach 1 : translation table sync + PrefetchFlush
=======================================================
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 7bbf325a4f31..73d5ad456e32 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -449,6 +449,13 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #endif
 	.endm
 
+	.macro	dacr_sync, rd
+	mrc     p15, 0, \rd, c2, c0, 0
+	mov     \rd, \rd
+	sub     pc, pc, #4
+	mcr     p15, 0, \rd, c7, c5, 4
+	.endm
+
 	.macro	uaccess_disable, tmp, isb=1
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	/*
@@ -457,6 +464,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 	 */
 	mov	\tmp, #DACR_UACCESS_DISABLE
 	mcr	p15, 0, \tmp, c3, c0, 0		@ Set domain register
+	dacr_sync \tmp
 	.if	\isb
 	instr_sync
 	.endif
@@ -471,6 +479,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 	 */
 	mov	\tmp, #DACR_UACCESS_ENABLE
 	mcr	p15, 0, \tmp, c3, c0, 0
+	dacr_sync \tmp
 	.if	\isb
 	instr_sync
 	.endif
@@ -488,6 +497,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	ldr	r0, [sp, #S_FRAME_SIZE]
 	mcr	p15, 0, r0, c3, c0, 0
+	dacr_sync r0
 #endif
 	.endm
 
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..10c9a38636ac 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -97,7 +97,11 @@ static inline unsigned int get_domain(void)
 static inline void set_domain(unsigned val)
 {
 	asm volatile(
-	"mcr	p15, 0, %0, c3, c0	@ set domain"
+	"mcr	p15, 0, %0, c3, c0;	@ set domain	\
+	mrc	p15, 0, %0, c2, c0, 0;			\
+	mov	%0, %0;					\
+	sub	pc, pc, #4;				\
+	mcr     p15, 0, %0, c7, c5, 4"
 	  : : "r" (val));
 	isb();
 }
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 9769f1eefe3b..c9c454129344 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -747,6 +747,27 @@ do_alignment_t32_to_handler(unsigned long *pinstr, struct pt_regs *regs,
 	return NULL;
 }
 
+static u32 far_read(void)
+{
+        u32 far;
+        asm("mrc        p15, 0, %0, c6, c0, 0" : "=r" (far));
+        return far;
+}
+
+static u32 fsr_read(void)
+{
+        u32 fsr;
+        asm("mrc        p15, 0, %0, c5, c0, 0" : "=r" (fsr));
+        return fsr;
+}
+
+static u32 dacr_read(void)
+{
+        u32 dacr;
+        asm("mrc        p15, 0, %0, c3, c0, 0" : "=r" (dacr));
+        return dacr;
+}
+
 static int
 do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
@@ -763,6 +784,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		local_irq_enable();
 
 	instrptr = instruction_pointer(regs);
+	pr_info("RJK1: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
+	pr_info("RJK2: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
 
 	if (thumb_mode(regs)) {
 		u16 *ptr = (u16 *)(instrptr & ~1);
@@ -787,6 +810,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		instr = __mem_to_opcode_arm(instr);
 	}
 
+	pr_info("RJK3: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
+	pr_info("RJK: fault=%d instr=0x%08x instrptr=%p\n", fault, instr, instrptr);
 	if (fault) {
 		type = TYPE_FAULT;
 		goto bad_or_fault;

[2] Oops
========
# insmod /tmp/unalign.ko 
RJK1: fsr=23 far=e1c23643 dacr=51
RJK2: fsr=23 far=e1c23643 dacr=51
RJK3: fsr=2f far=bf00202c dacr=51
RJK: fault=4 instr=0x00000000 instrptr=bf00202c
Unable to handle kernel paging request at virtual address e1c23643
pgd = e1cd4000
[e1c23643] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 608 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #919
Hardware name: CM-X300 module
task: e1c69880 ti: e1cb0000 task.ti: e1cb0000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1cb1df8  ip : e1c23e00  fp : 1e3dc89c
r10: e1c23780  r9 : 00000001  r8 : 00000000
r7 : bf002000  r6 : e1ca65c0  r5 : c0b85b80  r4 : c0b85b80
r3 : e1c23640  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1cd4018  DAC: 00000051
Process insmod (pid: 608, stack limit = 0xe1cb0198)

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-06 21:31                 ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-06 21:31 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> [1] Approach 1 : translation table sync
>> =======================================
...
> The important place is in arch/arm/include/asm/domain.h, which is where
> we manipulate the DACR within probe_kernel_address().

Gah, silly me. But even with [1], I still get an error [2]. I have a
confirmation that I have a "Page Permission" fault on the
probe_kernel_address().

Next thing I'll check is if I can read the TLB cache for the code entry. It's a
very instructive bug for me :)

Cheers.

-- 
Robert

[1] Approach 1 : translation table sync + PrefetchFlush
=======================================================
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 7bbf325a4f31..73d5ad456e32 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -449,6 +449,13 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #endif
 	.endm
 
+	.macro	dacr_sync, rd
+	mrc     p15, 0, \rd, c2, c0, 0
+	mov     \rd, \rd
+	sub     pc, pc, #4
+	mcr     p15, 0, \rd, c7, c5, 4
+	.endm
+
 	.macro	uaccess_disable, tmp, isb=1
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	/*
@@ -457,6 +464,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 	 */
 	mov	\tmp, #DACR_UACCESS_DISABLE
 	mcr	p15, 0, \tmp, c3, c0, 0		@ Set domain register
+	dacr_sync \tmp
 	.if	\isb
 	instr_sync
 	.endif
@@ -471,6 +479,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 	 */
 	mov	\tmp, #DACR_UACCESS_ENABLE
 	mcr	p15, 0, \tmp, c3, c0, 0
+	dacr_sync \tmp
 	.if	\isb
 	instr_sync
 	.endif
@@ -488,6 +497,7 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	ldr	r0, [sp, #S_FRAME_SIZE]
 	mcr	p15, 0, r0, c3, c0, 0
+	dacr_sync r0
 #endif
 	.endm
 
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..10c9a38636ac 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -97,7 +97,11 @@ static inline unsigned int get_domain(void)
 static inline void set_domain(unsigned val)
 {
 	asm volatile(
-	"mcr	p15, 0, %0, c3, c0	@ set domain"
+	"mcr	p15, 0, %0, c3, c0;	@ set domain	\
+	mrc	p15, 0, %0, c2, c0, 0;			\
+	mov	%0, %0;					\
+	sub	pc, pc, #4;				\
+	mcr     p15, 0, %0, c7, c5, 4"
 	  : : "r" (val));
 	isb();
 }
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 9769f1eefe3b..c9c454129344 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -747,6 +747,27 @@ do_alignment_t32_to_handler(unsigned long *pinstr, struct pt_regs *regs,
 	return NULL;
 }
 
+static u32 far_read(void)
+{
+        u32 far;
+        asm("mrc        p15, 0, %0, c6, c0, 0" : "=r" (far));
+        return far;
+}
+
+static u32 fsr_read(void)
+{
+        u32 fsr;
+        asm("mrc        p15, 0, %0, c5, c0, 0" : "=r" (fsr));
+        return fsr;
+}
+
+static u32 dacr_read(void)
+{
+        u32 dacr;
+        asm("mrc        p15, 0, %0, c3, c0, 0" : "=r" (dacr));
+        return dacr;
+}
+
 static int
 do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
@@ -763,6 +784,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		local_irq_enable();
 
 	instrptr = instruction_pointer(regs);
+	pr_info("RJK1: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
+	pr_info("RJK2: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
 
 	if (thumb_mode(regs)) {
 		u16 *ptr = (u16 *)(instrptr & ~1);
@@ -787,6 +810,8 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		instr = __mem_to_opcode_arm(instr);
 	}
 
+	pr_info("RJK3: fsr=%x far=%x dacr=%x\n", fsr_read(), far_read(), dacr_read());
+	pr_info("RJK: fault=%d instr=0x%08x instrptr=%p\n", fault, instr, instrptr);
 	if (fault) {
 		type = TYPE_FAULT;
 		goto bad_or_fault;

[2] Oops
========
# insmod /tmp/unalign.ko 
RJK1: fsr=23 far=e1c23643 dacr=51
RJK2: fsr=23 far=e1c23643 dacr=51
RJK3: fsr=2f far=bf00202c dacr=51
RJK: fault=4 instr=0x00000000 instrptr=bf00202c
Unable to handle kernel paging request at virtual address e1c23643
pgd = e1cd4000
[e1c23643] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 608 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #919
Hardware name: CM-X300 module
task: e1c69880 ti: e1cb0000 task.ti: e1cb0000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1cb1df8  ip : e1c23e00  fp : 1e3dc89c
r10: e1c23780  r9 : 00000001  r8 : 00000000
r7 : bf002000  r6 : e1ca65c0  r5 : c0b85b80  r4 : c0b85b80
r3 : e1c23640  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1cd4018  DAC: 00000051
Process insmod (pid: 608, stack limit = 0xe1cb0198)

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-06 21:31                 ` Robert Jarzmik
@ 2015-09-06 23:54                   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-06 23:54 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Sun, Sep 06, 2015 at 11:31:34PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> >> [1] Approach 1 : translation table sync
> >> =======================================
> ...
> > The important place is in arch/arm/include/asm/domain.h, which is where
> > we manipulate the DACR within probe_kernel_address().
> 
> Gah, silly me. But even with [1], I still get an error [2]. I have a
> confirmation that I have a "Page Permission" fault on the
> probe_kernel_address().

Hmm, that's not right.  If it's the DACR, then it should be a page domain
fault, not a page permission fault.

> [2] Oops
> ========
> # insmod /tmp/unalign.ko 
> RJK1: fsr=23 far=e1c23643 dacr=51
> RJK2: fsr=23 far=e1c23643 dacr=51
> RJK3: fsr=2f far=bf00202c dacr=51
> RJK: fault=4 instr=0x00000000 instrptr=bf00202c

Can you add a show_pte(current->mm, instrptr) to dump those page
table entries please?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-06 23:54                   ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-06 23:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Sep 06, 2015 at 11:31:34PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> >> [1] Approach 1 : translation table sync
> >> =======================================
> ...
> > The important place is in arch/arm/include/asm/domain.h, which is where
> > we manipulate the DACR within probe_kernel_address().
> 
> Gah, silly me. But even with [1], I still get an error [2]. I have a
> confirmation that I have a "Page Permission" fault on the
> probe_kernel_address().

Hmm, that's not right.  If it's the DACR, then it should be a page domain
fault, not a page permission fault.

> [2] Oops
> ========
> # insmod /tmp/unalign.ko 
> RJK1: fsr=23 far=e1c23643 dacr=51
> RJK2: fsr=23 far=e1c23643 dacr=51
> RJK3: fsr=2f far=bf00202c dacr=51
> RJK: fault=4 instr=0x00000000 instrptr=bf00202c

Can you add a show_pte(current->mm, instrptr) to dump those page
table entries please?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-06 23:54                   ` Russell King - ARM Linux
@ 2015-09-08 17:01                     ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-08 17:01 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> Gah, silly me. But even with [1], I still get an error [2]. I have a
>> confirmation that I have a "Page Permission" fault on the
>> probe_kernel_address().
>
> Hmm, that's not right.  If it's the DACR, then it should be a page domain
> fault, not a page permission fault.
>
>> [2] Oops
>> ========
>> # insmod /tmp/unalign.ko 
>> RJK1: fsr=23 far=e1c23643 dacr=51
>> RJK2: fsr=23 far=e1c23643 dacr=51
>> RJK3: fsr=2f far=bf00202c dacr=51
>> RJK: fault=4 instr=0x00000000 instrptr=bf00202c
>
> Can you add a show_pte(current->mm, instrptr) to dump those page
> table entries please?
Most certainly, here we go :

# insmod /tmp/unalign.ko 
RJK1: fsr=23 far=e1c1f743 dacr=51
RJK2: fsr=23 far=e1c1f743 dacr=51
pgd = e1cc4000
[bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f
RJK3: fsr=2f far=bf00202c dacr=51
RJK4: fault=4 instr=0x00000000 instrptr=bf00202c
pgd = e1cc4000
[bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f

Unable to handle kernel paging request at virtual address e1c1f743
pgd = e1cc4000
[e1c1f743] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 608 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #926
Hardware name: CM-X300 module
task: e1c68380 ti: e1c84000 task.ti: e1c84000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1c85df8  ip : e1c1f700  fp : 1e3e041c
r10: e1c1fc00  r9 : 00000001  r8 : 00000000
r7 : bf002000  r6 : e1cad660  r5 : c0b85b80  r4 : c0b85b80
r3 : e1c1f740  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1cc4018  DAC: 00000051
Process insmod (pid: 608, stack limit = 0xe1c84198)

It happens on both mioa701(pxa270) and cm-x300(pxa310), with the same
cross-compiler+host and kernel source.
Yet doesn't happen on zylonite(pxa310), but different cross-compiler+host.

I'll try to have a single kernel (binary) tried over the cm-x300 and zylonite to
cross-check.

Cheers.

-- 
Robert

PS: unalign.ko is a module which does a p=kmalloc(4096), then dereferences
    *(p+3)

[1] Personal memo: memory pagetables
====================================
# cat /sys/kernel/debug/kernel_page_tables 
---[ Modules ]---
0xbf000000-0xbf001000           4K     RW x      MEM/CACHED/WBRA
0xbf002000-0xbf003000           4K     RW x      MEM/CACHED/WBRA
---[ Kernel Mapping ]---
0xc0000000-0xc4000000          64M     RW x     
0xe0000000-0xe4000000          64M     RW x     
---[ vmalloc() Area ]---
0xe4804000-0xe4844000         256K     RW NX     SO/UNCACHED
0xe4845000-0xe4850000          44K     RW NX     MEM/CACHED/WBRA
0xe485a000-0xe485b000           4K     RW NX SHD DEV/SHARED
0xe485c000-0xe485d000           4K     RW NX SHD DEV/SHARED
0xe485e000-0xe485f000           4K     RW NX     SO/UNCACHED
0xe4860000-0xe4870000          64K     RW NX SHD DEV/SHARED
0xe487a000-0xe487d000          12K     RW NX     MEM/CACHED/WBRA
0xe4880000-0xe48c0000         256K     RW NX SHD DEV/SHARED
0xe48c1000-0xe4903000         264K     RW NX     MEM/CACHED/WBRA
0xe4904000-0xe491e000         104K     RW NX     SO/UNCACHED
0xe49a0000-0xe49b0000          64K     RW NX SHD DEV/SHARED
0xe49b1000-0xe49d5000         144K     RW NX     MEM/CACHED/WBRA
0xe49d6000-0xe49e1000          44K     RW NX     MEM/CACHED/WBRA
0xf2000000-0xf4000000          32M     RW x  SHD
0xf6000000-0xf6200000           2M     RW x  SHD
0xf6200000-0xf6201000           4K     RW NX SHD DEV/SHARED
0xf6300000-0xf6400000           1M     RW x  SHD
---[ vmalloc() End ]---
---[ Fixmap Area ]---
---[ Vectors ]---
0xffff0000-0xffff1000           4K USR ro x      MEM/CACHED/WBRA
0xffff1000-0xffff2000           4K     ro x      MEM/CACHED/WBRA
---[ Vectors End ]---

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-08 17:01                     ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-08 17:01 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

>> Gah, silly me. But even with [1], I still get an error [2]. I have a
>> confirmation that I have a "Page Permission" fault on the
>> probe_kernel_address().
>
> Hmm, that's not right.  If it's the DACR, then it should be a page domain
> fault, not a page permission fault.
>
>> [2] Oops
>> ========
>> # insmod /tmp/unalign.ko 
>> RJK1: fsr=23 far=e1c23643 dacr=51
>> RJK2: fsr=23 far=e1c23643 dacr=51
>> RJK3: fsr=2f far=bf00202c dacr=51
>> RJK: fault=4 instr=0x00000000 instrptr=bf00202c
>
> Can you add a show_pte(current->mm, instrptr) to dump those page
> table entries please?
Most certainly, here we go :

# insmod /tmp/unalign.ko 
RJK1: fsr=23 far=e1c1f743 dacr=51
RJK2: fsr=23 far=e1c1f743 dacr=51
pgd = e1cc4000
[bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f
RJK3: fsr=2f far=bf00202c dacr=51
RJK4: fault=4 instr=0x00000000 instrptr=bf00202c
pgd = e1cc4000
[bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f

Unable to handle kernel paging request at virtual address e1c1f743
pgd = e1cc4000
[e1c1f743] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 608 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #926
Hardware name: CM-X300 module
task: e1c68380 ti: e1c84000 task.ti: e1c84000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1c85df8  ip : e1c1f700  fp : 1e3e041c
r10: e1c1fc00  r9 : 00000001  r8 : 00000000
r7 : bf002000  r6 : e1cad660  r5 : c0b85b80  r4 : c0b85b80
r3 : e1c1f740  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1cc4018  DAC: 00000051
Process insmod (pid: 608, stack limit = 0xe1c84198)

It happens on both mioa701(pxa270) and cm-x300(pxa310), with the same
cross-compiler+host and kernel source.
Yet doesn't happen on zylonite(pxa310), but different cross-compiler+host.

I'll try to have a single kernel (binary) tried over the cm-x300 and zylonite to
cross-check.

Cheers.

-- 
Robert

PS: unalign.ko is a module which does a p=kmalloc(4096), then dereferences
    *(p+3)

[1] Personal memo: memory pagetables
====================================
# cat /sys/kernel/debug/kernel_page_tables 
---[ Modules ]---
0xbf000000-0xbf001000           4K     RW x      MEM/CACHED/WBRA
0xbf002000-0xbf003000           4K     RW x      MEM/CACHED/WBRA
---[ Kernel Mapping ]---
0xc0000000-0xc4000000          64M     RW x     
0xe0000000-0xe4000000          64M     RW x     
---[ vmalloc() Area ]---
0xe4804000-0xe4844000         256K     RW NX     SO/UNCACHED
0xe4845000-0xe4850000          44K     RW NX     MEM/CACHED/WBRA
0xe485a000-0xe485b000           4K     RW NX SHD DEV/SHARED
0xe485c000-0xe485d000           4K     RW NX SHD DEV/SHARED
0xe485e000-0xe485f000           4K     RW NX     SO/UNCACHED
0xe4860000-0xe4870000          64K     RW NX SHD DEV/SHARED
0xe487a000-0xe487d000          12K     RW NX     MEM/CACHED/WBRA
0xe4880000-0xe48c0000         256K     RW NX SHD DEV/SHARED
0xe48c1000-0xe4903000         264K     RW NX     MEM/CACHED/WBRA
0xe4904000-0xe491e000         104K     RW NX     SO/UNCACHED
0xe49a0000-0xe49b0000          64K     RW NX SHD DEV/SHARED
0xe49b1000-0xe49d5000         144K     RW NX     MEM/CACHED/WBRA
0xe49d6000-0xe49e1000          44K     RW NX     MEM/CACHED/WBRA
0xf2000000-0xf4000000          32M     RW x  SHD
0xf6000000-0xf6200000           2M     RW x  SHD
0xf6200000-0xf6201000           4K     RW NX SHD DEV/SHARED
0xf6300000-0xf6400000           1M     RW x  SHD
---[ vmalloc() End ]---
---[ Fixmap Area ]---
---[ Vectors ]---
0xffff0000-0xffff1000           4K USR ro x      MEM/CACHED/WBRA
0xffff1000-0xffff2000           4K     ro x      MEM/CACHED/WBRA
---[ Vectors End ]---

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-08 17:01                     ` Robert Jarzmik
@ 2015-09-08 20:08                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-08 20:08 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> >> Gah, silly me. But even with [1], I still get an error [2]. I have a
> >> confirmation that I have a "Page Permission" fault on the
> >> probe_kernel_address().
> >
> > Hmm, that's not right.  If it's the DACR, then it should be a page domain
> > fault, not a page permission fault.
> >
> >> [2] Oops
> >> ========
> >> # insmod /tmp/unalign.ko 
> >> RJK1: fsr=23 far=e1c23643 dacr=51
> >> RJK2: fsr=23 far=e1c23643 dacr=51
> >> RJK3: fsr=2f far=bf00202c dacr=51
> >> RJK: fault=4 instr=0x00000000 instrptr=bf00202c
> >
> > Can you add a show_pte(current->mm, instrptr) to dump those page
> > table entries please?
> Most certainly, here we go :
> 
> # insmod /tmp/unalign.ko 
> RJK1: fsr=23 far=e1c1f743 dacr=51
> RJK2: fsr=23 far=e1c1f743 dacr=51
> pgd = e1cc4000
> [bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f
> RJK3: fsr=2f far=bf00202c dacr=51
> RJK4: fault=4 instr=0x00000000 instrptr=bf00202c
> pgd = e1cc4000
> [bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f

Okay, so domain = 2 (which for an Xscale 3 kernel is DOMAIN_KERNEL).
The page table entry has AP=01, which is user no-access, svc read/write.

What should happen is:

#define probe_kernel_address(addr, retval)              \
        ({                                              \
                long ret;                               \
                mm_segment_t old_fs = get_fs();         \
                                                        \
                set_fs(KERNEL_DS);                      \

This should update the DACR from 0x51 to 0x71 - switching domain 2 to
manager mode.

                pagefault_disable();                    \
                ret = __copy_from_user_inatomic(&(retval), \
			(__force typeof(retval) __user *)(addr), \
			sizeof(retval)); \
...
__copy_from_user_inatomic is an alias for __copy_from_user:

static inline unsigned long __must_check
__copy_from_user(void *to, const void __user *from, unsigned long n)
{
        unsigned int __ua_flags = uaccess_save_and_enable();

and uaccess_save_and_enable() does:

        unsigned int old_domain = get_domain();

        /* Set the current domain access to permit user accesses */
        set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
                   domain_val(DOMAIN_USER, DOMAIN_CLIENT));

So this should then end up changing the DACR from 0x71 to 0x75,
enabling user access (because this function may access userspace.)

What this means is that (going back to __copy_from_user):

        n = arm_copy_from_user(to, from, n);

At the point we call into this code, the DACR should be 0x75, which
should allow us to read the instruction at 0xbf00202c.  But this is
failing with a permission error - which it would do if it thought
the kernel domain was in manager mode (iow, 0x55).

I think some debugging in the above areas is needed - but I can't
see anything wrong.  If something were wrong, I'm pretty sure we'd
have every pre-ARMv6 machine exploding right now because of it -
and oopses wouldn't work.

You're clearly getting oopses reported correctly, which rather
proves out this code path.

Now, when you get the fault inside arm_copy_from_user(), you can
print the DACR value saved at the time the fault was generated by
printing the word above struct pt_regs on the stack - add to
arch/arm/mm/fault.c:do_DataAbort():

if (addr == 0xbf00202c) printk("DACR=0x%08x\n", *(u32 *)(regs + 1));

before the "if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))" line.
That'll tell us what the DACR register was when we saved it.

If it isn't 0x75, then the next part of the puzzle is going to be
working out why it isn't.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-08 20:08                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-08 20:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> >> Gah, silly me. But even with [1], I still get an error [2]. I have a
> >> confirmation that I have a "Page Permission" fault on the
> >> probe_kernel_address().
> >
> > Hmm, that's not right.  If it's the DACR, then it should be a page domain
> > fault, not a page permission fault.
> >
> >> [2] Oops
> >> ========
> >> # insmod /tmp/unalign.ko 
> >> RJK1: fsr=23 far=e1c23643 dacr=51
> >> RJK2: fsr=23 far=e1c23643 dacr=51
> >> RJK3: fsr=2f far=bf00202c dacr=51
> >> RJK: fault=4 instr=0x00000000 instrptr=bf00202c
> >
> > Can you add a show_pte(current->mm, instrptr) to dump those page
> > table entries please?
> Most certainly, here we go :
> 
> # insmod /tmp/unalign.ko 
> RJK1: fsr=23 far=e1c1f743 dacr=51
> RJK2: fsr=23 far=e1c1f743 dacr=51
> pgd = e1cc4000
> [bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f
> RJK3: fsr=2f far=bf00202c dacr=51
> RJK4: fault=4 instr=0x00000000 instrptr=bf00202c
> pgd = e1cc4000
> [bf00202c] *pgd=c1cab851, *pte=c1cb504f, *ppte=c1cb501f

Okay, so domain = 2 (which for an Xscale 3 kernel is DOMAIN_KERNEL).
The page table entry has AP=01, which is user no-access, svc read/write.

What should happen is:

#define probe_kernel_address(addr, retval)              \
        ({                                              \
                long ret;                               \
                mm_segment_t old_fs = get_fs();         \
                                                        \
                set_fs(KERNEL_DS);                      \

This should update the DACR from 0x51 to 0x71 - switching domain 2 to
manager mode.

                pagefault_disable();                    \
                ret = __copy_from_user_inatomic(&(retval), \
			(__force typeof(retval) __user *)(addr), \
			sizeof(retval)); \
...
__copy_from_user_inatomic is an alias for __copy_from_user:

static inline unsigned long __must_check
__copy_from_user(void *to, const void __user *from, unsigned long n)
{
        unsigned int __ua_flags = uaccess_save_and_enable();

and uaccess_save_and_enable() does:

        unsigned int old_domain = get_domain();

        /* Set the current domain access to permit user accesses */
        set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
                   domain_val(DOMAIN_USER, DOMAIN_CLIENT));

So this should then end up changing the DACR from 0x71 to 0x75,
enabling user access (because this function may access userspace.)

What this means is that (going back to __copy_from_user):

        n = arm_copy_from_user(to, from, n);

At the point we call into this code, the DACR should be 0x75, which
should allow us to read the instruction at 0xbf00202c.  But this is
failing with a permission error - which it would do if it thought
the kernel domain was in manager mode (iow, 0x55).

I think some debugging in the above areas is needed - but I can't
see anything wrong.  If something were wrong, I'm pretty sure we'd
have every pre-ARMv6 machine exploding right now because of it -
and oopses wouldn't work.

You're clearly getting oopses reported correctly, which rather
proves out this code path.

Now, when you get the fault inside arm_copy_from_user(), you can
print the DACR value saved at the time the fault was generated by
printing the word above struct pt_regs on the stack - add to
arch/arm/mm/fault.c:do_DataAbort():

if (addr == 0xbf00202c) printk("DACR=0x%08x\n", *(u32 *)(regs + 1));

before the "if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))" line.
That'll tell us what the DACR register was when we saved it.

If it isn't 0x75, then the next part of the puzzle is going to be
working out why it isn't.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-08 20:08                       ` Russell King - ARM Linux
@ 2015-09-08 20:46                         ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-08 20:46 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> What should happen is:
Thanks very much for the explanation, hopefully I have enough material to fly on
my own now.

> Now, when you get the fault inside arm_copy_from_user(), you can
> print the DACR value saved at the time the fault was generated by
> printing the word above struct pt_regs on the stack - add to
> arch/arm/mm/fault.c:do_DataAbort():
>
> if (addr == 0xbf00202c) printk("DACR=0x%08x\n", *(u32 *)(regs + 1));
>
> before the "if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))" line.
> That'll tell us what the DACR register was when we saved it.
>
> If it isn't 0x75, then the next part of the puzzle is going to be
> working out why it isn't.
It's 0x55. I'll track down how this happens, there are not that many places
where DACR is touched, and I'm in a very controlled environement, so I can
cunningly place JTAG breakpoints and watch DACR.

I'll report once I have a better idea what is happening, that might take me a
couple of days given that most of my workforce is available on weekends only.

Thanks again for the free lesson.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-08 20:46                         ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-08 20:46 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> What should happen is:
Thanks very much for the explanation, hopefully I have enough material to fly on
my own now.

> Now, when you get the fault inside arm_copy_from_user(), you can
> print the DACR value saved at the time the fault was generated by
> printing the word above struct pt_regs on the stack - add to
> arch/arm/mm/fault.c:do_DataAbort():
>
> if (addr == 0xbf00202c) printk("DACR=0x%08x\n", *(u32 *)(regs + 1));
>
> before the "if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))" line.
> That'll tell us what the DACR register was when we saved it.
>
> If it isn't 0x75, then the next part of the puzzle is going to be
> working out why it isn't.
It's 0x55. I'll track down how this happens, there are not that many places
where DACR is touched, and I'm in a very controlled environement, so I can
cunningly place JTAG breakpoints and watch DACR.

I'll report once I have a better idea what is happening, that might take me a
couple of days given that most of my workforce is available on weekends only.

Thanks again for the free lesson.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-08 20:08                       ` Russell King - ARM Linux
@ 2015-09-09 23:06                         ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-09 23:06 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> At the point we call into this code, the DACR should be 0x75, which
> should allow us to read the instruction at 0xbf00202c.  But this is
> failing with a permission error - which it would do if it thought
> the kernel domain was in manager mode (iow, 0x55).

Okay Russell, I have a good idea what's happening now. Basically, it boils down
to compiler optimization of get_domain() which is called twice (set_fs() ->
modify_domain() -> get_domain()). See the piece in [1] for a more complete
explanation.

I still haven't finished my work, as I need to disassemble the do_alignment()
function to confirm the DACR read by get_domain() is only done once in the
probe_kernel_address() call, and yet my hopes are high this is the cause as I
traced the DACR modifications, which led me to :
[-0] 0xc0017080: set_domain(0x00000055) => dacr = 0x00000055 => second set_domain()
[-1] 0xc0017080: set_domain(0x00000071) => dacr = 0x00000071 => first set_domain()
[-2] 0xc008a124: set_domain(0x00000051) => dacr = 0x00000051

Once I have my disassembly properly analyzed (ie. the second set_fs() doesn't do
a mrc instruction), I'll have my proof. My setup is too full of traces and
attempts to stall pipeline/prefetch to conclude yet, but there is a fair chance
I'm closer to the solution now.

Cheers.

--
Robert

[1] Current patch
=================
---8<---
>From 07cdda877b1b0c4fcdba3d756f6a22ce035ee672 Mon Sep 17 00:00:00 2001
From: Robert Jarzmik <robert.jarzmik@free.fr>
Date: Thu, 10 Sep 2015 00:32:26 +0200
Subject: [PATCH] ARM: fix domain access

On the pxa310 platform, unaligned accesses are not fixed anymore with
the SW_DOMAIN_PAN configuration activated.

The trouble is that probe_kernel_address() cannot read the faulting
instruction, because of a permission error. The permission error comes
from the DACR register, whose value is incorrectly set to 0x55 instead
of 0x75.

I appears that at least gcc 4.8.2 optmizes the do_alignment() function,
and inside probe_kernel_address(), where set_fs() is called twice,
ie. modify_domain() is called twice, the get_domain() is called only
once, and the first call's value is reused. As a consequence, instead of
changing DACR as 0x51 -> 0x71 -> 0x75, it instead does 0x51 -> 0x71 ->
0x55. This is because instead of doing the last transition as :
 - (as (0x71 & ~0x03) | 0x04 = 0x75)
it does
 - (as (0x51 & ~0x03) | 0x04 = 0x55)

As a consequence, the alignment fault cannot be fixed, and triggers an
Oops:
Unable to handle kernel paging request at virtual address e1c20ec3
pgd = e1ce4000
[e1c20ec3] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 610 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #946
Hardware name: CM-X300 module
task: e1c69500 ti: e1cc0000 task.ti: e1cc0000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1cc1df0  ip : e1c20640  fp : 1e3dffdc
r10: 00000001  r9 : e1c20040  r8 : 00000000
r7 : bf002000  r6 : e1c5ee80  r5 : c0be5b80  r4 : c0be5b80
r3 : e1c20ec0  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1ce4018  DAC: 00000051
Process insmod (pid: 610, stack limit = 0xe1cc0198)
Stack: (0xe1cc1df0 to 0xe1cc2000)

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 arch/arm/include/asm/domain.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 10c9a38636ac..47f3b8445237 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -87,7 +87,7 @@ static inline unsigned int get_domain(void)
 {
 	unsigned int domain;
 
-	asm(
+	asm volatile(
 	"mrc	p15, 0, %0, c3, c0	@ get domain"
 	 : "=r" (domain));
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-09 23:06                         ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-09 23:06 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> At the point we call into this code, the DACR should be 0x75, which
> should allow us to read the instruction at 0xbf00202c.  But this is
> failing with a permission error - which it would do if it thought
> the kernel domain was in manager mode (iow, 0x55).

Okay Russell, I have a good idea what's happening now. Basically, it boils down
to compiler optimization of get_domain() which is called twice (set_fs() ->
modify_domain() -> get_domain()). See the piece in [1] for a more complete
explanation.

I still haven't finished my work, as I need to disassemble the do_alignment()
function to confirm the DACR read by get_domain() is only done once in the
probe_kernel_address() call, and yet my hopes are high this is the cause as I
traced the DACR modifications, which led me to :
[-0] 0xc0017080: set_domain(0x00000055) => dacr = 0x00000055 => second set_domain()
[-1] 0xc0017080: set_domain(0x00000071) => dacr = 0x00000071 => first set_domain()
[-2] 0xc008a124: set_domain(0x00000051) => dacr = 0x00000051

Once I have my disassembly properly analyzed (ie. the second set_fs() doesn't do
a mrc instruction), I'll have my proof. My setup is too full of traces and
attempts to stall pipeline/prefetch to conclude yet, but there is a fair chance
I'm closer to the solution now.

Cheers.

--
Robert

[1] Current patch
=================
---8<---
>From 07cdda877b1b0c4fcdba3d756f6a22ce035ee672 Mon Sep 17 00:00:00 2001
From: Robert Jarzmik <robert.jarzmik@free.fr>
Date: Thu, 10 Sep 2015 00:32:26 +0200
Subject: [PATCH] ARM: fix domain access

On the pxa310 platform, unaligned accesses are not fixed anymore with
the SW_DOMAIN_PAN configuration activated.

The trouble is that probe_kernel_address() cannot read the faulting
instruction, because of a permission error. The permission error comes
from the DACR register, whose value is incorrectly set to 0x55 instead
of 0x75.

I appears that at least gcc 4.8.2 optmizes the do_alignment() function,
and inside probe_kernel_address(), where set_fs() is called twice,
ie. modify_domain() is called twice, the get_domain() is called only
once, and the first call's value is reused. As a consequence, instead of
changing DACR as 0x51 -> 0x71 -> 0x75, it instead does 0x51 -> 0x71 ->
0x55. This is because instead of doing the last transition as :
 - (as (0x71 & ~0x03) | 0x04 = 0x75)
it does
 - (as (0x51 & ~0x03) | 0x04 = 0x55)

As a consequence, the alignment fault cannot be fixed, and triggers an
Oops:
Unable to handle kernel paging request at virtual address e1c20ec3
pgd = e1ce4000
[e1c20ec3] *pgd=c1c0044e(bad)
Internal error: Oops: 823 [#1] ARM
Modules linked in: unalign(+)
CPU: 0 PID: 610 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #946
Hardware name: CM-X300 module
task: e1c69500 ti: e1cc0000 task.ti: e1cc0000
PC is at u_init+0x2c/0x40 [unalign]
LR is at u_init+0x14/0x40 [unalign]
pc : [<bf00202c>]    lr : [<bf002014>]    psr: a0000013
sp : e1cc1df0  ip : e1c20640  fp : 1e3dffdc
r10: 00000001  r9 : e1c20040  r8 : 00000000
r7 : bf002000  r6 : e1c5ee80  r5 : c0be5b80  r4 : c0be5b80
r3 : e1c20ec0  r2 : 00000004  r1 : a0000013  r0 : 00000000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 0000397f  Table: c1ce4018  DAC: 00000051
Process insmod (pid: 610, stack limit = 0xe1cc0198)
Stack: (0xe1cc1df0 to 0xe1cc2000)

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 arch/arm/include/asm/domain.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 10c9a38636ac..47f3b8445237 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -87,7 +87,7 @@ static inline unsigned int get_domain(void)
 {
 	unsigned int domain;
 
-	asm(
+	asm volatile(
 	"mrc	p15, 0, %0, c3, c0	@ get domain"
 	 : "=r" (domain));
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-09 23:06                         ` Robert Jarzmik
@ 2015-09-10 19:01                           ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-10 19:01 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Robert Jarzmik <robert.jarzmik@free.fr> writes:

> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>
>> On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
>>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> At the point we call into this code, the DACR should be 0x75, which
>> should allow us to read the instruction at 0xbf00202c.  But this is
>> failing with a permission error - which it would do if it thought
>> the kernel domain was in manager mode (iow, 0x55).
>
> Okay Russell, I have a good idea what's happening now. Basically, it boils down
> to compiler optimization of get_domain() which is called twice (set_fs() ->
> modify_domain() -> get_domain()). See the piece in [1] for a more complete
> explanation.
>
> I still haven't finished my work, as I need to disassemble the do_alignment()
And I have the proof of gcc optimization, which I'll add to the commit message
if you want :
00000728 <do_alignment>:
     ...
     770:	ee134f10 	mrc	15, 0, r4, cr3, cr0, {0}
     ... no r4 or mrc/mcr usage
     788:	e3842030 	orr	r2, r4, #48	; 0x30
     ... no r2/r4 or mrc/mcr usage
     794:	ee032f10 	mcr	15, 0, r2, cr3, cr0, {0}
     798:	ee07cf95 	mcr	15, 0, ip, cr7, cr5, {4}
     ... no r4 or mrc/mcr usage
     7ac:	e3c4300c 	bic	r3, r4, #12
     7b0:	e3833004 	orr	r3, r3, #4
     7b4:	ee033f10 	mcr	15, 0, r3, cr3, cr0, {0}
     ... no mrc/mcr usage
     7cc:	ebfffffe 	bl	0 <arm_copy_from_user>

Here, we have in probe_kernel_address() in do_alignment():
 - @770 : r4 = DACR
 - @794 : DACR = r4 | 0x30
 - @7b4 : DACR = (r4 & 0x0c) | 0x04 => the 0x30 is lost !!!

I'll send my patch to the mailing list tomorrow, as well as the other one to
align the __bug_table session.

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-10 19:01                           ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-10 19:01 UTC (permalink / raw)
  To: linux-arm-kernel

Robert Jarzmik <robert.jarzmik@free.fr> writes:

> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>
>> On Tue, Sep 08, 2015 at 07:01:00PM +0200, Robert Jarzmik wrote:
>>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> At the point we call into this code, the DACR should be 0x75, which
>> should allow us to read the instruction at 0xbf00202c.  But this is
>> failing with a permission error - which it would do if it thought
>> the kernel domain was in manager mode (iow, 0x55).
>
> Okay Russell, I have a good idea what's happening now. Basically, it boils down
> to compiler optimization of get_domain() which is called twice (set_fs() ->
> modify_domain() -> get_domain()). See the piece in [1] for a more complete
> explanation.
>
> I still haven't finished my work, as I need to disassemble the do_alignment()
And I have the proof of gcc optimization, which I'll add to the commit message
if you want :
00000728 <do_alignment>:
     ...
     770:	ee134f10 	mrc	15, 0, r4, cr3, cr0, {0}
     ... no r4 or mrc/mcr usage
     788:	e3842030 	orr	r2, r4, #48	; 0x30
     ... no r2/r4 or mrc/mcr usage
     794:	ee032f10 	mcr	15, 0, r2, cr3, cr0, {0}
     798:	ee07cf95 	mcr	15, 0, ip, cr7, cr5, {4}
     ... no r4 or mrc/mcr usage
     7ac:	e3c4300c 	bic	r3, r4, #12
     7b0:	e3833004 	orr	r3, r3, #4
     7b4:	ee033f10 	mcr	15, 0, r3, cr3, cr0, {0}
     ... no mrc/mcr usage
     7cc:	ebfffffe 	bl	0 <arm_copy_from_user>

Here, we have in probe_kernel_address() in do_alignment():
 - @770 : r4 = DACR
 - @794 : DACR = r4 | 0x30
 - @7b4 : DACR = (r4 & 0x0c) | 0x04 => the 0x30 is lost !!!

I'll send my patch to the mailing list tomorrow, as well as the other one to
align the __bug_table session.

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-10 19:01                           ` Robert Jarzmik
@ 2015-09-10 19:16                             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-10 19:16 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Thu, Sep 10, 2015 at 09:01:41PM +0200, Robert Jarzmik wrote:
> And I have the proof of gcc optimization, which I'll add to the commit message
> if you want :
> 00000728 <do_alignment>:
>      ...
>      770:	ee134f10 	mrc	15, 0, r4, cr3, cr0, {0}
>      ... no r4 or mrc/mcr usage
>      788:	e3842030 	orr	r2, r4, #48	; 0x30
>      ... no r2/r4 or mrc/mcr usage
>      794:	ee032f10 	mcr	15, 0, r2, cr3, cr0, {0}
>      798:	ee07cf95 	mcr	15, 0, ip, cr7, cr5, {4}
>      ... no r4 or mrc/mcr usage
>      7ac:	e3c4300c 	bic	r3, r4, #12
>      7b0:	e3833004 	orr	r3, r3, #4
>      7b4:	ee033f10 	mcr	15, 0, r3, cr3, cr0, {0}
>      ... no mrc/mcr usage
>      7cc:	ebfffffe 	bl	0 <arm_copy_from_user>
> 
> Here, we have in probe_kernel_address() in do_alignment():
>  - @770 : r4 = DACR
>  - @794 : DACR = r4 | 0x30
>  - @7b4 : DACR = (r4 & 0x0c) | 0x04 => the 0x30 is lost !!!
> 
> I'll send my patch to the mailing list tomorrow, as well as the other one to
> align the __bug_table session.

I've been wondering whether we can teach GCC that set_domain modifies
the value that get_domain returns, rather than throwing a volatile
onto the asm in get_domain.  The issue with a volatile there is that
even if the result is unused, but the code is reachable, gcc still has
to output the code to read the register.

We might be able to get away with a memory clobber on the set_domain,
and fake a memory read in get_domain, eg, by passing
	"m" (current_thread_info()->cpu_domain))
to the get_domain asm.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-10 19:16                             ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-10 19:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 10, 2015 at 09:01:41PM +0200, Robert Jarzmik wrote:
> And I have the proof of gcc optimization, which I'll add to the commit message
> if you want :
> 00000728 <do_alignment>:
>      ...
>      770:	ee134f10 	mrc	15, 0, r4, cr3, cr0, {0}
>      ... no r4 or mrc/mcr usage
>      788:	e3842030 	orr	r2, r4, #48	; 0x30
>      ... no r2/r4 or mrc/mcr usage
>      794:	ee032f10 	mcr	15, 0, r2, cr3, cr0, {0}
>      798:	ee07cf95 	mcr	15, 0, ip, cr7, cr5, {4}
>      ... no r4 or mrc/mcr usage
>      7ac:	e3c4300c 	bic	r3, r4, #12
>      7b0:	e3833004 	orr	r3, r3, #4
>      7b4:	ee033f10 	mcr	15, 0, r3, cr3, cr0, {0}
>      ... no mrc/mcr usage
>      7cc:	ebfffffe 	bl	0 <arm_copy_from_user>
> 
> Here, we have in probe_kernel_address() in do_alignment():
>  - @770 : r4 = DACR
>  - @794 : DACR = r4 | 0x30
>  - @7b4 : DACR = (r4 & 0x0c) | 0x04 => the 0x30 is lost !!!
> 
> I'll send my patch to the mailing list tomorrow, as well as the other one to
> align the __bug_table session.

I've been wondering whether we can teach GCC that set_domain modifies
the value that get_domain returns, rather than throwing a volatile
onto the asm in get_domain.  The issue with a volatile there is that
even if the result is unused, but the code is reachable, gcc still has
to output the code to read the register.

We might be able to get away with a memory clobber on the set_domain,
and fake a memory read in get_domain, eg, by passing
	"m" (current_thread_info()->cpu_domain))
to the get_domain asm.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-10 19:16                             ` Russell King - ARM Linux
@ 2015-09-10 20:53                               ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-10 20:53 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> I've been wondering whether we can teach GCC that set_domain modifies
> the value that get_domain returns, rather than throwing a volatile
> onto the asm in get_domain.  The issue with a volatile there is that
> even if the result is unused, but the code is reachable, gcc still has
> to output the code to read the register.
>
> We might be able to get away with a memory clobber on the set_domain,
> and fake a memory read in get_domain, eg, by passing
> 	"m" (current_thread_info()->cpu_domain))
> to the get_domain asm.
Ok, let's say we do it that way.

I have some concerns about it:
  (a) I see an inbalance, as set_domain() doesn't actually modify
      current_thread_info()->cpu_domain. I don't see how it will protect use
      from this scenario :
        - get_domain()
        - set_domain()
        - set_domain()

  (b) domain.h is included from thread_info.h, not the other way around
      => current_thread_info() is not accessible from domain.h
      => that would require a bit of moving things around, as thread_info
         structure description should be available for example.
      This is currently my biggest problem with this approach.

  (c) I was also wondering if a case like this could happen :
     - a function foo() does a get_domain()
       => an exception/irq whatever happens and modifies the DACR
     - foo() continues a makes a modify_domain()
       => and here the modify_domain() uses the old DACR value
      Or said differently, I wonder if there is a case of 2 get_domain() calls
      in a row with a DACR modification in between. I

What about something such as [1], without a memory clobber, but a "fake" memory
variable link ?

Cheers.

--
Robert

[1] get_domain() / set_domain() link
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..fc1d9c43aa08 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -83,13 +83,17 @@
 
 #ifndef __ASSEMBLY__
 
+static int domain_barrier;
+/*
+ * how to get the current stack pointer in C
+ */
 static inline unsigned int get_domain(void)
 {
        unsigned int domain;
 
        asm(
        "mrc    p15, 0, %0, c3, c0      @ get domain"
-        : "=r" (domain));
+        : "=r" (domain), "=m" (domain_barrier));
 
        return domain;
 }
@@ -97,8 +101,8 @@ static inline unsigned int get_domain(void)
 static inline void set_domain(unsigned val)
 {
        asm volatile(
-       "mcr    p15, 0, %0, c3, c0      @ set domain"
-         : : "r" (val));
+       "mcr    p15, 0, %1, c3, c0      @ set domain"
+       : "=m" (domain_barrier) : "r" (val));
        isb();
 }

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-10 20:53                               ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-10 20:53 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> I've been wondering whether we can teach GCC that set_domain modifies
> the value that get_domain returns, rather than throwing a volatile
> onto the asm in get_domain.  The issue with a volatile there is that
> even if the result is unused, but the code is reachable, gcc still has
> to output the code to read the register.
>
> We might be able to get away with a memory clobber on the set_domain,
> and fake a memory read in get_domain, eg, by passing
> 	"m" (current_thread_info()->cpu_domain))
> to the get_domain asm.
Ok, let's say we do it that way.

I have some concerns about it:
  (a) I see an inbalance, as set_domain() doesn't actually modify
      current_thread_info()->cpu_domain. I don't see how it will protect use
      from this scenario :
        - get_domain()
        - set_domain()
        - set_domain()

  (b) domain.h is included from thread_info.h, not the other way around
      => current_thread_info() is not accessible from domain.h
      => that would require a bit of moving things around, as thread_info
         structure description should be available for example.
      This is currently my biggest problem with this approach.

  (c) I was also wondering if a case like this could happen :
     - a function foo() does a get_domain()
       => an exception/irq whatever happens and modifies the DACR
     - foo() continues a makes a modify_domain()
       => and here the modify_domain() uses the old DACR value
      Or said differently, I wonder if there is a case of 2 get_domain() calls
      in a row with a DACR modification in between. I

What about something such as [1], without a memory clobber, but a "fake" memory
variable link ?

Cheers.

--
Robert

[1] get_domain() / set_domain() link
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..fc1d9c43aa08 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -83,13 +83,17 @@
 
 #ifndef __ASSEMBLY__
 
+static int domain_barrier;
+/*
+ * how to get the current stack pointer in C
+ */
 static inline unsigned int get_domain(void)
 {
        unsigned int domain;
 
        asm(
        "mrc    p15, 0, %0, c3, c0      @ get domain"
-        : "=r" (domain));
+        : "=r" (domain), "=m" (domain_barrier));
 
        return domain;
 }
@@ -97,8 +101,8 @@ static inline unsigned int get_domain(void)
 static inline void set_domain(unsigned val)
 {
        asm volatile(
-       "mcr    p15, 0, %0, c3, c0      @ set domain"
-         : : "r" (val));
+       "mcr    p15, 0, %1, c3, c0      @ set domain"
+       : "=m" (domain_barrier) : "r" (val));
        isb();
 }

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] ARM: fix alignement of __bug_table section entries
  2015-09-10 20:53                               ` Robert Jarzmik
@ 2015-09-11  9:54                                 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-11  9:54 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Thu, Sep 10, 2015 at 10:53:43PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > I've been wondering whether we can teach GCC that set_domain modifies
> > the value that get_domain returns, rather than throwing a volatile
> > onto the asm in get_domain.  The issue with a volatile there is that
> > even if the result is unused, but the code is reachable, gcc still has
> > to output the code to read the register.
> >
> > We might be able to get away with a memory clobber on the set_domain,
> > and fake a memory read in get_domain, eg, by passing
> > 	"m" (current_thread_info()->cpu_domain))
> > to the get_domain asm.
> Ok, let's say we do it that way.
> 
> I have some concerns about it:
>   (a) I see an inbalance, as set_domain() doesn't actually modify
>       current_thread_info()->cpu_domain. I don't see how it will protect use
>       from this scenario :
>         - get_domain()
>         - set_domain()
>         - set_domain()

That should be fine, because if you've only got one get_domain(), then
you only get the value of the DACR once.

>   (b) domain.h is included from thread_info.h, not the other way around
>       => current_thread_info() is not accessible from domain.h
>       => that would require a bit of moving things around, as thread_info
>          structure description should be available for example.
>       This is currently my biggest problem with this approach.

It's not a problem since 1eef5d2f1b46 removed the need for domain.h to be
included by thread_info.h - the existing include can be dropped.

>   (c) I was also wondering if a case like this could happen :
>      - a function foo() does a get_domain()
>        => an exception/irq whatever happens and modifies the DACR

We always preserve the value of DACR across an exception.

>      - foo() continues a makes a modify_domain()
>        => and here the modify_domain() uses the old DACR value
>       Or said differently, I wonder if there is a case of 2 get_domain() calls
>       in a row with a DACR modification in between. I
> 
> What about something such as [1], without a memory clobber, but a "fake" memory
> variable link ?

The problem is the compiler will need to issue instructions to arrange
for the address of this variable to end up in registers even though the
assembly doesn't use it.

That's true of my suggestion as well, but looking at the callsites, we
generally already have, or very shortly there-after have the current
thread_info address in a register.

Patches to follow - I've not been able to confirm the instruction ordering
you've observed with my compiler, so I can't prove whether this solves
the problem locally.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] ARM: fix alignement of __bug_table section entries
@ 2015-09-11  9:54                                 ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-11  9:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 10, 2015 at 10:53:43PM +0200, Robert Jarzmik wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > I've been wondering whether we can teach GCC that set_domain modifies
> > the value that get_domain returns, rather than throwing a volatile
> > onto the asm in get_domain.  The issue with a volatile there is that
> > even if the result is unused, but the code is reachable, gcc still has
> > to output the code to read the register.
> >
> > We might be able to get away with a memory clobber on the set_domain,
> > and fake a memory read in get_domain, eg, by passing
> > 	"m" (current_thread_info()->cpu_domain))
> > to the get_domain asm.
> Ok, let's say we do it that way.
> 
> I have some concerns about it:
>   (a) I see an inbalance, as set_domain() doesn't actually modify
>       current_thread_info()->cpu_domain. I don't see how it will protect use
>       from this scenario :
>         - get_domain()
>         - set_domain()
>         - set_domain()

That should be fine, because if you've only got one get_domain(), then
you only get the value of the DACR once.

>   (b) domain.h is included from thread_info.h, not the other way around
>       => current_thread_info() is not accessible from domain.h
>       => that would require a bit of moving things around, as thread_info
>          structure description should be available for example.
>       This is currently my biggest problem with this approach.

It's not a problem since 1eef5d2f1b46 removed the need for domain.h to be
included by thread_info.h - the existing include can be dropped.

>   (c) I was also wondering if a case like this could happen :
>      - a function foo() does a get_domain()
>        => an exception/irq whatever happens and modifies the DACR

We always preserve the value of DACR across an exception.

>      - foo() continues a makes a modify_domain()
>        => and here the modify_domain() uses the old DACR value
>       Or said differently, I wonder if there is a case of 2 get_domain() calls
>       in a row with a DACR modification in between. I
> 
> What about something such as [1], without a memory clobber, but a "fake" memory
> variable link ?

The problem is the compiler will need to issue instructions to arrange
for the address of this variable to end up in registers even though the
assembly doesn't use it.

That's true of my suggestion as well, but looking at the callsites, we
generally already have, or very shortly there-after have the current
thread_info address in a register.

Patches to follow - I've not been able to confirm the instruction ordering
you've observed with my compiler, so I can't prove whether this solves
the problem locally.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] ARM: domains: thread_info.h no longer needs asm/domains.h
  2015-09-11  9:54                                 ` Russell King - ARM Linux
@ 2015-09-11  9:56                                   ` Russell King
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King @ 2015-09-11  9:56 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

As of 1eef5d2f1b46 ("ARM: domains: switch to keeping domain value in
register") we no longer need to include asm/domains.h into
asm/thread_info.h.  Remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/thread_info.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index d0a1119dcaf3..776757d1604a 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -25,7 +25,6 @@
 struct task_struct;
 
 #include <asm/types.h>
-#include <asm/domain.h>
 
 typedef unsigned long mm_segment_t;
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 1/2] ARM: domains: thread_info.h no longer needs asm/domains.h
@ 2015-09-11  9:56                                   ` Russell King
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King @ 2015-09-11  9:56 UTC (permalink / raw)
  To: linux-arm-kernel

As of 1eef5d2f1b46 ("ARM: domains: switch to keeping domain value in
register") we no longer need to include asm/domains.h into
asm/thread_info.h.  Remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/thread_info.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index d0a1119dcaf3..776757d1604a 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -25,7 +25,6 @@
 struct task_struct;
 
 #include <asm/types.h>
-#include <asm/domain.h>
 
 typedef unsigned long mm_segment_t;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
  2015-09-11  9:54                                 ` Russell King - ARM Linux
@ 2015-09-11  9:56                                   ` Russell King
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King @ 2015-09-11  9:56 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

We need to have memory dependencies on get_domain/set_domain to avoid
the compiler over-optimising these inline assembly instructions.

Loads/stores must not be reordered across a set_domain(), so introduce
a compiler barrier for that assembly.

The value of get_domain() must not be cached across a set_domain(), but
we still want to allow the compiler to optimise it away.  Introduce a
dependency on current_thread_info()->cpu_domain to avoid this; the new
memory clobber in set_domain() should therefore cause the compiler to
re-load this.  The other advantage of using this is we should have its
address in the register set already, or very soon after at most call
sites.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..fc8ba1663601 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -12,6 +12,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/barrier.h>
+#include <asm/thread_info.h>
 #endif
 
 /*
@@ -89,7 +90,8 @@ static inline unsigned int get_domain(void)
 
 	asm(
 	"mrc	p15, 0, %0, c3, c0	@ get domain"
-	 : "=r" (domain));
+	 : "=r" (domain)
+	 : "m" (current_thread_info()->cpu_domain));
 
 	return domain;
 }
@@ -98,7 +100,7 @@ static inline void set_domain(unsigned val)
 {
 	asm volatile(
 	"mcr	p15, 0, %0, c3, c0	@ set domain"
-	  : : "r" (val));
+	  : : "r" (val) : "memory");
 	isb();
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
@ 2015-09-11  9:56                                   ` Russell King
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King @ 2015-09-11  9:56 UTC (permalink / raw)
  To: linux-arm-kernel

We need to have memory dependencies on get_domain/set_domain to avoid
the compiler over-optimising these inline assembly instructions.

Loads/stores must not be reordered across a set_domain(), so introduce
a compiler barrier for that assembly.

The value of get_domain() must not be cached across a set_domain(), but
we still want to allow the compiler to optimise it away.  Introduce a
dependency on current_thread_info()->cpu_domain to avoid this; the new
memory clobber in set_domain() should therefore cause the compiler to
re-load this.  The other advantage of using this is we should have its
address in the register set already, or very soon after at most call
sites.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/domain.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index e878129f2fee..fc8ba1663601 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -12,6 +12,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/barrier.h>
+#include <asm/thread_info.h>
 #endif
 
 /*
@@ -89,7 +90,8 @@ static inline unsigned int get_domain(void)
 
 	asm(
 	"mrc	p15, 0, %0, c3, c0	@ get domain"
-	 : "=r" (domain));
+	 : "=r" (domain)
+	 : "m" (current_thread_info()->cpu_domain));
 
 	return domain;
 }
@@ -98,7 +100,7 @@ static inline void set_domain(unsigned val)
 {
 	asm volatile(
 	"mcr	p15, 0, %0, c3, c0	@ set domain"
-	  : : "r" (val));
+	  : : "r" (val) : "memory");
 	isb();
 }
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
  2015-09-11  9:56                                   ` Russell King
@ 2015-09-11 14:56                                     ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-11 14:56 UTC (permalink / raw)
  To: Russell King; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King <rmk+kernel@arm.linux.org.uk> writes:

> We need to have memory dependencies on get_domain/set_domain to avoid
> the compiler over-optimising these inline assembly instructions.
>
> Loads/stores must not be reordered across a set_domain(), so introduce
> a compiler barrier for that assembly.
>
> The value of get_domain() must not be cached across a set_domain(), but
> we still want to allow the compiler to optimise it away.  Introduce a
> dependency on current_thread_info()->cpu_domain to avoid this; the new
> memory clobber in set_domain() should therefore cause the compiler to
> re-load this.  The other advantage of using this is we should have its
> address in the register set already, or very soon after at most call
> sites.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>

The test is failing without these 2 patches, while with them, an unaligned
access is fixed, and the generated code looks good :
     7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
     7b0:       e3cc300c        bic     r3, ip, #12
     7b4:       e58dc014        str     ip, [sp, #20]
     7b8:       e3833004        orr     r3, r3, #4
     7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
@ 2015-09-11 14:56                                     ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-11 14:56 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King <rmk+kernel@arm.linux.org.uk> writes:

> We need to have memory dependencies on get_domain/set_domain to avoid
> the compiler over-optimising these inline assembly instructions.
>
> Loads/stores must not be reordered across a set_domain(), so introduce
> a compiler barrier for that assembly.
>
> The value of get_domain() must not be cached across a set_domain(), but
> we still want to allow the compiler to optimise it away.  Introduce a
> dependency on current_thread_info()->cpu_domain to avoid this; the new
> memory clobber in set_domain() should therefore cause the compiler to
> re-load this.  The other advantage of using this is we should have its
> address in the register set already, or very soon after at most call
> sites.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>

The test is failing without these 2 patches, while with them, an unaligned
access is fixed, and the generated code looks good :
     7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
     7b0:       e3cc300c        bic     r3, ip, #12
     7b4:       e58dc014        str     ip, [sp, #20]
     7b8:       e3833004        orr     r3, r3, #4
     7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
  2015-09-11 14:56                                     ` Robert Jarzmik
@ 2015-09-11 15:10                                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-11 15:10 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

On Fri, Sep 11, 2015 at 04:56:27PM +0200, Robert Jarzmik wrote:
> Russell King <rmk+kernel@arm.linux.org.uk> writes:
> > We need to have memory dependencies on get_domain/set_domain to avoid
> > the compiler over-optimising these inline assembly instructions.
> >
> > Loads/stores must not be reordered across a set_domain(), so introduce
> > a compiler barrier for that assembly.
> >
> > The value of get_domain() must not be cached across a set_domain(), but
> > we still want to allow the compiler to optimise it away.  Introduce a
> > dependency on current_thread_info()->cpu_domain to avoid this; the new
> > memory clobber in set_domain() should therefore cause the compiler to
> > re-load this.  The other advantage of using this is we should have its
> > address in the register set already, or very soon after at most call
> > sites.
> >
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
> 
> The test is failing without these 2 patches, while with them, an unaligned
> access is fixed, and the generated code looks good :
>      7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
>      7b0:       e3cc300c        bic     r3, ip, #12
>      7b4:       e58dc014        str     ip, [sp, #20]
>      7b8:       e3833004        orr     r3, r3, #4
>      7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Thanks Robert, I've queued these as fixes now.

If you want to put the bug alignment patch in the patch system, I'll get
that off to Linus this weekend too.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
@ 2015-09-11 15:10                                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 48+ messages in thread
From: Russell King - ARM Linux @ 2015-09-11 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 11, 2015 at 04:56:27PM +0200, Robert Jarzmik wrote:
> Russell King <rmk+kernel@arm.linux.org.uk> writes:
> > We need to have memory dependencies on get_domain/set_domain to avoid
> > the compiler over-optimising these inline assembly instructions.
> >
> > Loads/stores must not be reordered across a set_domain(), so introduce
> > a compiler barrier for that assembly.
> >
> > The value of get_domain() must not be cached across a set_domain(), but
> > we still want to allow the compiler to optimise it away.  Introduce a
> > dependency on current_thread_info()->cpu_domain to avoid this; the new
> > memory clobber in set_domain() should therefore cause the compiler to
> > re-load this.  The other advantage of using this is we should have its
> > address in the register set already, or very soon after at most call
> > sites.
> >
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
> 
> The test is failing without these 2 patches, while with them, an unaligned
> access is fixed, and the generated code looks good :
>      7ac:       ee13cf10        mrc     15, 0, ip, cr3, cr0, {0}
>      7b0:       e3cc300c        bic     r3, ip, #12
>      7b4:       e58dc014        str     ip, [sp, #20]
>      7b8:       e3833004        orr     r3, r3, #4
>      7bc:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}

Thanks Robert, I've queued these as fixes now.

If you want to put the bug alignment patch in the patch system, I'll get
that off to Linus this weekend too.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
  2015-09-11 15:10                                       ` Russell King - ARM Linux
@ 2015-09-11 15:40                                         ` Robert Jarzmik
  -1 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-11 15:40 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Dave Martin, linux-kernel, linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> If you want to put the bug alignment patch in the patch system, I'll get
> that off to Linus this weekend too.
Sure, that will be done before tomorrow evening.

Cheers.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain
@ 2015-09-11 15:40                                         ` Robert Jarzmik
  0 siblings, 0 replies; 48+ messages in thread
From: Robert Jarzmik @ 2015-09-11 15:40 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> If you want to put the bug alignment patch in the patch system, I'll get
> that off to Linus this weekend too.
Sure, that will be done before tomorrow evening.

Cheers.

-- 
Robert

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2015-09-11 15:45 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-02  6:23 [PATCH] ARM: fix alignement of __bug_table section entries Robert Jarzmik
2015-09-02  6:23 ` Robert Jarzmik
2015-09-02 10:39 ` Dave Martin
2015-09-02 10:39   ` Dave Martin
2015-09-05 13:48   ` Robert Jarzmik
2015-09-05 13:48     ` Robert Jarzmik
2015-09-05 14:25     ` Russell King - ARM Linux
2015-09-05 14:25       ` Russell King - ARM Linux
2015-09-05 17:10       ` Robert Jarzmik
2015-09-05 17:10         ` Robert Jarzmik
2015-09-05 20:38         ` Russell King - ARM Linux
2015-09-05 20:38           ` Russell King - ARM Linux
2015-09-05 22:12           ` Robert Jarzmik
2015-09-05 22:12             ` Robert Jarzmik
2015-09-06 17:25           ` Robert Jarzmik
2015-09-06 17:25             ` Robert Jarzmik
2015-09-06 19:48             ` Russell King - ARM Linux
2015-09-06 19:48               ` Russell King - ARM Linux
2015-09-06 21:31               ` Robert Jarzmik
2015-09-06 21:31                 ` Robert Jarzmik
2015-09-06 23:54                 ` Russell King - ARM Linux
2015-09-06 23:54                   ` Russell King - ARM Linux
2015-09-08 17:01                   ` Robert Jarzmik
2015-09-08 17:01                     ` Robert Jarzmik
2015-09-08 20:08                     ` Russell King - ARM Linux
2015-09-08 20:08                       ` Russell King - ARM Linux
2015-09-08 20:46                       ` Robert Jarzmik
2015-09-08 20:46                         ` Robert Jarzmik
2015-09-09 23:06                       ` Robert Jarzmik
2015-09-09 23:06                         ` Robert Jarzmik
2015-09-10 19:01                         ` Robert Jarzmik
2015-09-10 19:01                           ` Robert Jarzmik
2015-09-10 19:16                           ` Russell King - ARM Linux
2015-09-10 19:16                             ` Russell King - ARM Linux
2015-09-10 20:53                             ` Robert Jarzmik
2015-09-10 20:53                               ` Robert Jarzmik
2015-09-11  9:54                               ` Russell King - ARM Linux
2015-09-11  9:54                                 ` Russell King - ARM Linux
2015-09-11  9:56                                 ` [PATCH 1/2] ARM: domains: thread_info.h no longer needs asm/domains.h Russell King
2015-09-11  9:56                                   ` Russell King
2015-09-11  9:56                                 ` [PATCH 2/2] ARM: domains: add memory dependencies to get_domain/set_domain Russell King
2015-09-11  9:56                                   ` Russell King
2015-09-11 14:56                                   ` Robert Jarzmik
2015-09-11 14:56                                     ` Robert Jarzmik
2015-09-11 15:10                                     ` Russell King - ARM Linux
2015-09-11 15:10                                       ` Russell King - ARM Linux
2015-09-11 15:40                                       ` Robert Jarzmik
2015-09-11 15:40                                         ` Robert Jarzmik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.