linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
@ 2021-01-09 17:11 Alexander Lobakin
  2021-01-09 17:50 ` Nick Desaulniers
  2021-05-07  7:47 ` Nathan Chancellor
  0 siblings, 2 replies; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-09 17:11 UTC (permalink / raw)
  To: clang-built-linux, linux-mips
  Cc: Alexander Lobakin, Thomas Bogendoerfer, Kees Cook,
	Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen,
	Ralf Baechle, linux-kernel, linux-arch

Machine: MIPS32 R2 Big Endian (interAptiv (multi))

While testing MIPS with LLVM, I found a weird and very rare bug with
MIPS relocs that LLVM emits into kernel modules. It happens on both
11.0.0 and latest git snapshot and applies, as I can see, only to
references to static symbols.

When the kernel loads the module, it allocates a space for every
section and then manually apply the relocations relative to the
new address.

Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
It's static and referenced only in phy_register_driver(), where it's
used to fill callback pointer in a structure.

The real function address after module loading is 0xc06c1444, that
is observed in its ELF st_value field.
There are two relocs related to this usage in phy_register_driver():

R_MIPS_HI16 refers to 0x3c010000
R_MIPS_LO16 refers to 0x24339444

The address of .text is 0xc06b8000. So the destination is calculated
as follows:

0x00000000 from hi16;
0xffff9444 from lo16 (sign extend as it's always treated as signed);
0xc06b8000 from base.

= 0xc06b1444. The value is lower than the real phy_probe() address
(0xc06c1444) by 0x10000 and is lower than the base address of
module's .text, so it's 100% incorrect.

This results in:

[    2.204022] CPU 3 Unable to handle kernel paging request at virtual
address c06b1444, epc == c06b1444, ra == 803f1090

The correct instructions should be:

R_MIPS_HI16 0x3c010001
R_MIPS_LO16 0x24339444

so there'll be 0x00010000 from hi16.

I tried to catch those bugs in arch/mips/kernel/module.c (by checking
if the destination is lower than the base address, which should never
happen), and seems like I have only 3 such places in libphy.ko (and
one in nf_tables.ko).
I don't think it should be handled somehow in mentioned source code
as it would look rather ugly and may break kernels build with GNU
stack, which seems to not produce such bad codes.

If I should report this to any other resources, please let me know.
I chose clang-built-linux and LKML as it may not happen with userland
(didn't tried to catch).

Thanks,
Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin
@ 2021-01-09 17:50 ` Nick Desaulniers
  2021-01-09 19:15   ` Alexander Lobakin
  2021-05-07  7:47 ` Nathan Chancellor
  1 sibling, 1 reply; 11+ messages in thread
From: Nick Desaulniers @ 2021-01-09 17:50 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook,
	Nathan Chancellor, Fangrui Song, Sami Tolvanen, Ralf Baechle,
	LKML, linux-arch

On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
>
> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>
> While testing MIPS with LLVM, I found a weird and very rare bug with
> MIPS relocs that LLVM emits into kernel modules. It happens on both
> 11.0.0 and latest git snapshot and applies, as I can see, only to
> references to static symbols.
>
> When the kernel loads the module, it allocates a space for every
> section and then manually apply the relocations relative to the
> new address.
>
> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
> It's static and referenced only in phy_register_driver(), where it's
> used to fill callback pointer in a structure.
>
> The real function address after module loading is 0xc06c1444, that
> is observed in its ELF st_value field.
> There are two relocs related to this usage in phy_register_driver():
>
> R_MIPS_HI16 refers to 0x3c010000
> R_MIPS_LO16 refers to 0x24339444
>
> The address of .text is 0xc06b8000. So the destination is calculated
> as follows:
>
> 0x00000000 from hi16;
> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
> 0xc06b8000 from base.
>
> = 0xc06b1444. The value is lower than the real phy_probe() address
> (0xc06c1444) by 0x10000 and is lower than the base address of
> module's .text, so it's 100% incorrect.
>
> This results in:
>
> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
> address c06b1444, epc == c06b1444, ra == 803f1090
>
> The correct instructions should be:
>
> R_MIPS_HI16 0x3c010001
> R_MIPS_LO16 0x24339444
>
> so there'll be 0x00010000 from hi16.
>
> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
> if the destination is lower than the base address, which should never
> happen), and seems like I have only 3 such places in libphy.ko (and
> one in nf_tables.ko).
> I don't think it should be handled somehow in mentioned source code
> as it would look rather ugly and may break kernels build with GNU
> stack, which seems to not produce such bad codes.
>
> If I should report this to any other resources, please let me know.
> I chose clang-built-linux and LKML as it may not happen with userland
> (didn't tried to catch).

Thanks for the report.  Sounds like we may indeed be producing an
incorrect relocation.  This is only seen for big endian triples?

Getting a way for us to deterministically reproduce would be a good
first step.  Which config or configs beyond defconfig, and which
relocations specifically are you observing this with?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 17:50 ` Nick Desaulniers
@ 2021-01-09 19:15   ` Alexander Lobakin
  2021-01-09 23:29     ` Alexander Lobakin
  2021-01-11 20:03     ` Nick Desaulniers
  0 siblings, 2 replies; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-09 19:15 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Alexander Lobakin, clang-built-linux, linux-mips,
	Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song,
	Sami Tolvanen, Ralf Baechle, LKML, linux-arch

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Sat, 9 Jan 2021 09:50:44 -0800

> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
>>
>> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>>
>> While testing MIPS with LLVM, I found a weird and very rare bug with
>> MIPS relocs that LLVM emits into kernel modules. It happens on both
>> 11.0.0 and latest git snapshot and applies, as I can see, only to
>> references to static symbols.
>>
>> When the kernel loads the module, it allocates a space for every
>> section and then manually apply the relocations relative to the
>> new address.
>>
>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
>> It's static and referenced only in phy_register_driver(), where it's
>> used to fill callback pointer in a structure.
>>
>> The real function address after module loading is 0xc06c1444, that
>> is observed in its ELF st_value field.
>> There are two relocs related to this usage in phy_register_driver():
>>
>> R_MIPS_HI16 refers to 0x3c010000
>> R_MIPS_LO16 refers to 0x24339444
>>
>> The address of .text is 0xc06b8000. So the destination is calculated
>> as follows:
>>
>> 0x00000000 from hi16;
>> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
>> 0xc06b8000 from base.
>>
>> = 0xc06b1444. The value is lower than the real phy_probe() address
>> (0xc06c1444) by 0x10000 and is lower than the base address of
>> module's .text, so it's 100% incorrect.
>>
>> This results in:
>>
>> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
>> address c06b1444, epc == c06b1444, ra == 803f1090
>>
>> The correct instructions should be:
>>
>> R_MIPS_HI16 0x3c010001
>> R_MIPS_LO16 0x24339444
>>
>> so there'll be 0x00010000 from hi16.
>>
>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
>> if the destination is lower than the base address, which should never
>> happen), and seems like I have only 3 such places in libphy.ko (and
>> one in nf_tables.ko).
>> I don't think it should be handled somehow in mentioned source code
>> as it would look rather ugly and may break kernels build with GNU
>> stack, which seems to not produce such bad codes.
>>
>> If I should report this to any other resources, please let me know.
>> I chose clang-built-linux and LKML as it may not happen with userland
>> (didn't tried to catch).
>
> Thanks for the report.  Sounds like we may indeed be producing an
> incorrect relocation.  This is only seen for big endian triples?

Unfortunately I don't have a LE board to play with, so can confirm
only Big Endian.

(BTW, if someone can say if it's possible for MIPS (and how if it is)
to launch a LE kernel from BE-booted preloader and U-Boot, that would
be super cool)

> Getting a way for us to deterministically reproduce would be a good
> first step.  Which config or configs beyond defconfig, and which
> relocations specifically are you observing this with?

I use `make 32r2_defconfig` which combines several configs from
arch/mips/configs:
 - generic_defconfig;
 - generic/32r2.config;
 - generic/eb.config.

Aside from that, I enable a bunch of my WIP drivers and the
Netfilter. On my setup, this bug is always present in libphy.ko,
so CONFIG_PHYLIB=m (with all deps) should be enough.

The three failed relocs belongs to this part of code: [0]

llvm-readelf on them:

Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:¬
[...]
00005740  00029305 R_MIPS_HI16            00000000   .text
00005744  00029306 R_MIPS_LO16            00000000   .text
00005720  00029305 R_MIPS_HI16            00000000   .text
00005748  00029306 R_MIPS_LO16            00000000   .text
0000573c  00029305 R_MIPS_HI16            00000000   .text
0000574c  00029306 R_MIPS_LO16            00000000   .text

The first pair is the one from my first mail:
0x3c010000 <-- should be 0x3c010001 to work properly
0x24339444

I'm planning to hunt for more now, will let you know.

[0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989

> Thanks,
> ~Nick Desaulniers

Thanks,
Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 19:15   ` Alexander Lobakin
@ 2021-01-09 23:29     ` Alexander Lobakin
  2021-01-10  1:08       ` Alexander Lobakin
  2021-01-11 20:03     ` Nick Desaulniers
  1 sibling, 1 reply; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-09 23:29 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Alexander Lobakin, clang-built-linux, linux-mips,
	Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song,
	Sami Tolvanen, Ralf Baechle, LKML, linux-arch

From: Alexander Lobakin <alobakin@pm.me>
Date: Sat, 09 Jan 2021 19:15:31 +0000

> From: Nick Desaulniers <ndesaulniers@google.com>
> Date: Sat, 9 Jan 2021 09:50:44 -0800
>
>> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
>>>
>>> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>>>
>>> While testing MIPS with LLVM, I found a weird and very rare bug with
>>> MIPS relocs that LLVM emits into kernel modules. It happens on both
>>> 11.0.0 and latest git snapshot and applies, as I can see, only to
>>> references to static symbols.
>>>
>>> When the kernel loads the module, it allocates a space for every
>>> section and then manually apply the relocations relative to the
>>> new address.
>>>
>>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
>>> It's static and referenced only in phy_register_driver(), where it's
>>> used to fill callback pointer in a structure.
>>>
>>> The real function address after module loading is 0xc06c1444, that
>>> is observed in its ELF st_value field.
>>> There are two relocs related to this usage in phy_register_driver():
>>>
>>> R_MIPS_HI16 refers to 0x3c010000
>>> R_MIPS_LO16 refers to 0x24339444
>>>
>>> The address of .text is 0xc06b8000. So the destination is calculated
>>> as follows:
>>>
>>> 0x00000000 from hi16;
>>> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
>>> 0xc06b8000 from base.
>>>
>>> = 0xc06b1444. The value is lower than the real phy_probe() address
>>> (0xc06c1444) by 0x10000 and is lower than the base address of
>>> module's .text, so it's 100% incorrect.
>>>
>>> This results in:
>>>
>>> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
>>> address c06b1444, epc =3D=3D c06b1444, ra =3D=3D 803f1090
>>>
>>> The correct instructions should be:
>>>
>>> R_MIPS_HI16 0x3c010001
>>> R_MIPS_LO16 0x24339444
>>>
>>> so there'll be 0x00010000 from hi16.
>>>
>>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
>>> if the destination is lower than the base address, which should never
>>> happen), and seems like I have only 3 such places in libphy.ko (and
>>> one in nf_tables.ko).
>>> I don't think it should be handled somehow in mentioned source code
>>> as it would look rather ugly and may break kernels build with GNU
>>> stack, which seems to not produce such bad codes.
>>>
>>> If I should report this to any other resources, please let me know.
>>> I chose clang-built-linux and LKML as it may not happen with userland
>>> (didn't tried to catch).
>>
>> Thanks for the report.  Sounds like we may indeed be producing an
>> incorrect relocation.  This is only seen for big endian triples?
>
> Unfortunately I don't have a LE board to play with, so can confirm
> only Big Endian.
>
> (BTW, if someone can say if it's possible for MIPS (and how if it is)
> to launch a LE kernel from BE-booted preloader and U-Boot, that would
> be super cool)
>
>> Getting a way for us to deterministically reproduce would be a good
>> first step.  Which config or configs beyond defconfig, and which
>> relocations specifically are you observing this with?
>
> I use `make 32r2_defconfig` which combines several configs from
> arch/mips/configs:
>  - generic_defconfig;
>  - generic/32r2.config;
>  - generic/eb.config.
>
> Aside from that, I enable a bunch of my WIP drivers and the
> Netfilter. On my setup, this bug is always present in libphy.ko,
> so CONFIG_PHYLIB=m (with all deps) should be enough.
>
> The three failed relocs belongs to this part of code: [0]
>
> llvm-readelf on them:
>
> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:
> [...]
> 00005740  00029305 R_MIPS_HI16            00000000   .text
> 00005744  00029306 R_MIPS_LO16            00000000   .text
> 00005720  00029305 R_MIPS_HI16            00000000   .text
> 00005748  00029306 R_MIPS_LO16            00000000   .text
> 0000573c  00029305 R_MIPS_HI16            00000000   .text
> 0000574c  00029306 R_MIPS_LO16            00000000   .text
>
> The first pair is the one from my first mail:
> 0x3c010000 <-- should be 0x3c010001 to work properly
> 0x24339444
>
> I'm planning to hunt for more now, will let you know.

Unfortunately, R_MIPS_32 also suffers from that. And unlikely
R_MIPS_{HI,LO}16, they can't be handled runtime as the values
are pure random.
I expanded arch/mips/kernel/module.c a bit, so it tries to find
the actual symbol in .symtab after each applied relocation and
print the detailed info. Here's an example from nf_defrag_ipv6
loading:

[  429.789793] nf_defrag_ipv6: final section addresses:
[  429.795409] 	0xc07214fc __ksymtab_gpl
[  429.799574] 	0xc0720000 .text
[  429.802902] 	0xc07216b0 .data
[  429.806249] 	0xc0721790 .bss
[  429.809474] 	0xc0721508 __ksymtab_strings
[  429.813977] 	0xc0728000 .init.text
[  429.817781] 	0xc07214c0 .exit.text
[  429.821606] 	0xc0721520 .rodata
[  429.825120] 	0xc0721578 .rodata.str1.1
[  429.829322] 	0xc0721638 .note.Linux
[  429.833226] 	0xc0721800 .gnu.linkonce.this_module
[  429.838503] 	0xc0721650 .MIPS.abiflags
[  429.842702] 	0xc0721668 .reginfo
[  429.846326] 	0xc0721680 .note.gnu.build-id
[  429.851129] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken
[  429.860017] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken
[  429.868875] nf_defrag_ipv6: R_MIPS_32 [0x00000138]: 0xc0720000 -> 0xc0720138 is defrag6_net_exit
[  429.878706] nf_defrag_ipv6: R_MIPS_32 [0x000012c8]: 0xc0720000 -> 0xc07212c8 is nf_ct_net_init
[  429.888335] nf_defrag_ipv6: R_MIPS_32 [0x0000142c]: 0xc0720000 -> 0xc072142c is nf_ct_net_pre_exit
[  429.898367] nf_defrag_ipv6: R_MIPS_32 [0x00001440]: 0xc0720000 -> 0xc0721440 is nf_ct_net_exit
[  429.907994] nf_defrag_ipv6: R_MIPS_32 [0x00000057]: 0xc0721578 -> 0xc07215cf is broken
[  429.916872] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f297f0 -> 0x80f297f0 is proc_dointvec_jiffies
[  429.927177] nf_defrag_ipv6: R_MIPS_32 [0x00000039]: 0xc0721578 -> 0xc07215b1 is broken
[  429.936044] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax
[  429.946453] nf_defrag_ipv6: R_MIPS_32 [0x00000072]: 0xc0721578 -> 0xc07215ea is broken
[  429.955320] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax
[  429.965737] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag
[  429.975094] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag
[  429.984431] nf_defrag_ipv6: R_MIPS_32 [0x0000106c]: 0xc0720000 -> 0xc072106c is ip6frag_key_hashfn
[  429.994470] nf_defrag_ipv6: R_MIPS_32 [0x00001090]: 0xc0720000 -> 0xc0721090 is ip6frag_obj_hashfn
[  430.004486] nf_defrag_ipv6: R_MIPS_32 [0x000010b8]: 0xc0720000 -> 0xc07210b8 is ip6frag_obj_cmpfn
[  430.014425] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0720000 -> 0xc0720000 is nf_defrag_ipv6_enable
[  430.024742] nf_defrag_ipv6: R_MIPS_32 [0x00000001]: 0xc0721508 -> 0xc0721509 is __kstrtab_nf_defrag_ipv6_enable
[  430.036074] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0721508 -> 0xc0721508 is __kstrtabns_nf_defrag_ipv6_enable
[  430.047561] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0728000 -> 0xc0728000 is init_module
[  430.056930] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc07214c0 -> 0xc07214c0 is cleanup_module

At least five symbols are broken and lead to nowhere: two from .data
and three from .rodata. Values in square braces are initial references
that can be observed via `nm -n` -- and for broken ones they really
don't correspond to any symbols, mismatching the neighbours' addresses
by 0x40-0x50.

So for now seems like it's really an LLVM problem and there can't be
any simple workaround for it in the kernel.

> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989
>
>> Thanks,
>> ~Nick Desaulniers
>
> Thanks,
> Al

Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 23:29     ` Alexander Lobakin
@ 2021-01-10  1:08       ` Alexander Lobakin
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-10  1:08 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Alexander Lobakin, clang-built-linux, linux-mips,
	Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song,
	Sami Tolvanen, Ralf Baechle, LKML, linux-arch

From: Alexander Lobakin <alobakin@pm.me>
Date: Sat, 09 Jan 2021 23:29:26 +0000

> From: Alexander Lobakin <alobakin@pm.me>
> Date: Sat, 09 Jan 2021 19:15:31 +0000
>
>> From: Nick Desaulniers <ndesaulniers@google.com>
>> Date: Sat, 9 Jan 2021 09:50:44 -0800
>>
>>> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
>>>>
>>>> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>>>>
>>>> While testing MIPS with LLVM, I found a weird and very rare bug with
>>>> MIPS relocs that LLVM emits into kernel modules. It happens on both
>>>> 11.0.0 and latest git snapshot and applies, as I can see, only to
>>>> references to static symbols.
>>>>
>>>> When the kernel loads the module, it allocates a space for every
>>>> section and then manually apply the relocations relative to the
>>>> new address.
>>>>
>>>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
>>>> It's static and referenced only in phy_register_driver(), where it's
>>>> used to fill callback pointer in a structure.
>>>>
>>>> The real function address after module loading is 0xc06c1444, that
>>>> is observed in its ELF st_value field.
>>>> There are two relocs related to this usage in phy_register_driver():
>>>>
>>>> R_MIPS_HI16 refers to 0x3c010000
>>>> R_MIPS_LO16 refers to 0x24339444
>>>>
>>>> The address of .text is 0xc06b8000. So the destination is calculated
>>>> as follows:
>>>>
>>>> 0x00000000 from hi16;
>>>> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
>>>> 0xc06b8000 from base.
>>>>
>>>> = 0xc06b1444. The value is lower than the real phy_probe() address
>>>> (0xc06c1444) by 0x10000 and is lower than the base address of
>>>> module's .text, so it's 100% incorrect.
>>>>
>>>> This results in:
>>>>
>>>> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
>>>> address c06b1444, epc == c06b1444, ra == 803f1090
>>>>
>>>> The correct instructions should be:
>>>>
>>>> R_MIPS_HI16 0x3c010001
>>>> R_MIPS_LO16 0x24339444
>>>>
>>>> so there'll be 0x00010000 from hi16.
>>>>
>>>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
>>>> if the destination is lower than the base address, which should never
>>>> happen), and seems like I have only 3 such places in libphy.ko (and
>>>> one in nf_tables.ko).
>>>> I don't think it should be handled somehow in mentioned source code
>>>> as it would look rather ugly and may break kernels build with GNU
>>>> stack, which seems to not produce such bad codes.
>>>>
>>>> If I should report this to any other resources, please let me know.
>>>> I chose clang-built-linux and LKML as it may not happen with userland
>>>> (didn't tried to catch).
>>>
>>> Thanks for the report.  Sounds like we may indeed be producing an
>>> incorrect relocation.  This is only seen for big endian triples?
>>
>> Unfortunately I don't have a LE board to play with, so can confirm
>> only Big Endian.
>>
>> (BTW, if someone can say if it's possible for MIPS (and how if it is)
>> to launch a LE kernel from BE-booted preloader and U-Boot, that would
>> be super cool)
>>
>>> Getting a way for us to deterministically reproduce would be a good
>>> first step.  Which config or configs beyond defconfig, and which
>>> relocations specifically are you observing this with?
>>
>> I use `make 32r2_defconfig` which combines several configs from
>> arch/mips/configs:
>>  - generic_defconfig;
>>  - generic/32r2.config;
>>  - generic/eb.config.
>>
>> Aside from that, I enable a bunch of my WIP drivers and the
>> Netfilter. On my setup, this bug is always present in libphy.ko,
>> so CONFIG_PHYLIB=m (with all deps) should be enough.
>>
>> The three failed relocs belongs to this part of code: [0]
>>
>> llvm-readelf on them:
>>
>> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:
>> [...]
>> 00005740  00029305 R_MIPS_HI16            00000000   .text
>> 00005744  00029306 R_MIPS_LO16            00000000   .text
>> 00005720  00029305 R_MIPS_HI16            00000000   .text
>> 00005748  00029306 R_MIPS_LO16            00000000   .text
>> 0000573c  00029305 R_MIPS_HI16            00000000   .text
>> 0000574c  00029306 R_MIPS_LO16            00000000   .text
>>
>> The first pair is the one from my first mail:
>> 0x3c010000 <-- should be 0x3c010001 to work properly
>> 0x24339444
>>
>> I'm planning to hunt for more now, will let you know.
>
> Unfortunately, R_MIPS_32 also suffers from that. And unlikely
> R_MIPS_{HI,LO}16, they can't be handled runtime as the values
> are pure random.
> I expanded arch/mips/kernel/module.c a bit, so it tries to find
> the actual symbol in .symtab after each applied relocation and
> print the detailed info. Here's an example from nf_defrag_ipv6
> loading:
>
> [  429.789793] nf_defrag_ipv6: final section addresses:
> [  429.795409] =090xc07214fc __ksymtab_gpl
> [  429.799574] =090xc0720000 .text
> [  429.802902] =090xc07216b0 .data
> [  429.806249] =090xc0721790 .bss
> [  429.809474] =090xc0721508 __ksymtab_strings
> [  429.813977] =090xc0728000 .init.text
> [  429.817781] =090xc07214c0 .exit.text
> [  429.821606] =090xc0721520 .rodata
> [  429.825120] =090xc0721578 .rodata.str1.1
> [  429.829322] =090xc0721638 .note.Linux
> [  429.833226] =090xc0721800 .gnu.linkonce.this_module
> [  429.838503] =090xc0721650 .MIPS.abiflags
> [  429.842702] =090xc0721668 .reginfo
> [  429.846326] =090xc0721680 .note.gnu.build-id
> [  429.851129] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken
> [  429.860017] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken
> [  429.868875] nf_defrag_ipv6: R_MIPS_32 [0x00000138]: 0xc0720000 -> 0xc0720138 is defrag6_net_exit
> [  429.878706] nf_defrag_ipv6: R_MIPS_32 [0x000012c8]: 0xc0720000 -> 0xc07212c8 is nf_ct_net_init
> [  429.888335] nf_defrag_ipv6: R_MIPS_32 [0x0000142c]: 0xc0720000 -> 0xc072142c is nf_ct_net_pre_exit
> [  429.898367] nf_defrag_ipv6: R_MIPS_32 [0x00001440]: 0xc0720000 -> 0xc0721440 is nf_ct_net_exit
> [  429.907994] nf_defrag_ipv6: R_MIPS_32 [0x00000057]: 0xc0721578 -> 0xc07215cf is broken
> [  429.916872] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f297f0 -> 0x80f297f0 is proc_dointvec_jiffies
> [  429.927177] nf_defrag_ipv6: R_MIPS_32 [0x00000039]: 0xc0721578 -> 0xc07215b1 is broken
> [  429.936044] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax
> [  429.946453] nf_defrag_ipv6: R_MIPS_32 [0x00000072]: 0xc0721578 -> 0xc07215ea is broken
> [  429.955320] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax
> [  429.965737] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag
> [  429.975094] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag
> [  429.984431] nf_defrag_ipv6: R_MIPS_32 [0x0000106c]: 0xc0720000 -> 0xc072106c is ip6frag_key_hashfn
> [  429.994470] nf_defrag_ipv6: R_MIPS_32 [0x00001090]: 0xc0720000 -> 0xc0721090 is ip6frag_obj_hashfn
> [  430.004486] nf_defrag_ipv6: R_MIPS_32 [0x000010b8]: 0xc0720000 -> 0xc07210b8 is ip6frag_obj_cmpfn
> [  430.014425] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0720000 -> 0xc0720000 is nf_defrag_ipv6_enable
> [  430.024742] nf_defrag_ipv6: R_MIPS_32 [0x00000001]: 0xc0721508 -> 0xc0721509 is __kstrtab_nf_defrag_ipv6_enable
> [  430.036074] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0721508 -> 0xc0721508 is __kstrtabns_nf_defrag_ipv6_enable
> [  430.047561] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0728000 -> 0xc0728000 is init_module
> [  430.056930] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc07214c0 -> 0xc07214c0 is cleanup_module
>
> At least five symbols are broken and lead to nowhere: two from .data
> and three from .rodata. Values in square braces are initial references
> that can be observed via `nm -n` -- and for broken ones they really
> don't correspond to any symbols, mismatching the neighbours' addresses
> by 0x40-0x50.

Oops, my bad. I forgot that they can point to the middle of the struct
or array or what else. Nevermind, only the problem with R_MIPS_HI16 is
actual.

With the "add 0x10000 if can't find the symbol" workaround I'm now
able to run kernel and modules without any visible defects or issues.
Full list of detected and fixed relocs on my setup:

libphy: R_MIPS_HI16 [0x3c030000, 0x24639444]: .text -> 0xc06b1444 is broken
libphy: R_MIPS_HI16 [0x3c030000, 0x24639444]: .text -> 0xc06c1444 is phy_probe
libphy: R_MIPS_HI16 [0x3c020000, 0x2442970c]: .text -> 0xc06b170c is broken
libphy: R_MIPS_HI16 [0x3c020000, 0x2442970c]: .text -> 0xc06c170c is phy_remove
libphy: R_MIPS_HI16 [0x3c010000, 0x242197ac]: .text -> 0xc06b17ac is broken
libphy: R_MIPS_HI16 [0x3c010000, 0x242197ac]: .text -> 0xc06c17ac is phy_shutdown
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24218164]: .text -> 0xc07bc164 is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24218164]: .text -> 0xc07cc164 is nf_tables_dump_obj_done
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07bc390 is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07cc390 is nft_register_flowtable_net_hooks
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07bc390 is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07cc390 is nft_register_flowtable_net_hooks
nf_tables: R_MIPS_HI16 [0x3c010001, 0x2421866c]: .text -> 0xc07bc66c is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x2421866c]: .text -> 0xc07cc66c is nf_tables_dump_flowtable
nf_tables: R_MIPS_HI16 [0x3c010001, 0x242185b4]: .text -> 0xc07bc5b4 is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x242185b4]: .text -> 0xc07cc5b4 is nf_tables_dump_flowtable_start
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken
nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook
nf_tables: R_MIPS_HI16 [0x3c020002, 0x24428080]: .text -> 0xc07cc080 is broken
nf_tables: R_MIPS_HI16 [0x3c020002, 0x24428080]: .text -> 0xc07dc080 is nft_rbtree_gc
nf_conntrack: R_MIPS_HI16 [0x3c010000, 0x24258538]: .text -> 0xc077c538 is broken
nf_conntrack: R_MIPS_HI16 [0x3c010000, 0x24258538]: .text -> 0xc078c538 is nf_ct_expectation_timed_out

> So for now seems like it's really an LLVM problem and there can't be
> any simple workaround for it in the kernel.
>
>> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989
>>
>>> Thanks,
>>> ~Nick Desaulniers
>>
>> Thanks,
>> Al
>
> Al

Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 19:15   ` Alexander Lobakin
  2021-01-09 23:29     ` Alexander Lobakin
@ 2021-01-11 20:03     ` Nick Desaulniers
  2021-01-11 20:50       ` Alexander Lobakin
  1 sibling, 1 reply; 11+ messages in thread
From: Nick Desaulniers @ 2021-01-11 20:03 UTC (permalink / raw)
  To: Alexander Lobakin, Fangrui Song
  Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook,
	Nathan Chancellor, Sami Tolvanen, Ralf Baechle, LKML, linux-arch

Hi Alexander,
I'm genuinely trying to reproduce/understand this report, questions below:

On Sat, Jan 9, 2021 at 11:15 AM Alexander Lobakin <alobakin@pm.me> wrote:
>
> From: Nick Desaulniers <ndesaulniers@google.com>
> Date: Sat, 9 Jan 2021 09:50:44 -0800
>
> > On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
> >>
> >> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
> >>
> >> While testing MIPS with LLVM, I found a weird and very rare bug with
> >> MIPS relocs that LLVM emits into kernel modules. It happens on both
> >> 11.0.0 and latest git snapshot and applies, as I can see, only to
> >> references to static symbols.
> >>
> >> When the kernel loads the module, it allocates a space for every
> >> section and then manually apply the relocations relative to the
> >> new address.
> >>
> >> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
> >> It's static and referenced only in phy_register_driver(), where it's
> >> used to fill callback pointer in a structure.
> >>
> >> The real function address after module loading is 0xc06c1444, that
> >> is observed in its ELF st_value field.
> >> There are two relocs related to this usage in phy_register_driver():
> >>
> >> R_MIPS_HI16 refers to 0x3c010000
> >> R_MIPS_LO16 refers to 0x24339444

Sorry, how are these calculated?  (Explicit shell commands invoked
would be appreciated)

I'm doing:
$ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 32r2_defconfig
$ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 modules
$ llvm-nm --format=sysv drivers/net/phy/phy_device.o | grep phy_probe
$ llvm-objdump -Dr --disassemble-symbols=phy_driver_register
drivers/net/phy/phy_device.o
$ llvm-readelf -r drivers/net/phy/phy_device.o  | grep -e R_MIPS_HI16
-e R_MIPS_LO16

for some of the commands trying to verify.

> >>
> >> The address of .text is 0xc06b8000. So the destination is calculated
> >> as follows:
> >>
> >> 0x00000000 from hi16;
> >> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
> >> 0xc06b8000 from base.
> >>
> >> = 0xc06b1444. The value is lower than the real phy_probe() address
> >> (0xc06c1444) by 0x10000 and is lower than the base address of
> >> module's .text, so it's 100% incorrect.

The disassembly for me produces:
    399c: 3c 03 00 00   lui     $3, 0 <phy_device_free>
                        0000399c:  R_MIPS_HI16  .text
...
    39a8: 24 63 3a 5c   addiu   $3, $3, 14940 <phy_probe>
                        000039a8:  R_MIPS_LO16  .text

I'm not really sure how to manually resolve the relocations; Fangrui
do you have any tips? (I'm coincidentally reading through Linkers &
Loaders currently, but only just started chpt. 4).

> >>
> >> This results in:
> >>
> >> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
> >> address c06b1444, epc == c06b1444, ra == 803f1090
> >>
> >> The correct instructions should be:
> >>
> >> R_MIPS_HI16 0x3c010001
> >> R_MIPS_LO16 0x24339444
> >>
> >> so there'll be 0x00010000 from hi16.
> >>
> >> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
> >> if the destination is lower than the base address, which should never
> >> happen), and seems like I have only 3 such places in libphy.ko (and
> >> one in nf_tables.ko).
> >> I don't think it should be handled somehow in mentioned source code
> >> as it would look rather ugly and may break kernels build with GNU
> >> stack, which seems to not produce such bad codes.
> >>
> >> If I should report this to any other resources, please let me know.
> >> I chose clang-built-linux and LKML as it may not happen with userland
> >> (didn't tried to catch).
> >
> > Thanks for the report.  Sounds like we may indeed be producing an
> > incorrect relocation.  This is only seen for big endian triples?
>
> Unfortunately I don't have a LE board to play with, so can confirm
> only Big Endian.
>
> (BTW, if someone can say if it's possible for MIPS (and how if it is)
> to launch a LE kernel from BE-booted preloader and U-Boot, that would
> be super cool)
>
> > Getting a way for us to deterministically reproduce would be a good
> > first step.  Which config or configs beyond defconfig, and which
> > relocations specifically are you observing this with?
>
> I use `make 32r2_defconfig` which combines several configs from
> arch/mips/configs:
>  - generic_defconfig;
>  - generic/32r2.config;
>  - generic/eb.config.
>
> Aside from that, I enable a bunch of my WIP drivers and the
> Netfilter. On my setup, this bug is always present in libphy.ko,
> so CONFIG_PHYLIB=m (with all deps) should be enough.
>
> The three failed relocs belongs to this part of code: [0]
>
> llvm-readelf on them:
>
> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:¬
> [...]
> 00005740  00029305 R_MIPS_HI16            00000000   .text
> 00005744  00029306 R_MIPS_LO16            00000000   .text
> 00005720  00029305 R_MIPS_HI16            00000000   .text
> 00005748  00029306 R_MIPS_LO16            00000000   .text
> 0000573c  00029305 R_MIPS_HI16            00000000   .text
> 0000574c  00029306 R_MIPS_LO16            00000000   .text
>
> The first pair is the one from my first mail:
> 0x3c010000 <-- should be 0x3c010001 to work properly
> 0x24339444
>
> I'm planning to hunt for more now, will let you know.
>
> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989
>
> > Thanks,
> > ~Nick Desaulniers
>
> Thanks,
> Al
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-11 20:03     ` Nick Desaulniers
@ 2021-01-11 20:50       ` Alexander Lobakin
  2021-01-12 22:14         ` Nick Desaulniers
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-11 20:50 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Alexander Lobakin, Fangrui Song, clang-built-linux, linux-mips,
	Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Sami Tolvanen,
	Ralf Baechle, LKML, linux-arch

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Mon, 11 Jan 2021 12:03:29 -0800

> Hi Alexander,
> I'm genuinely trying to reproduce/understand this report, questions below:

Hi Nick!

> On Sat, Jan 9, 2021 at 11:15 AM Alexander Lobakin <alobakin@pm.me> wrote:
>>
>> From: Nick Desaulniers <ndesaulniers@google.com>
>> Date: Sat, 9 Jan 2021 09:50:44 -0800
>>
>>> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote:
>>>>
>>>> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>>>>
>>>> While testing MIPS with LLVM, I found a weird and very rare bug with
>>>> MIPS relocs that LLVM emits into kernel modules. It happens on both
>>>> 11.0.0 and latest git snapshot and applies, as I can see, only to
>>>> references to static symbols.
>>>>
>>>> When the kernel loads the module, it allocates a space for every
>>>> section and then manually apply the relocations relative to the
>>>> new address.
>>>>
>>>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
>>>> It's static and referenced only in phy_register_driver(), where it's
>>>> used to fill callback pointer in a structure.
>>>>
>>>> The real function address after module loading is 0xc06c1444, that
>>>> is observed in its ELF st_value field.
>>>> There are two relocs related to this usage in phy_register_driver():
>>>>
>>>> R_MIPS_HI16 refers to 0x3c010000
>>>> R_MIPS_LO16 refers to 0x24339444
>
> Sorry, how are these calculated?  (Explicit shell commands invoked
> would be appreciated)

Just look at the disassembly below and you'll see the similar
instructions in your module.

> I'm doing:
> $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 32r2_defconfig
> $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 modules
> $ llvm-nm --format=sysv drivers/net/phy/phy_device.o | grep phy_probe
> $ llvm-objdump -Dr --disassemble-symbols=phy_driver_register drivers/net/phy/phy_device.o
> $ llvm-readelf -r drivers/net/phy/phy_device.o  | grep -e R_MIPS_HI16 -e R_MIPS_LO16
>
> for some of the commands trying to verify.
>
>>>>
>>>> The address of .text is 0xc06b8000. So the destination is calculated
>>>> as follows:
>>>>
>>>> 0x00000000 from hi16;
>>>> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
>>>> 0xc06b8000 from base.
>>>>
>>>> = 0xc06b1444. The value is lower than the real phy_probe() address
>>>> (0xc06c1444) by 0x10000 and is lower than the base address of
>>>> module's .text, so it's 100% incorrect.
>
> The disassembly for me produces:
>     399c: 3c 03 00 00   lui     $3, 0 <phy_device_free>
>                         0000399c:  R_MIPS_HI16  .text
> ...
>     39a8: 24 63 3a 5c   addiu   $3, $3, 14940 <phy_probe>
>                         000039a8:  R_MIPS_LO16  .text

So, in your case the values of the instructions that relocs refer are:

0x3c030000 R_MIPS_HI16
0x24633a5c R_MIPS_LO16

Mine were:

0x3c010000
0x24339444

Your second one doesn't have bit 15 set, so I think this pair won't
break the code.
Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have
'8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12].

> I'm not really sure how to manually resolve the relocations; Fangrui
> do you have any tips? (I'm coincidentally reading through Linkers &
> Loaders currently, but only just started chpt. 4).
>
>>>>
>>>> This results in:
>>>>
>>>> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
>>>> address c06b1444, epc == c06b1444, ra == 803f1090
>>>>
>>>> The correct instructions should be:
>>>>
>>>> R_MIPS_HI16 0x3c010001
>>>> R_MIPS_LO16 0x24339444
>>>>
>>>> so there'll be 0x00010000 from hi16.
>>>>
>>>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
>>>> if the destination is lower than the base address, which should never
>>>> happen), and seems like I have only 3 such places in libphy.ko (and
>>>> one in nf_tables.ko).
>>>> I don't think it should be handled somehow in mentioned source code
>>>> as it would look rather ugly and may break kernels build with GNU
>>>> stack, which seems to not produce such bad codes.
>>>>
>>>> If I should report this to any other resources, please let me know.
>>>> I chose clang-built-linux and LKML as it may not happen with userland
>>>> (didn't tried to catch).
>>>
>>> Thanks for the report.  Sounds like we may indeed be producing an
>>> incorrect relocation.  This is only seen for big endian triples?
>>
>> Unfortunately I don't have a LE board to play with, so can confirm
>> only Big Endian.
>>
>> (BTW, if someone can say if it's possible for MIPS (and how if it is)
>> to launch a LE kernel from BE-booted preloader and U-Boot, that would
>> be super cool)
>>
>>> Getting a way for us to deterministically reproduce would be a good
>>> first step.  Which config or configs beyond defconfig, and which
>>> relocations specifically are you observing this with?
>>
>> I use `make 32r2_defconfig` which combines several configs from
>> arch/mips/configs:
>>  - generic_defconfig;
>>  - generic/32r2.config;
>>  - generic/eb.config.
>>
>> Aside from that, I enable a bunch of my WIP drivers and the
>> Netfilter. On my setup, this bug is always present in libphy.ko,
>> so CONFIG_PHYLIB=m (with all deps) should be enough.
>>
>> The three failed relocs belongs to this part of code: [0]
>>
>> llvm-readelf on them:
>>
>> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:
>> [...]
>> 00005740  00029305 R_MIPS_HI16            00000000   .text
>> 00005744  00029306 R_MIPS_LO16            00000000   .text
>> 00005720  00029305 R_MIPS_HI16            00000000   .text
>> 00005748  00029306 R_MIPS_LO16            00000000   .text
>> 0000573c  00029305 R_MIPS_HI16            00000000   .text
>> 0000574c  00029306 R_MIPS_LO16            00000000   .text
>>
>> The first pair is the one from my first mail:
>> 0x3c010000 <-- should be 0x3c010001 to work properly
>> 0x24339444
>>
>> I'm planning to hunt for more now, will let you know.
>>
>> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989
>>
>> > Thanks,
>> > ~Nick Desaulniers
>>
>> Thanks,
>> Al
>>
>
> Thanks,
> ~Nick Desaulniers

Thanks,
Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-11 20:50       ` Alexander Lobakin
@ 2021-01-12 22:14         ` Nick Desaulniers
  2021-01-13 10:16           ` Alexander Lobakin
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Desaulniers @ 2021-01-12 22:14 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Fangrui Song, clang-built-linux, linux-mips, Thomas Bogendoerfer,
	Ralf Baechle, LKML, linux-arch

[-- Attachment #1: Type: text/plain, Size: 858 bytes --]

On Mon, Jan 11, 2021 at 12:50 PM Alexander Lobakin <alobakin@pm.me> wrote:
>
> > The disassembly for me produces:
> >     399c: 3c 03 00 00   lui     $3, 0 <phy_device_free>
> >                         0000399c:  R_MIPS_HI16  .text
> > ...
> >     39a8: 24 63 3a 5c   addiu   $3, $3, 14940 <phy_probe>
> >                         000039a8:  R_MIPS_LO16  .text
>
> So, in your case the values of the instructions that relocs refer are:
>
> 0x3c030000 R_MIPS_HI16
> 0x24633a5c R_MIPS_LO16
>
> Mine were:
>
> 0x3c010000
> 0x24339444
>
> Your second one doesn't have bit 15 set, so I think this pair won't
> break the code.
> Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have
> '8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12].

I don't think any of my R_MIPS_LO16 in that file have that bit set.
See attached.
-- 
Thanks,
~Nick Desaulniers

[-- Attachment #2: drivers_net_phy_phy_device.objdump.txt --]
[-- Type: text/plain, Size: 6145 bytes --]

			0000004c:  R_MIPS_LO16	kmalloc_caches
			00000088:  R_MIPS_LO16	.data
			00000098:  R_MIPS_LO16	.data
			0000010c:  R_MIPS_LO16	kmalloc_caches
			00000130:  R_MIPS_LO16	$.str
			0000014c:  R_MIPS_LO16	.data
			0000015c:  R_MIPS_LO16	.data
			000001c4:  R_MIPS_LO16	kmalloc_caches
			00000204:  R_MIPS_LO16	.data
			00000214:  R_MIPS_LO16	.data
			00000290:  R_MIPS_LO16	.data
			000002a0:  R_MIPS_LO16	.data
			000002a4:  R_MIPS_LO16	.data
			00000384:  R_MIPS_LO16	.data
			00000394:  R_MIPS_LO16	.data
			00000398:  R_MIPS_LO16	.data
			000003ac:  R_MIPS_LO16	$.str
			00000474:  R_MIPS_LO16	.data
			00000484:  R_MIPS_LO16	.data
			00000488:  R_MIPS_LO16	.data
			0000055c:  R_MIPS_LO16	kmalloc_caches
			00000588:  R_MIPS_LO16	mdio_bus_type
			000005a4:  R_MIPS_LO16	.text
			000005a8:  R_MIPS_LO16	.text
			000005ac:  R_MIPS_LO16	.text
			000005b4:  R_MIPS_LO16	.rodata
			00000638:  R_MIPS_LO16	$.str.1
			0000065c:  R_MIPS_LO16	$.str.2
			00000664:  R_MIPS_LO16	.sbss
			00000684:  R_MIPS_LO16	delayed_work_timer_fn
			00000688:  R_MIPS_LO16	phy_state_machine
			00000998:  R_MIPS_LO16	$.str.62
			000009f8:  R_MIPS_LO16	$.str.63
			00000a50:  R_MIPS_LO16	__stack_chk_guard
			00000d84:  R_MIPS_LO16	__stack_chk_guard
			00000e3c:  R_MIPS_LO16	$.str.3
			00000e4c:  R_MIPS_LO16	$.str.4
			00000ea4:  R_MIPS_LO16	.data
			00000eb4:  R_MIPS_LO16	.data
			00000eb8:  R_MIPS_LO16	.data
			00000ed0:  R_MIPS_LO16	$.str
			00001164:  R_MIPS_LO16	.text
			000011ac:  R_MIPS_LO16	$.str.19
			000011c8:  R_MIPS_LO16	$.str.18
			000011f4:  R_MIPS_LO16	genphy_c45_driver
			000011fc:  R_MIPS_LO16	.data
			00001280:  R_MIPS_LO16	.data
			000013e4:  R_MIPS_LO16	$.str.15
			00001404:  R_MIPS_LO16	$.str.16
			00001444:  R_MIPS_LO16	$.str.14
			0000145c:  R_MIPS_LO16	$.str.17
			00001478:  R_MIPS_LO16	$.str.64
			000014c0:  R_MIPS_LO16	mdio_bus_type
			000014c8:  R_MIPS_LO16	device_match_name
			00001568:  R_MIPS_LO16	$.str.5
			000015f4:  R_MIPS_LO16	__stack_chk_guard
			00001624:  R_MIPS_LO16	$.str.18
			0000162c:  R_MIPS_LO16	$.str.19
			00001648:  R_MIPS_LO16	.data
			00001660:  R_MIPS_LO16	.rodata
			00001668:  R_MIPS_LO16	.rodata
			0000170c:  R_MIPS_LO16	.data
			0000173c:  R_MIPS_LO16	genphy_c45_driver
			00001794:  R_MIPS_LO16	__stack_chk_guard
			00001900:  R_MIPS_LO16	__stack_chk_guard
			00001930:  R_MIPS_LO16	$.str.7
			0000193c:  R_MIPS_LO16	$.str.6
			00001944:  R_MIPS_LO16	$.str.8
			0000195c:  R_MIPS_LO16	$.str.9
			00001984:  R_MIPS_LO16	$.str.10
			0000198c:  R_MIPS_LO16	$.str.11
			00001998:  R_MIPS_LO16	$.str.12
			000019b0:  R_MIPS_LO16	__stack_chk_guard
			00001a10:  R_MIPS_LO16	__stack_chk_guard
			00001a1c:  R_MIPS_LO16	$.str.10
			00001a24:  R_MIPS_LO16	$.str.11
			00001a50:  R_MIPS_LO16	$.str.7
			00001a5c:  R_MIPS_LO16	$.str.6
			00001a64:  R_MIPS_LO16	$.str.8
			00001a80:  R_MIPS_LO16	$.str.9
			00001ab0:  R_MIPS_LO16	$.str.12
			00001ae0:  R_MIPS_LO16	$.str.13
			00001b0c:  R_MIPS_LO16	__stack_chk_guard
			00001b5c:  R_MIPS_LO16	__stack_chk_guard
			00001b88:  R_MIPS_LO16	$.str.7
			00001b94:  R_MIPS_LO16	$.str.6
			00001b9c:  R_MIPS_LO16	$.str.8
			00001bb4:  R_MIPS_LO16	$.str.9
			00001bc4:  R_MIPS_LO16	__stack_chk_guard
			00001d60:  R_MIPS_LO16	mdio_bus_type
			00001d68:  R_MIPS_LO16	device_match_name
			00001dec:  R_MIPS_LO16	$.str.5
			00001e34:  R_MIPS_LO16	.data
			00001e90:  R_MIPS_LO16	genphy_c45_driver
			00001f70:  R_MIPS_LO16	kmalloc_caches
			000020dc:  R_MIPS_LO16	.text
			00002218:  R_MIPS_LO16	__stack_chk_guard
			00002224:  R_MIPS_LO16	.rodata
			00002248:  R_MIPS_LO16	.rodata
			000022d0:  R_MIPS_LO16	__stack_chk_guard
			00002658:  R_MIPS_LO16	.rodata
			00002870:  R_MIPS_LO16	$.str.66
			00002c88:  R_MIPS_LO16	$.str.21
			00002ca0:  R_MIPS_LO16	$.str.20
			00003218:  R_MIPS_LO16	$.str.67
			00003220:  R_MIPS_LO16	$__func__.phy_poll_reset
			000033fc:  R_MIPS_LO16	__stack_chk_guard
			00003474:  R_MIPS_LO16	__stack_chk_guard
			000034c0:  R_MIPS_LO16	__stack_chk_guard
			00003510:  R_MIPS_LO16	__stack_chk_guard
			00003628:  R_MIPS_LO16	__stack_chk_guard
			00003688:  R_MIPS_LO16	__stack_chk_guard
			00003778:  R_MIPS_LO16	__stack_chk_guard
			00003790:  R_MIPS_LO16	$.str.22
			000037d0:  R_MIPS_LO16	$.str.23
			000038c0:  R_MIPS_LO16	__stack_chk_guard
			00003904:  R_MIPS_LO16	$.str.24
			00003920:  R_MIPS_LO16	$.str.25
			00003938:  R_MIPS_LO16	__stack_chk_guard
			000039a4:  R_MIPS_LO16	mdio_bus_type
			000039a8:  R_MIPS_LO16	.text
			000039ac:  R_MIPS_LO16	.text
			000039b0:  R_MIPS_LO16	.text
			00003a18:  R_MIPS_LO16	$.str.28
			00003a30:  R_MIPS_LO16	$.str.26
			00003a50:  R_MIPS_LO16	$.str.27
			00003a78:  R_MIPS_LO16	__stack_chk_guard
			00003c84:  R_MIPS_LO16	__stack_chk_guard
			00003e9c:  R_MIPS_LO16	$.str.31
			00003ee4:  R_MIPS_LO16	.rodata
			00003f00:  R_MIPS_LO16	$.str.33
			00003f08:  R_MIPS_LO16	$.str.59
			00003f10:  R_MIPS_LO16	$.str.34
			00003f3c:  R_MIPS_LO16	$.str.61
			00003f78:  R_MIPS_LO16	__stack_chk_guard
			00004014:  R_MIPS_LO16	.rodata
			00004038:  R_MIPS_LO16	.rodata
			000040ec:  R_MIPS_LO16	__stack_chk_guard
			000042e8:  R_MIPS_LO16	$.str.61
			00000010:  R_MIPS_LO16	genphy_c45_driver
			0000001c:  R_MIPS_LO16	.data
			00000028:  R_MIPS_LO16	.rodata
			00000030:  R_MIPS_LO16	phy_basic_t1_features
			00000044:  R_MIPS_LO16	phy_gbit_features
			00000048:  R_MIPS_LO16	phy_gbit_fibre_features
			00000054:  R_MIPS_LO16	phy_gbit_all_ports_features
			00000060:  R_MIPS_LO16	phy_10gbit_features
			00000074:  R_MIPS_LO16	phy_10gbit_full_features
			00000084:  R_MIPS_LO16	phy_basic_t1_features
			00000090:  R_MIPS_LO16	phy_basic_features
			00000098:  R_MIPS_LO16	phy_basic_features
			0000009c:  R_MIPS_LO16	phy_basic_t1_features
			000000a4:  R_MIPS_LO16	phy_gbit_features
			000000a8:  R_MIPS_LO16	phy_gbit_fibre_features
			000000ac:  R_MIPS_LO16	phy_gbit_all_ports_features
			000000b0:  R_MIPS_LO16	phy_10gbit_features
			000000b8:  R_MIPS_LO16	phy_10gbit_full_features
			000000c0:  R_MIPS_LO16	phy_10gbit_fec_features
			000000cc:  R_MIPS_LO16	phy_10gbit_fec_features
			000000d4:  R_MIPS_LO16	genphy_c45_driver
			000000f0:  R_MIPS_LO16	.data
			00000110:  R_MIPS_LO16	genphy_c45_driver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-12 22:14         ` Nick Desaulniers
@ 2021-01-13 10:16           ` Alexander Lobakin
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Lobakin @ 2021-01-13 10:16 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Alexander Lobakin, Fangrui Song, clang-built-linux, linux-mips,
	Thomas Bogendoerfer, Ralf Baechle, LKML, linux-arch

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Tue, 12 Jan 2021 14:14:58 -0800

> On Mon, Jan 11, 2021 at 12:50 PM Alexander Lobakin <alobakin@pm.me> wrote:
>>
>>> The disassembly for me produces:
>>>     399c: 3c 03 00 00   lui     $3, 0 <phy_device_free>
>>>                         0000399c:  R_MIPS_HI16  .text
>>> ...
>>>     39a8: 24 63 3a 5c   addiu   $3, $3, 14940 <phy_probe>
>>>                         000039a8:  R_MIPS_LO16  .text
>>
>> So, in your case the values of the instructions that relocs refer are:
>>
>> 0x3c030000 R_MIPS_HI16
>> 0x24633a5c R_MIPS_LO16
>>
>> Mine were:
>>
>> 0x3c010000
>> 0x24339444
>>
>> Your second one doesn't have bit 15 set, so I think this pair won't
>> break the code.
>> Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have
>> '8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12].
>
> I don't think any of my R_MIPS_LO16 in that file have that bit set.
> See attached.

Well, seems like you got lucky here.
You can try to check for other modules if any of them have
R_MIPS_LO16 with bit 15 set.

Also, you can try to enable:
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
CONFIG_TRIM_UNUSED_KSYMS=y

to alter the build process. These options didn't change anything
in terms of relocs for me though.

> --
> Thanks,
> ~Nick Desaulniers

Thanks,
Al


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin
  2021-01-09 17:50 ` Nick Desaulniers
@ 2021-05-07  7:47 ` Nathan Chancellor
  2021-05-08 16:29   ` Alexander Lobakin
  1 sibling, 1 reply; 11+ messages in thread
From: Nathan Chancellor @ 2021-05-07  7:47 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook,
	Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen,
	Ralf Baechle, linux-kernel, linux-arch

On Sat, Jan 09, 2021 at 05:11:18PM +0000, Alexander Lobakin wrote:
> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
> 
> While testing MIPS with LLVM, I found a weird and very rare bug with
> MIPS relocs that LLVM emits into kernel modules. It happens on both
> 11.0.0 and latest git snapshot and applies, as I can see, only to
> references to static symbols.
> 
> When the kernel loads the module, it allocates a space for every
> section and then manually apply the relocations relative to the
> new address.
> 
> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
> It's static and referenced only in phy_register_driver(), where it's
> used to fill callback pointer in a structure.
> 
> The real function address after module loading is 0xc06c1444, that
> is observed in its ELF st_value field.
> There are two relocs related to this usage in phy_register_driver():
> 
> R_MIPS_HI16 refers to 0x3c010000
> R_MIPS_LO16 refers to 0x24339444
> 
> The address of .text is 0xc06b8000. So the destination is calculated
> as follows:
> 
> 0x00000000 from hi16;
> 0xffff9444 from lo16 (sign extend as it's always treated as signed);
> 0xc06b8000 from base.
> 
> = 0xc06b1444. The value is lower than the real phy_probe() address
> (0xc06c1444) by 0x10000 and is lower than the base address of
> module's .text, so it's 100% incorrect.
> 
> This results in:
> 
> [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
> address c06b1444, epc == c06b1444, ra == 803f1090
> 
> The correct instructions should be:
> 
> R_MIPS_HI16 0x3c010001
> R_MIPS_LO16 0x24339444
> 
> so there'll be 0x00010000 from hi16.
> 
> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
> if the destination is lower than the base address, which should never
> happen), and seems like I have only 3 such places in libphy.ko (and
> one in nf_tables.ko).
> I don't think it should be handled somehow in mentioned source code
> as it would look rather ugly and may break kernels build with GNU
> stack, which seems to not produce such bad codes.
> 
> If I should report this to any other resources, please let me know.
> I chose clang-built-linux and LKML as it may not happen with userland
> (didn't tried to catch).
> 
> Thanks,
> Al
> 

Hi Alexander,

Doubling back around to this as I was browsing through the LLVM 12.0.1
blockers on LLVM's bug tracker and I noticed a commit that could resolve
this? It refers to the same relocations that you reference here.

https://bugs.llvm.org/show_bug.cgi?id=49821

http://github.com/llvm/llvm-project/commit/7e83a7f1fdfcc2edde61f0a535f9d7a56f531db9

I think that Debian's apt.llvm.org repository should have a build
available with that commit in it. Otherwise, building it from source is
not too complicated with my script:

https://github.com/ClangBuiltLinux/tc-build

$ ./build-llvm.py --build-stage1-only --install-stage1-only --projects "clang;lld" --targets "Mips;X86"

would get you a working toolchain relatively quickly.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM
  2021-05-07  7:47 ` Nathan Chancellor
@ 2021-05-08 16:29   ` Alexander Lobakin
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Lobakin @ 2021-05-08 16:29 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook,
	Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen,
	Ralf Baechle, linux-kernel, linux-arch

From: Nathan Chancellor <nathan@kernel.org>
Date: Fri, 7 May 2021 00:47:14 -0700

> On Sat, Jan 09, 2021 at 05:11:18PM +0000, Alexander Lobakin wrote:
> > Machine: MIPS32 R2 Big Endian (interAptiv (multi))
> >
> > While testing MIPS with LLVM, I found a weird and very rare bug with
> > MIPS relocs that LLVM emits into kernel modules. It happens on both
> > 11.0.0 and latest git snapshot and applies, as I can see, only to
> > references to static symbols.
> >
> > When the kernel loads the module, it allocates a space for every
> > section and then manually apply the relocations relative to the
> > new address.
> >
> > Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
> > It's static and referenced only in phy_register_driver(), where it's
> > used to fill callback pointer in a structure.
> >
> > The real function address after module loading is 0xc06c1444, that
> > is observed in its ELF st_value field.
> > There are two relocs related to this usage in phy_register_driver():
> >
> > R_MIPS_HI16 refers to 0x3c010000
> > R_MIPS_LO16 refers to 0x24339444
> >
> > The address of .text is 0xc06b8000. So the destination is calculated
> > as follows:
> >
> > 0x00000000 from hi16;
> > 0xffff9444 from lo16 (sign extend as it's always treated as signed);
> > 0xc06b8000 from base.
> >
> > =3D 0xc06b1444. The value is lower than the real phy_probe() address
> > (0xc06c1444) by 0x10000 and is lower than the base address of
> > module's .text, so it's 100% incorrect.
> >
> > This results in:
> >
> > [    2.204022] CPU 3 Unable to handle kernel paging request at virtual
> > address c06b1444, epc =3D=3D c06b1444, ra =3D=3D 803f1090
> >
> > The correct instructions should be:
> >
> > R_MIPS_HI16 0x3c010001
> > R_MIPS_LO16 0x24339444
> >
> > so there'll be 0x00010000 from hi16.
> >
> > I tried to catch those bugs in arch/mips/kernel/module.c (by checking
> > if the destination is lower than the base address, which should never
> > happen), and seems like I have only 3 such places in libphy.ko (and
> > one in nf_tables.ko).
> > I don't think it should be handled somehow in mentioned source code
> > as it would look rather ugly and may break kernels build with GNU
> > stack, which seems to not produce such bad codes.
> >
> > If I should report this to any other resources, please let me know.
> > I chose clang-built-linux and LKML as it may not happen with userland
> > (didn't tried to catch).
> >
> > Thanks,
> > Al
> >
>
> Hi Alexander,

Hi!

> Doubling back around to this as I was browsing through the LLVM 12.0.1
> blockers on LLVM's bug tracker and I noticed a commit that could resolve
> this? It refers to the same relocations that you reference here.
>
> https://bugs.llvm.org/show_bug.cgi?id=3D49821
>
> http://github.com/llvm/llvm-project/commit/7e83a7f1fdfcc2edde61f0a535f9d7a=
> 56f531db9

This really seems very related to the bug I encountered.
Currently I don't have a MIPS setup to try since I've moved to
another country, but I should "deploy" it again soon. So I'll
definitely take a look a bit later, thanks for pointing on this
commit!

> I think that Debian's apt.llvm.org repository should have a build
> available with that commit in it. Otherwise, building it from source is
> not too complicated with my script:
>
> https://github.com/ClangBuiltLinux/tc-build
>
> $ ./build-llvm.py --build-stage1-only --install-stage1-only --projects "cl=
> ang;lld" --targets "Mips;X86"
>
> would get you a working toolchain relatively quickly.

I could just build llvm-git from Arch Linux User Repository :) I did
that last time when was checking if the latest snapshot also suffers
from the bug, and I think it didn't take much time to build.

> Cheers,
> Nathan

Thanks,
Al

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-05-08 16:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin
2021-01-09 17:50 ` Nick Desaulniers
2021-01-09 19:15   ` Alexander Lobakin
2021-01-09 23:29     ` Alexander Lobakin
2021-01-10  1:08       ` Alexander Lobakin
2021-01-11 20:03     ` Nick Desaulniers
2021-01-11 20:50       ` Alexander Lobakin
2021-01-12 22:14         ` Nick Desaulniers
2021-01-13 10:16           ` Alexander Lobakin
2021-05-07  7:47 ` Nathan Chancellor
2021-05-08 16:29   ` Alexander Lobakin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).