* [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM @ 2021-01-09 17:11 Alexander Lobakin 2021-01-09 17:50 ` Nick Desaulniers 2021-05-07 7:47 ` Nathan Chancellor 0 siblings, 2 replies; 11+ messages in thread From: Alexander Lobakin @ 2021-01-09 17:11 UTC (permalink / raw) To: clang-built-linux, linux-mips Cc: Alexander Lobakin, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen, Ralf Baechle, linux-kernel, linux-arch Machine: MIPS32 R2 Big Endian (interAptiv (multi)) While testing MIPS with LLVM, I found a weird and very rare bug with MIPS relocs that LLVM emits into kernel modules. It happens on both 11.0.0 and latest git snapshot and applies, as I can see, only to references to static symbols. When the kernel loads the module, it allocates a space for every section and then manually apply the relocations relative to the new address. Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. It's static and referenced only in phy_register_driver(), where it's used to fill callback pointer in a structure. The real function address after module loading is 0xc06c1444, that is observed in its ELF st_value field. There are two relocs related to this usage in phy_register_driver(): R_MIPS_HI16 refers to 0x3c010000 R_MIPS_LO16 refers to 0x24339444 The address of .text is 0xc06b8000. So the destination is calculated as follows: 0x00000000 from hi16; 0xffff9444 from lo16 (sign extend as it's always treated as signed); 0xc06b8000 from base. = 0xc06b1444. The value is lower than the real phy_probe() address (0xc06c1444) by 0x10000 and is lower than the base address of module's .text, so it's 100% incorrect. This results in: [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual address c06b1444, epc == c06b1444, ra == 803f1090 The correct instructions should be: R_MIPS_HI16 0x3c010001 R_MIPS_LO16 0x24339444 so there'll be 0x00010000 from hi16. I tried to catch those bugs in arch/mips/kernel/module.c (by checking if the destination is lower than the base address, which should never happen), and seems like I have only 3 such places in libphy.ko (and one in nf_tables.ko). I don't think it should be handled somehow in mentioned source code as it would look rather ugly and may break kernels build with GNU stack, which seems to not produce such bad codes. If I should report this to any other resources, please let me know. I chose clang-built-linux and LKML as it may not happen with userland (didn't tried to catch). Thanks, Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin @ 2021-01-09 17:50 ` Nick Desaulniers 2021-01-09 19:15 ` Alexander Lobakin 2021-05-07 7:47 ` Nathan Chancellor 1 sibling, 1 reply; 11+ messages in thread From: Nick Desaulniers @ 2021-01-09 17:50 UTC (permalink / raw) To: Alexander Lobakin Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song, Sami Tolvanen, Ralf Baechle, LKML, linux-arch On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: > > Machine: MIPS32 R2 Big Endian (interAptiv (multi)) > > While testing MIPS with LLVM, I found a weird and very rare bug with > MIPS relocs that LLVM emits into kernel modules. It happens on both > 11.0.0 and latest git snapshot and applies, as I can see, only to > references to static symbols. > > When the kernel loads the module, it allocates a space for every > section and then manually apply the relocations relative to the > new address. > > Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. > It's static and referenced only in phy_register_driver(), where it's > used to fill callback pointer in a structure. > > The real function address after module loading is 0xc06c1444, that > is observed in its ELF st_value field. > There are two relocs related to this usage in phy_register_driver(): > > R_MIPS_HI16 refers to 0x3c010000 > R_MIPS_LO16 refers to 0x24339444 > > The address of .text is 0xc06b8000. So the destination is calculated > as follows: > > 0x00000000 from hi16; > 0xffff9444 from lo16 (sign extend as it's always treated as signed); > 0xc06b8000 from base. > > = 0xc06b1444. The value is lower than the real phy_probe() address > (0xc06c1444) by 0x10000 and is lower than the base address of > module's .text, so it's 100% incorrect. > > This results in: > > [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual > address c06b1444, epc == c06b1444, ra == 803f1090 > > The correct instructions should be: > > R_MIPS_HI16 0x3c010001 > R_MIPS_LO16 0x24339444 > > so there'll be 0x00010000 from hi16. > > I tried to catch those bugs in arch/mips/kernel/module.c (by checking > if the destination is lower than the base address, which should never > happen), and seems like I have only 3 such places in libphy.ko (and > one in nf_tables.ko). > I don't think it should be handled somehow in mentioned source code > as it would look rather ugly and may break kernels build with GNU > stack, which seems to not produce such bad codes. > > If I should report this to any other resources, please let me know. > I chose clang-built-linux and LKML as it may not happen with userland > (didn't tried to catch). Thanks for the report. Sounds like we may indeed be producing an incorrect relocation. This is only seen for big endian triples? Getting a way for us to deterministically reproduce would be a good first step. Which config or configs beyond defconfig, and which relocations specifically are you observing this with? -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 17:50 ` Nick Desaulniers @ 2021-01-09 19:15 ` Alexander Lobakin 2021-01-09 23:29 ` Alexander Lobakin 2021-01-11 20:03 ` Nick Desaulniers 0 siblings, 2 replies; 11+ messages in thread From: Alexander Lobakin @ 2021-01-09 19:15 UTC (permalink / raw) To: Nick Desaulniers Cc: Alexander Lobakin, clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song, Sami Tolvanen, Ralf Baechle, LKML, linux-arch From: Nick Desaulniers <ndesaulniers@google.com> Date: Sat, 9 Jan 2021 09:50:44 -0800 > On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: >> >> Machine: MIPS32 R2 Big Endian (interAptiv (multi)) >> >> While testing MIPS with LLVM, I found a weird and very rare bug with >> MIPS relocs that LLVM emits into kernel modules. It happens on both >> 11.0.0 and latest git snapshot and applies, as I can see, only to >> references to static symbols. >> >> When the kernel loads the module, it allocates a space for every >> section and then manually apply the relocations relative to the >> new address. >> >> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. >> It's static and referenced only in phy_register_driver(), where it's >> used to fill callback pointer in a structure. >> >> The real function address after module loading is 0xc06c1444, that >> is observed in its ELF st_value field. >> There are two relocs related to this usage in phy_register_driver(): >> >> R_MIPS_HI16 refers to 0x3c010000 >> R_MIPS_LO16 refers to 0x24339444 >> >> The address of .text is 0xc06b8000. So the destination is calculated >> as follows: >> >> 0x00000000 from hi16; >> 0xffff9444 from lo16 (sign extend as it's always treated as signed); >> 0xc06b8000 from base. >> >> = 0xc06b1444. The value is lower than the real phy_probe() address >> (0xc06c1444) by 0x10000 and is lower than the base address of >> module's .text, so it's 100% incorrect. >> >> This results in: >> >> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual >> address c06b1444, epc == c06b1444, ra == 803f1090 >> >> The correct instructions should be: >> >> R_MIPS_HI16 0x3c010001 >> R_MIPS_LO16 0x24339444 >> >> so there'll be 0x00010000 from hi16. >> >> I tried to catch those bugs in arch/mips/kernel/module.c (by checking >> if the destination is lower than the base address, which should never >> happen), and seems like I have only 3 such places in libphy.ko (and >> one in nf_tables.ko). >> I don't think it should be handled somehow in mentioned source code >> as it would look rather ugly and may break kernels build with GNU >> stack, which seems to not produce such bad codes. >> >> If I should report this to any other resources, please let me know. >> I chose clang-built-linux and LKML as it may not happen with userland >> (didn't tried to catch). > > Thanks for the report. Sounds like we may indeed be producing an > incorrect relocation. This is only seen for big endian triples? Unfortunately I don't have a LE board to play with, so can confirm only Big Endian. (BTW, if someone can say if it's possible for MIPS (and how if it is) to launch a LE kernel from BE-booted preloader and U-Boot, that would be super cool) > Getting a way for us to deterministically reproduce would be a good > first step. Which config or configs beyond defconfig, and which > relocations specifically are you observing this with? I use `make 32r2_defconfig` which combines several configs from arch/mips/configs: - generic_defconfig; - generic/32r2.config; - generic/eb.config. Aside from that, I enable a bunch of my WIP drivers and the Netfilter. On my setup, this bug is always present in libphy.ko, so CONFIG_PHYLIB=m (with all deps) should be enough. The three failed relocs belongs to this part of code: [0] llvm-readelf on them: Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:¬ [...] 00005740 00029305 R_MIPS_HI16 00000000 .text 00005744 00029306 R_MIPS_LO16 00000000 .text 00005720 00029305 R_MIPS_HI16 00000000 .text 00005748 00029306 R_MIPS_LO16 00000000 .text 0000573c 00029305 R_MIPS_HI16 00000000 .text 0000574c 00029306 R_MIPS_LO16 00000000 .text The first pair is the one from my first mail: 0x3c010000 <-- should be 0x3c010001 to work properly 0x24339444 I'm planning to hunt for more now, will let you know. [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989 > Thanks, > ~Nick Desaulniers Thanks, Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 19:15 ` Alexander Lobakin @ 2021-01-09 23:29 ` Alexander Lobakin 2021-01-10 1:08 ` Alexander Lobakin 2021-01-11 20:03 ` Nick Desaulniers 1 sibling, 1 reply; 11+ messages in thread From: Alexander Lobakin @ 2021-01-09 23:29 UTC (permalink / raw) To: Nick Desaulniers Cc: Alexander Lobakin, clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song, Sami Tolvanen, Ralf Baechle, LKML, linux-arch From: Alexander Lobakin <alobakin@pm.me> Date: Sat, 09 Jan 2021 19:15:31 +0000 > From: Nick Desaulniers <ndesaulniers@google.com> > Date: Sat, 9 Jan 2021 09:50:44 -0800 > >> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: >>> >>> Machine: MIPS32 R2 Big Endian (interAptiv (multi)) >>> >>> While testing MIPS with LLVM, I found a weird and very rare bug with >>> MIPS relocs that LLVM emits into kernel modules. It happens on both >>> 11.0.0 and latest git snapshot and applies, as I can see, only to >>> references to static symbols. >>> >>> When the kernel loads the module, it allocates a space for every >>> section and then manually apply the relocations relative to the >>> new address. >>> >>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. >>> It's static and referenced only in phy_register_driver(), where it's >>> used to fill callback pointer in a structure. >>> >>> The real function address after module loading is 0xc06c1444, that >>> is observed in its ELF st_value field. >>> There are two relocs related to this usage in phy_register_driver(): >>> >>> R_MIPS_HI16 refers to 0x3c010000 >>> R_MIPS_LO16 refers to 0x24339444 >>> >>> The address of .text is 0xc06b8000. So the destination is calculated >>> as follows: >>> >>> 0x00000000 from hi16; >>> 0xffff9444 from lo16 (sign extend as it's always treated as signed); >>> 0xc06b8000 from base. >>> >>> = 0xc06b1444. The value is lower than the real phy_probe() address >>> (0xc06c1444) by 0x10000 and is lower than the base address of >>> module's .text, so it's 100% incorrect. >>> >>> This results in: >>> >>> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual >>> address c06b1444, epc =3D=3D c06b1444, ra =3D=3D 803f1090 >>> >>> The correct instructions should be: >>> >>> R_MIPS_HI16 0x3c010001 >>> R_MIPS_LO16 0x24339444 >>> >>> so there'll be 0x00010000 from hi16. >>> >>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking >>> if the destination is lower than the base address, which should never >>> happen), and seems like I have only 3 such places in libphy.ko (and >>> one in nf_tables.ko). >>> I don't think it should be handled somehow in mentioned source code >>> as it would look rather ugly and may break kernels build with GNU >>> stack, which seems to not produce such bad codes. >>> >>> If I should report this to any other resources, please let me know. >>> I chose clang-built-linux and LKML as it may not happen with userland >>> (didn't tried to catch). >> >> Thanks for the report. Sounds like we may indeed be producing an >> incorrect relocation. This is only seen for big endian triples? > > Unfortunately I don't have a LE board to play with, so can confirm > only Big Endian. > > (BTW, if someone can say if it's possible for MIPS (and how if it is) > to launch a LE kernel from BE-booted preloader and U-Boot, that would > be super cool) > >> Getting a way for us to deterministically reproduce would be a good >> first step. Which config or configs beyond defconfig, and which >> relocations specifically are you observing this with? > > I use `make 32r2_defconfig` which combines several configs from > arch/mips/configs: > - generic_defconfig; > - generic/32r2.config; > - generic/eb.config. > > Aside from that, I enable a bunch of my WIP drivers and the > Netfilter. On my setup, this bug is always present in libphy.ko, > so CONFIG_PHYLIB=m (with all deps) should be enough. > > The three failed relocs belongs to this part of code: [0] > > llvm-readelf on them: > > Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries: > [...] > 00005740 00029305 R_MIPS_HI16 00000000 .text > 00005744 00029306 R_MIPS_LO16 00000000 .text > 00005720 00029305 R_MIPS_HI16 00000000 .text > 00005748 00029306 R_MIPS_LO16 00000000 .text > 0000573c 00029305 R_MIPS_HI16 00000000 .text > 0000574c 00029306 R_MIPS_LO16 00000000 .text > > The first pair is the one from my first mail: > 0x3c010000 <-- should be 0x3c010001 to work properly > 0x24339444 > > I'm planning to hunt for more now, will let you know. Unfortunately, R_MIPS_32 also suffers from that. And unlikely R_MIPS_{HI,LO}16, they can't be handled runtime as the values are pure random. I expanded arch/mips/kernel/module.c a bit, so it tries to find the actual symbol in .symtab after each applied relocation and print the detailed info. Here's an example from nf_defrag_ipv6 loading: [ 429.789793] nf_defrag_ipv6: final section addresses: [ 429.795409] 0xc07214fc __ksymtab_gpl [ 429.799574] 0xc0720000 .text [ 429.802902] 0xc07216b0 .data [ 429.806249] 0xc0721790 .bss [ 429.809474] 0xc0721508 __ksymtab_strings [ 429.813977] 0xc0728000 .init.text [ 429.817781] 0xc07214c0 .exit.text [ 429.821606] 0xc0721520 .rodata [ 429.825120] 0xc0721578 .rodata.str1.1 [ 429.829322] 0xc0721638 .note.Linux [ 429.833226] 0xc0721800 .gnu.linkonce.this_module [ 429.838503] 0xc0721650 .MIPS.abiflags [ 429.842702] 0xc0721668 .reginfo [ 429.846326] 0xc0721680 .note.gnu.build-id [ 429.851129] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken [ 429.860017] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken [ 429.868875] nf_defrag_ipv6: R_MIPS_32 [0x00000138]: 0xc0720000 -> 0xc0720138 is defrag6_net_exit [ 429.878706] nf_defrag_ipv6: R_MIPS_32 [0x000012c8]: 0xc0720000 -> 0xc07212c8 is nf_ct_net_init [ 429.888335] nf_defrag_ipv6: R_MIPS_32 [0x0000142c]: 0xc0720000 -> 0xc072142c is nf_ct_net_pre_exit [ 429.898367] nf_defrag_ipv6: R_MIPS_32 [0x00001440]: 0xc0720000 -> 0xc0721440 is nf_ct_net_exit [ 429.907994] nf_defrag_ipv6: R_MIPS_32 [0x00000057]: 0xc0721578 -> 0xc07215cf is broken [ 429.916872] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f297f0 -> 0x80f297f0 is proc_dointvec_jiffies [ 429.927177] nf_defrag_ipv6: R_MIPS_32 [0x00000039]: 0xc0721578 -> 0xc07215b1 is broken [ 429.936044] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax [ 429.946453] nf_defrag_ipv6: R_MIPS_32 [0x00000072]: 0xc0721578 -> 0xc07215ea is broken [ 429.955320] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax [ 429.965737] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag [ 429.975094] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag [ 429.984431] nf_defrag_ipv6: R_MIPS_32 [0x0000106c]: 0xc0720000 -> 0xc072106c is ip6frag_key_hashfn [ 429.994470] nf_defrag_ipv6: R_MIPS_32 [0x00001090]: 0xc0720000 -> 0xc0721090 is ip6frag_obj_hashfn [ 430.004486] nf_defrag_ipv6: R_MIPS_32 [0x000010b8]: 0xc0720000 -> 0xc07210b8 is ip6frag_obj_cmpfn [ 430.014425] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0720000 -> 0xc0720000 is nf_defrag_ipv6_enable [ 430.024742] nf_defrag_ipv6: R_MIPS_32 [0x00000001]: 0xc0721508 -> 0xc0721509 is __kstrtab_nf_defrag_ipv6_enable [ 430.036074] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0721508 -> 0xc0721508 is __kstrtabns_nf_defrag_ipv6_enable [ 430.047561] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0728000 -> 0xc0728000 is init_module [ 430.056930] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc07214c0 -> 0xc07214c0 is cleanup_module At least five symbols are broken and lead to nowhere: two from .data and three from .rodata. Values in square braces are initial references that can be observed via `nm -n` -- and for broken ones they really don't correspond to any symbols, mismatching the neighbours' addresses by 0x40-0x50. So for now seems like it's really an LLVM problem and there can't be any simple workaround for it in the kernel. > [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989 > >> Thanks, >> ~Nick Desaulniers > > Thanks, > Al Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 23:29 ` Alexander Lobakin @ 2021-01-10 1:08 ` Alexander Lobakin 0 siblings, 0 replies; 11+ messages in thread From: Alexander Lobakin @ 2021-01-10 1:08 UTC (permalink / raw) To: Nick Desaulniers Cc: Alexander Lobakin, clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Fangrui Song, Sami Tolvanen, Ralf Baechle, LKML, linux-arch From: Alexander Lobakin <alobakin@pm.me> Date: Sat, 09 Jan 2021 23:29:26 +0000 > From: Alexander Lobakin <alobakin@pm.me> > Date: Sat, 09 Jan 2021 19:15:31 +0000 > >> From: Nick Desaulniers <ndesaulniers@google.com> >> Date: Sat, 9 Jan 2021 09:50:44 -0800 >> >>> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: >>>> >>>> Machine: MIPS32 R2 Big Endian (interAptiv (multi)) >>>> >>>> While testing MIPS with LLVM, I found a weird and very rare bug with >>>> MIPS relocs that LLVM emits into kernel modules. It happens on both >>>> 11.0.0 and latest git snapshot and applies, as I can see, only to >>>> references to static symbols. >>>> >>>> When the kernel loads the module, it allocates a space for every >>>> section and then manually apply the relocations relative to the >>>> new address. >>>> >>>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. >>>> It's static and referenced only in phy_register_driver(), where it's >>>> used to fill callback pointer in a structure. >>>> >>>> The real function address after module loading is 0xc06c1444, that >>>> is observed in its ELF st_value field. >>>> There are two relocs related to this usage in phy_register_driver(): >>>> >>>> R_MIPS_HI16 refers to 0x3c010000 >>>> R_MIPS_LO16 refers to 0x24339444 >>>> >>>> The address of .text is 0xc06b8000. So the destination is calculated >>>> as follows: >>>> >>>> 0x00000000 from hi16; >>>> 0xffff9444 from lo16 (sign extend as it's always treated as signed); >>>> 0xc06b8000 from base. >>>> >>>> = 0xc06b1444. The value is lower than the real phy_probe() address >>>> (0xc06c1444) by 0x10000 and is lower than the base address of >>>> module's .text, so it's 100% incorrect. >>>> >>>> This results in: >>>> >>>> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual >>>> address c06b1444, epc == c06b1444, ra == 803f1090 >>>> >>>> The correct instructions should be: >>>> >>>> R_MIPS_HI16 0x3c010001 >>>> R_MIPS_LO16 0x24339444 >>>> >>>> so there'll be 0x00010000 from hi16. >>>> >>>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking >>>> if the destination is lower than the base address, which should never >>>> happen), and seems like I have only 3 such places in libphy.ko (and >>>> one in nf_tables.ko). >>>> I don't think it should be handled somehow in mentioned source code >>>> as it would look rather ugly and may break kernels build with GNU >>>> stack, which seems to not produce such bad codes. >>>> >>>> If I should report this to any other resources, please let me know. >>>> I chose clang-built-linux and LKML as it may not happen with userland >>>> (didn't tried to catch). >>> >>> Thanks for the report. Sounds like we may indeed be producing an >>> incorrect relocation. This is only seen for big endian triples? >> >> Unfortunately I don't have a LE board to play with, so can confirm >> only Big Endian. >> >> (BTW, if someone can say if it's possible for MIPS (and how if it is) >> to launch a LE kernel from BE-booted preloader and U-Boot, that would >> be super cool) >> >>> Getting a way for us to deterministically reproduce would be a good >>> first step. Which config or configs beyond defconfig, and which >>> relocations specifically are you observing this with? >> >> I use `make 32r2_defconfig` which combines several configs from >> arch/mips/configs: >> - generic_defconfig; >> - generic/32r2.config; >> - generic/eb.config. >> >> Aside from that, I enable a bunch of my WIP drivers and the >> Netfilter. On my setup, this bug is always present in libphy.ko, >> so CONFIG_PHYLIB=m (with all deps) should be enough. >> >> The three failed relocs belongs to this part of code: [0] >> >> llvm-readelf on them: >> >> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries: >> [...] >> 00005740 00029305 R_MIPS_HI16 00000000 .text >> 00005744 00029306 R_MIPS_LO16 00000000 .text >> 00005720 00029305 R_MIPS_HI16 00000000 .text >> 00005748 00029306 R_MIPS_LO16 00000000 .text >> 0000573c 00029305 R_MIPS_HI16 00000000 .text >> 0000574c 00029306 R_MIPS_LO16 00000000 .text >> >> The first pair is the one from my first mail: >> 0x3c010000 <-- should be 0x3c010001 to work properly >> 0x24339444 >> >> I'm planning to hunt for more now, will let you know. > > Unfortunately, R_MIPS_32 also suffers from that. And unlikely > R_MIPS_{HI,LO}16, they can't be handled runtime as the values > are pure random. > I expanded arch/mips/kernel/module.c a bit, so it tries to find > the actual symbol in .symtab after each applied relocation and > print the detailed info. Here's an example from nf_defrag_ipv6 > loading: > > [ 429.789793] nf_defrag_ipv6: final section addresses: > [ 429.795409] =090xc07214fc __ksymtab_gpl > [ 429.799574] =090xc0720000 .text > [ 429.802902] =090xc07216b0 .data > [ 429.806249] =090xc0721790 .bss > [ 429.809474] =090xc0721508 __ksymtab_strings > [ 429.813977] =090xc0728000 .init.text > [ 429.817781] =090xc07214c0 .exit.text > [ 429.821606] =090xc0721520 .rodata > [ 429.825120] =090xc0721578 .rodata.str1.1 > [ 429.829322] =090xc0721638 .note.Linux > [ 429.833226] =090xc0721800 .gnu.linkonce.this_module > [ 429.838503] =090xc0721650 .MIPS.abiflags > [ 429.842702] =090xc0721668 .reginfo > [ 429.846326] =090xc0721680 .note.gnu.build-id > [ 429.851129] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken > [ 429.860017] nf_defrag_ipv6: R_MIPS_32 [0x00000008]: 0xc07216b0 -> 0xc07216b8 is broken > [ 429.868875] nf_defrag_ipv6: R_MIPS_32 [0x00000138]: 0xc0720000 -> 0xc0720138 is defrag6_net_exit > [ 429.878706] nf_defrag_ipv6: R_MIPS_32 [0x000012c8]: 0xc0720000 -> 0xc07212c8 is nf_ct_net_init > [ 429.888335] nf_defrag_ipv6: R_MIPS_32 [0x0000142c]: 0xc0720000 -> 0xc072142c is nf_ct_net_pre_exit > [ 429.898367] nf_defrag_ipv6: R_MIPS_32 [0x00001440]: 0xc0720000 -> 0xc0721440 is nf_ct_net_exit > [ 429.907994] nf_defrag_ipv6: R_MIPS_32 [0x00000057]: 0xc0721578 -> 0xc07215cf is broken > [ 429.916872] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f297f0 -> 0x80f297f0 is proc_dointvec_jiffies > [ 429.927177] nf_defrag_ipv6: R_MIPS_32 [0x00000039]: 0xc0721578 -> 0xc07215b1 is broken > [ 429.936044] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax > [ 429.946453] nf_defrag_ipv6: R_MIPS_32 [0x00000072]: 0xc0721578 -> 0xc07215ea is broken > [ 429.955320] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0x80f29374 -> 0x80f29374 is proc_doulongvec_minmax > [ 429.965737] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag > [ 429.975094] nf_defrag_ipv6: R_MIPS_32 [0x000000a4]: 0xc0720000 -> 0xc07200a4 is ipv6_defrag > [ 429.984431] nf_defrag_ipv6: R_MIPS_32 [0x0000106c]: 0xc0720000 -> 0xc072106c is ip6frag_key_hashfn > [ 429.994470] nf_defrag_ipv6: R_MIPS_32 [0x00001090]: 0xc0720000 -> 0xc0721090 is ip6frag_obj_hashfn > [ 430.004486] nf_defrag_ipv6: R_MIPS_32 [0x000010b8]: 0xc0720000 -> 0xc07210b8 is ip6frag_obj_cmpfn > [ 430.014425] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0720000 -> 0xc0720000 is nf_defrag_ipv6_enable > [ 430.024742] nf_defrag_ipv6: R_MIPS_32 [0x00000001]: 0xc0721508 -> 0xc0721509 is __kstrtab_nf_defrag_ipv6_enable > [ 430.036074] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0721508 -> 0xc0721508 is __kstrtabns_nf_defrag_ipv6_enable > [ 430.047561] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc0728000 -> 0xc0728000 is init_module > [ 430.056930] nf_defrag_ipv6: R_MIPS_32 [0x00000000]: 0xc07214c0 -> 0xc07214c0 is cleanup_module > > At least five symbols are broken and lead to nowhere: two from .data > and three from .rodata. Values in square braces are initial references > that can be observed via `nm -n` -- and for broken ones they really > don't correspond to any symbols, mismatching the neighbours' addresses > by 0x40-0x50. Oops, my bad. I forgot that they can point to the middle of the struct or array or what else. Nevermind, only the problem with R_MIPS_HI16 is actual. With the "add 0x10000 if can't find the symbol" workaround I'm now able to run kernel and modules without any visible defects or issues. Full list of detected and fixed relocs on my setup: libphy: R_MIPS_HI16 [0x3c030000, 0x24639444]: .text -> 0xc06b1444 is broken libphy: R_MIPS_HI16 [0x3c030000, 0x24639444]: .text -> 0xc06c1444 is phy_probe libphy: R_MIPS_HI16 [0x3c020000, 0x2442970c]: .text -> 0xc06b170c is broken libphy: R_MIPS_HI16 [0x3c020000, 0x2442970c]: .text -> 0xc06c170c is phy_remove libphy: R_MIPS_HI16 [0x3c010000, 0x242197ac]: .text -> 0xc06b17ac is broken libphy: R_MIPS_HI16 [0x3c010000, 0x242197ac]: .text -> 0xc06c17ac is phy_shutdown nf_tables: R_MIPS_HI16 [0x3c010001, 0x24218164]: .text -> 0xc07bc164 is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x24218164]: .text -> 0xc07cc164 is nf_tables_dump_obj_done nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07bc390 is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07cc390 is nft_register_flowtable_net_hooks nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07bc390 is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x24398390]: .text -> 0xc07cc390 is nft_register_flowtable_net_hooks nf_tables: R_MIPS_HI16 [0x3c010001, 0x2421866c]: .text -> 0xc07bc66c is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x2421866c]: .text -> 0xc07cc66c is nf_tables_dump_flowtable nf_tables: R_MIPS_HI16 [0x3c010001, 0x242185b4]: .text -> 0xc07bc5b4 is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x242185b4]: .text -> 0xc07cc5b4 is nf_tables_dump_flowtable_start nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07bc1bc is broken nf_tables: R_MIPS_HI16 [0x3c010001, 0x243981bc]: .text -> 0xc07cc1bc is nft_flowtable_parse_hook nf_tables: R_MIPS_HI16 [0x3c020002, 0x24428080]: .text -> 0xc07cc080 is broken nf_tables: R_MIPS_HI16 [0x3c020002, 0x24428080]: .text -> 0xc07dc080 is nft_rbtree_gc nf_conntrack: R_MIPS_HI16 [0x3c010000, 0x24258538]: .text -> 0xc077c538 is broken nf_conntrack: R_MIPS_HI16 [0x3c010000, 0x24258538]: .text -> 0xc078c538 is nf_ct_expectation_timed_out > So for now seems like it's really an LLVM problem and there can't be > any simple workaround for it in the kernel. > >> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989 >> >>> Thanks, >>> ~Nick Desaulniers >> >> Thanks, >> Al > > Al Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 19:15 ` Alexander Lobakin 2021-01-09 23:29 ` Alexander Lobakin @ 2021-01-11 20:03 ` Nick Desaulniers 2021-01-11 20:50 ` Alexander Lobakin 1 sibling, 1 reply; 11+ messages in thread From: Nick Desaulniers @ 2021-01-11 20:03 UTC (permalink / raw) To: Alexander Lobakin, Fangrui Song Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Sami Tolvanen, Ralf Baechle, LKML, linux-arch Hi Alexander, I'm genuinely trying to reproduce/understand this report, questions below: On Sat, Jan 9, 2021 at 11:15 AM Alexander Lobakin <alobakin@pm.me> wrote: > > From: Nick Desaulniers <ndesaulniers@google.com> > Date: Sat, 9 Jan 2021 09:50:44 -0800 > > > On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: > >> > >> Machine: MIPS32 R2 Big Endian (interAptiv (multi)) > >> > >> While testing MIPS with LLVM, I found a weird and very rare bug with > >> MIPS relocs that LLVM emits into kernel modules. It happens on both > >> 11.0.0 and latest git snapshot and applies, as I can see, only to > >> references to static symbols. > >> > >> When the kernel loads the module, it allocates a space for every > >> section and then manually apply the relocations relative to the > >> new address. > >> > >> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. > >> It's static and referenced only in phy_register_driver(), where it's > >> used to fill callback pointer in a structure. > >> > >> The real function address after module loading is 0xc06c1444, that > >> is observed in its ELF st_value field. > >> There are two relocs related to this usage in phy_register_driver(): > >> > >> R_MIPS_HI16 refers to 0x3c010000 > >> R_MIPS_LO16 refers to 0x24339444 Sorry, how are these calculated? (Explicit shell commands invoked would be appreciated) I'm doing: $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 32r2_defconfig $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 modules $ llvm-nm --format=sysv drivers/net/phy/phy_device.o | grep phy_probe $ llvm-objdump -Dr --disassemble-symbols=phy_driver_register drivers/net/phy/phy_device.o $ llvm-readelf -r drivers/net/phy/phy_device.o | grep -e R_MIPS_HI16 -e R_MIPS_LO16 for some of the commands trying to verify. > >> > >> The address of .text is 0xc06b8000. So the destination is calculated > >> as follows: > >> > >> 0x00000000 from hi16; > >> 0xffff9444 from lo16 (sign extend as it's always treated as signed); > >> 0xc06b8000 from base. > >> > >> = 0xc06b1444. The value is lower than the real phy_probe() address > >> (0xc06c1444) by 0x10000 and is lower than the base address of > >> module's .text, so it's 100% incorrect. The disassembly for me produces: 399c: 3c 03 00 00 lui $3, 0 <phy_device_free> 0000399c: R_MIPS_HI16 .text ... 39a8: 24 63 3a 5c addiu $3, $3, 14940 <phy_probe> 000039a8: R_MIPS_LO16 .text I'm not really sure how to manually resolve the relocations; Fangrui do you have any tips? (I'm coincidentally reading through Linkers & Loaders currently, but only just started chpt. 4). > >> > >> This results in: > >> > >> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual > >> address c06b1444, epc == c06b1444, ra == 803f1090 > >> > >> The correct instructions should be: > >> > >> R_MIPS_HI16 0x3c010001 > >> R_MIPS_LO16 0x24339444 > >> > >> so there'll be 0x00010000 from hi16. > >> > >> I tried to catch those bugs in arch/mips/kernel/module.c (by checking > >> if the destination is lower than the base address, which should never > >> happen), and seems like I have only 3 such places in libphy.ko (and > >> one in nf_tables.ko). > >> I don't think it should be handled somehow in mentioned source code > >> as it would look rather ugly and may break kernels build with GNU > >> stack, which seems to not produce such bad codes. > >> > >> If I should report this to any other resources, please let me know. > >> I chose clang-built-linux and LKML as it may not happen with userland > >> (didn't tried to catch). > > > > Thanks for the report. Sounds like we may indeed be producing an > > incorrect relocation. This is only seen for big endian triples? > > Unfortunately I don't have a LE board to play with, so can confirm > only Big Endian. > > (BTW, if someone can say if it's possible for MIPS (and how if it is) > to launch a LE kernel from BE-booted preloader and U-Boot, that would > be super cool) > > > Getting a way for us to deterministically reproduce would be a good > > first step. Which config or configs beyond defconfig, and which > > relocations specifically are you observing this with? > > I use `make 32r2_defconfig` which combines several configs from > arch/mips/configs: > - generic_defconfig; > - generic/32r2.config; > - generic/eb.config. > > Aside from that, I enable a bunch of my WIP drivers and the > Netfilter. On my setup, this bug is always present in libphy.ko, > so CONFIG_PHYLIB=m (with all deps) should be enough. > > The three failed relocs belongs to this part of code: [0] > > llvm-readelf on them: > > Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries:¬ > [...] > 00005740 00029305 R_MIPS_HI16 00000000 .text > 00005744 00029306 R_MIPS_LO16 00000000 .text > 00005720 00029305 R_MIPS_HI16 00000000 .text > 00005748 00029306 R_MIPS_LO16 00000000 .text > 0000573c 00029305 R_MIPS_HI16 00000000 .text > 0000574c 00029306 R_MIPS_LO16 00000000 .text > > The first pair is the one from my first mail: > 0x3c010000 <-- should be 0x3c010001 to work properly > 0x24339444 > > I'm planning to hunt for more now, will let you know. > > [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989 > > > Thanks, > > ~Nick Desaulniers > > Thanks, > Al > -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-11 20:03 ` Nick Desaulniers @ 2021-01-11 20:50 ` Alexander Lobakin 2021-01-12 22:14 ` Nick Desaulniers 0 siblings, 1 reply; 11+ messages in thread From: Alexander Lobakin @ 2021-01-11 20:50 UTC (permalink / raw) To: Nick Desaulniers Cc: Alexander Lobakin, Fangrui Song, clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Sami Tolvanen, Ralf Baechle, LKML, linux-arch From: Nick Desaulniers <ndesaulniers@google.com> Date: Mon, 11 Jan 2021 12:03:29 -0800 > Hi Alexander, > I'm genuinely trying to reproduce/understand this report, questions below: Hi Nick! > On Sat, Jan 9, 2021 at 11:15 AM Alexander Lobakin <alobakin@pm.me> wrote: >> >> From: Nick Desaulniers <ndesaulniers@google.com> >> Date: Sat, 9 Jan 2021 09:50:44 -0800 >> >>> On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@pm.me> wrote: >>>> >>>> Machine: MIPS32 R2 Big Endian (interAptiv (multi)) >>>> >>>> While testing MIPS with LLVM, I found a weird and very rare bug with >>>> MIPS relocs that LLVM emits into kernel modules. It happens on both >>>> 11.0.0 and latest git snapshot and applies, as I can see, only to >>>> references to static symbols. >>>> >>>> When the kernel loads the module, it allocates a space for every >>>> section and then manually apply the relocations relative to the >>>> new address. >>>> >>>> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. >>>> It's static and referenced only in phy_register_driver(), where it's >>>> used to fill callback pointer in a structure. >>>> >>>> The real function address after module loading is 0xc06c1444, that >>>> is observed in its ELF st_value field. >>>> There are two relocs related to this usage in phy_register_driver(): >>>> >>>> R_MIPS_HI16 refers to 0x3c010000 >>>> R_MIPS_LO16 refers to 0x24339444 > > Sorry, how are these calculated? (Explicit shell commands invoked > would be appreciated) Just look at the disassembly below and you'll see the similar instructions in your module. > I'm doing: > $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 32r2_defconfig > $ ARCH=mips CROSS_COMPILE=mips-linux-gnu- make CC=clang -j71 modules > $ llvm-nm --format=sysv drivers/net/phy/phy_device.o | grep phy_probe > $ llvm-objdump -Dr --disassemble-symbols=phy_driver_register drivers/net/phy/phy_device.o > $ llvm-readelf -r drivers/net/phy/phy_device.o | grep -e R_MIPS_HI16 -e R_MIPS_LO16 > > for some of the commands trying to verify. > >>>> >>>> The address of .text is 0xc06b8000. So the destination is calculated >>>> as follows: >>>> >>>> 0x00000000 from hi16; >>>> 0xffff9444 from lo16 (sign extend as it's always treated as signed); >>>> 0xc06b8000 from base. >>>> >>>> = 0xc06b1444. The value is lower than the real phy_probe() address >>>> (0xc06c1444) by 0x10000 and is lower than the base address of >>>> module's .text, so it's 100% incorrect. > > The disassembly for me produces: > 399c: 3c 03 00 00 lui $3, 0 <phy_device_free> > 0000399c: R_MIPS_HI16 .text > ... > 39a8: 24 63 3a 5c addiu $3, $3, 14940 <phy_probe> > 000039a8: R_MIPS_LO16 .text So, in your case the values of the instructions that relocs refer are: 0x3c030000 R_MIPS_HI16 0x24633a5c R_MIPS_LO16 Mine were: 0x3c010000 0x24339444 Your second one doesn't have bit 15 set, so I think this pair won't break the code. Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have '8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12]. > I'm not really sure how to manually resolve the relocations; Fangrui > do you have any tips? (I'm coincidentally reading through Linkers & > Loaders currently, but only just started chpt. 4). > >>>> >>>> This results in: >>>> >>>> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual >>>> address c06b1444, epc == c06b1444, ra == 803f1090 >>>> >>>> The correct instructions should be: >>>> >>>> R_MIPS_HI16 0x3c010001 >>>> R_MIPS_LO16 0x24339444 >>>> >>>> so there'll be 0x00010000 from hi16. >>>> >>>> I tried to catch those bugs in arch/mips/kernel/module.c (by checking >>>> if the destination is lower than the base address, which should never >>>> happen), and seems like I have only 3 such places in libphy.ko (and >>>> one in nf_tables.ko). >>>> I don't think it should be handled somehow in mentioned source code >>>> as it would look rather ugly and may break kernels build with GNU >>>> stack, which seems to not produce such bad codes. >>>> >>>> If I should report this to any other resources, please let me know. >>>> I chose clang-built-linux and LKML as it may not happen with userland >>>> (didn't tried to catch). >>> >>> Thanks for the report. Sounds like we may indeed be producing an >>> incorrect relocation. This is only seen for big endian triples? >> >> Unfortunately I don't have a LE board to play with, so can confirm >> only Big Endian. >> >> (BTW, if someone can say if it's possible for MIPS (and how if it is) >> to launch a LE kernel from BE-booted preloader and U-Boot, that would >> be super cool) >> >>> Getting a way for us to deterministically reproduce would be a good >>> first step. Which config or configs beyond defconfig, and which >>> relocations specifically are you observing this with? >> >> I use `make 32r2_defconfig` which combines several configs from >> arch/mips/configs: >> - generic_defconfig; >> - generic/32r2.config; >> - generic/eb.config. >> >> Aside from that, I enable a bunch of my WIP drivers and the >> Netfilter. On my setup, this bug is always present in libphy.ko, >> so CONFIG_PHYLIB=m (with all deps) should be enough. >> >> The three failed relocs belongs to this part of code: [0] >> >> llvm-readelf on them: >> >> Relocation section '.rel.text' at offset 0xbf60 contains 2281 entries: >> [...] >> 00005740 00029305 R_MIPS_HI16 00000000 .text >> 00005744 00029306 R_MIPS_LO16 00000000 .text >> 00005720 00029305 R_MIPS_HI16 00000000 .text >> 00005748 00029306 R_MIPS_LO16 00000000 .text >> 0000573c 00029305 R_MIPS_HI16 00000000 .text >> 0000574c 00029306 R_MIPS_LO16 00000000 .text >> >> The first pair is the one from my first mail: >> 0x3c010000 <-- should be 0x3c010001 to work properly >> 0x24339444 >> >> I'm planning to hunt for more now, will let you know. >> >> [0] https://elixir.bootlin.com/linux/v5.11-rc2/source/drivers/net/phy/phy_device.c#L2989 >> >> > Thanks, >> > ~Nick Desaulniers >> >> Thanks, >> Al >> > > Thanks, > ~Nick Desaulniers Thanks, Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-11 20:50 ` Alexander Lobakin @ 2021-01-12 22:14 ` Nick Desaulniers 2021-01-13 10:16 ` Alexander Lobakin 0 siblings, 1 reply; 11+ messages in thread From: Nick Desaulniers @ 2021-01-12 22:14 UTC (permalink / raw) To: Alexander Lobakin Cc: Fangrui Song, clang-built-linux, linux-mips, Thomas Bogendoerfer, Ralf Baechle, LKML, linux-arch [-- Attachment #1: Type: text/plain, Size: 858 bytes --] On Mon, Jan 11, 2021 at 12:50 PM Alexander Lobakin <alobakin@pm.me> wrote: > > > The disassembly for me produces: > > 399c: 3c 03 00 00 lui $3, 0 <phy_device_free> > > 0000399c: R_MIPS_HI16 .text > > ... > > 39a8: 24 63 3a 5c addiu $3, $3, 14940 <phy_probe> > > 000039a8: R_MIPS_LO16 .text > > So, in your case the values of the instructions that relocs refer are: > > 0x3c030000 R_MIPS_HI16 > 0x24633a5c R_MIPS_LO16 > > Mine were: > > 0x3c010000 > 0x24339444 > > Your second one doesn't have bit 15 set, so I think this pair won't > break the code. > Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have > '8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12]. I don't think any of my R_MIPS_LO16 in that file have that bit set. See attached. -- Thanks, ~Nick Desaulniers [-- Attachment #2: drivers_net_phy_phy_device.objdump.txt --] [-- Type: text/plain, Size: 6145 bytes --] 0000004c: R_MIPS_LO16 kmalloc_caches 00000088: R_MIPS_LO16 .data 00000098: R_MIPS_LO16 .data 0000010c: R_MIPS_LO16 kmalloc_caches 00000130: R_MIPS_LO16 $.str 0000014c: R_MIPS_LO16 .data 0000015c: R_MIPS_LO16 .data 000001c4: R_MIPS_LO16 kmalloc_caches 00000204: R_MIPS_LO16 .data 00000214: R_MIPS_LO16 .data 00000290: R_MIPS_LO16 .data 000002a0: R_MIPS_LO16 .data 000002a4: R_MIPS_LO16 .data 00000384: R_MIPS_LO16 .data 00000394: R_MIPS_LO16 .data 00000398: R_MIPS_LO16 .data 000003ac: R_MIPS_LO16 $.str 00000474: R_MIPS_LO16 .data 00000484: R_MIPS_LO16 .data 00000488: R_MIPS_LO16 .data 0000055c: R_MIPS_LO16 kmalloc_caches 00000588: R_MIPS_LO16 mdio_bus_type 000005a4: R_MIPS_LO16 .text 000005a8: R_MIPS_LO16 .text 000005ac: R_MIPS_LO16 .text 000005b4: R_MIPS_LO16 .rodata 00000638: R_MIPS_LO16 $.str.1 0000065c: R_MIPS_LO16 $.str.2 00000664: R_MIPS_LO16 .sbss 00000684: R_MIPS_LO16 delayed_work_timer_fn 00000688: R_MIPS_LO16 phy_state_machine 00000998: R_MIPS_LO16 $.str.62 000009f8: R_MIPS_LO16 $.str.63 00000a50: R_MIPS_LO16 __stack_chk_guard 00000d84: R_MIPS_LO16 __stack_chk_guard 00000e3c: R_MIPS_LO16 $.str.3 00000e4c: R_MIPS_LO16 $.str.4 00000ea4: R_MIPS_LO16 .data 00000eb4: R_MIPS_LO16 .data 00000eb8: R_MIPS_LO16 .data 00000ed0: R_MIPS_LO16 $.str 00001164: R_MIPS_LO16 .text 000011ac: R_MIPS_LO16 $.str.19 000011c8: R_MIPS_LO16 $.str.18 000011f4: R_MIPS_LO16 genphy_c45_driver 000011fc: R_MIPS_LO16 .data 00001280: R_MIPS_LO16 .data 000013e4: R_MIPS_LO16 $.str.15 00001404: R_MIPS_LO16 $.str.16 00001444: R_MIPS_LO16 $.str.14 0000145c: R_MIPS_LO16 $.str.17 00001478: R_MIPS_LO16 $.str.64 000014c0: R_MIPS_LO16 mdio_bus_type 000014c8: R_MIPS_LO16 device_match_name 00001568: R_MIPS_LO16 $.str.5 000015f4: R_MIPS_LO16 __stack_chk_guard 00001624: R_MIPS_LO16 $.str.18 0000162c: R_MIPS_LO16 $.str.19 00001648: R_MIPS_LO16 .data 00001660: R_MIPS_LO16 .rodata 00001668: R_MIPS_LO16 .rodata 0000170c: R_MIPS_LO16 .data 0000173c: R_MIPS_LO16 genphy_c45_driver 00001794: R_MIPS_LO16 __stack_chk_guard 00001900: R_MIPS_LO16 __stack_chk_guard 00001930: R_MIPS_LO16 $.str.7 0000193c: R_MIPS_LO16 $.str.6 00001944: R_MIPS_LO16 $.str.8 0000195c: R_MIPS_LO16 $.str.9 00001984: R_MIPS_LO16 $.str.10 0000198c: R_MIPS_LO16 $.str.11 00001998: R_MIPS_LO16 $.str.12 000019b0: R_MIPS_LO16 __stack_chk_guard 00001a10: R_MIPS_LO16 __stack_chk_guard 00001a1c: R_MIPS_LO16 $.str.10 00001a24: R_MIPS_LO16 $.str.11 00001a50: R_MIPS_LO16 $.str.7 00001a5c: R_MIPS_LO16 $.str.6 00001a64: R_MIPS_LO16 $.str.8 00001a80: R_MIPS_LO16 $.str.9 00001ab0: R_MIPS_LO16 $.str.12 00001ae0: R_MIPS_LO16 $.str.13 00001b0c: R_MIPS_LO16 __stack_chk_guard 00001b5c: R_MIPS_LO16 __stack_chk_guard 00001b88: R_MIPS_LO16 $.str.7 00001b94: R_MIPS_LO16 $.str.6 00001b9c: R_MIPS_LO16 $.str.8 00001bb4: R_MIPS_LO16 $.str.9 00001bc4: R_MIPS_LO16 __stack_chk_guard 00001d60: R_MIPS_LO16 mdio_bus_type 00001d68: R_MIPS_LO16 device_match_name 00001dec: R_MIPS_LO16 $.str.5 00001e34: R_MIPS_LO16 .data 00001e90: R_MIPS_LO16 genphy_c45_driver 00001f70: R_MIPS_LO16 kmalloc_caches 000020dc: R_MIPS_LO16 .text 00002218: R_MIPS_LO16 __stack_chk_guard 00002224: R_MIPS_LO16 .rodata 00002248: R_MIPS_LO16 .rodata 000022d0: R_MIPS_LO16 __stack_chk_guard 00002658: R_MIPS_LO16 .rodata 00002870: R_MIPS_LO16 $.str.66 00002c88: R_MIPS_LO16 $.str.21 00002ca0: R_MIPS_LO16 $.str.20 00003218: R_MIPS_LO16 $.str.67 00003220: R_MIPS_LO16 $__func__.phy_poll_reset 000033fc: R_MIPS_LO16 __stack_chk_guard 00003474: R_MIPS_LO16 __stack_chk_guard 000034c0: R_MIPS_LO16 __stack_chk_guard 00003510: R_MIPS_LO16 __stack_chk_guard 00003628: R_MIPS_LO16 __stack_chk_guard 00003688: R_MIPS_LO16 __stack_chk_guard 00003778: R_MIPS_LO16 __stack_chk_guard 00003790: R_MIPS_LO16 $.str.22 000037d0: R_MIPS_LO16 $.str.23 000038c0: R_MIPS_LO16 __stack_chk_guard 00003904: R_MIPS_LO16 $.str.24 00003920: R_MIPS_LO16 $.str.25 00003938: R_MIPS_LO16 __stack_chk_guard 000039a4: R_MIPS_LO16 mdio_bus_type 000039a8: R_MIPS_LO16 .text 000039ac: R_MIPS_LO16 .text 000039b0: R_MIPS_LO16 .text 00003a18: R_MIPS_LO16 $.str.28 00003a30: R_MIPS_LO16 $.str.26 00003a50: R_MIPS_LO16 $.str.27 00003a78: R_MIPS_LO16 __stack_chk_guard 00003c84: R_MIPS_LO16 __stack_chk_guard 00003e9c: R_MIPS_LO16 $.str.31 00003ee4: R_MIPS_LO16 .rodata 00003f00: R_MIPS_LO16 $.str.33 00003f08: R_MIPS_LO16 $.str.59 00003f10: R_MIPS_LO16 $.str.34 00003f3c: R_MIPS_LO16 $.str.61 00003f78: R_MIPS_LO16 __stack_chk_guard 00004014: R_MIPS_LO16 .rodata 00004038: R_MIPS_LO16 .rodata 000040ec: R_MIPS_LO16 __stack_chk_guard 000042e8: R_MIPS_LO16 $.str.61 00000010: R_MIPS_LO16 genphy_c45_driver 0000001c: R_MIPS_LO16 .data 00000028: R_MIPS_LO16 .rodata 00000030: R_MIPS_LO16 phy_basic_t1_features 00000044: R_MIPS_LO16 phy_gbit_features 00000048: R_MIPS_LO16 phy_gbit_fibre_features 00000054: R_MIPS_LO16 phy_gbit_all_ports_features 00000060: R_MIPS_LO16 phy_10gbit_features 00000074: R_MIPS_LO16 phy_10gbit_full_features 00000084: R_MIPS_LO16 phy_basic_t1_features 00000090: R_MIPS_LO16 phy_basic_features 00000098: R_MIPS_LO16 phy_basic_features 0000009c: R_MIPS_LO16 phy_basic_t1_features 000000a4: R_MIPS_LO16 phy_gbit_features 000000a8: R_MIPS_LO16 phy_gbit_fibre_features 000000ac: R_MIPS_LO16 phy_gbit_all_ports_features 000000b0: R_MIPS_LO16 phy_10gbit_features 000000b8: R_MIPS_LO16 phy_10gbit_full_features 000000c0: R_MIPS_LO16 phy_10gbit_fec_features 000000cc: R_MIPS_LO16 phy_10gbit_fec_features 000000d4: R_MIPS_LO16 genphy_c45_driver 000000f0: R_MIPS_LO16 .data 00000110: R_MIPS_LO16 genphy_c45_driver ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-12 22:14 ` Nick Desaulniers @ 2021-01-13 10:16 ` Alexander Lobakin 0 siblings, 0 replies; 11+ messages in thread From: Alexander Lobakin @ 2021-01-13 10:16 UTC (permalink / raw) To: Nick Desaulniers Cc: Alexander Lobakin, Fangrui Song, clang-built-linux, linux-mips, Thomas Bogendoerfer, Ralf Baechle, LKML, linux-arch From: Nick Desaulniers <ndesaulniers@google.com> Date: Tue, 12 Jan 2021 14:14:58 -0800 > On Mon, Jan 11, 2021 at 12:50 PM Alexander Lobakin <alobakin@pm.me> wrote: >> >>> The disassembly for me produces: >>> 399c: 3c 03 00 00 lui $3, 0 <phy_device_free> >>> 0000399c: R_MIPS_HI16 .text >>> ... >>> 39a8: 24 63 3a 5c addiu $3, $3, 14940 <phy_probe> >>> 000039a8: R_MIPS_LO16 .text >> >> So, in your case the values of the instructions that relocs refer are: >> >> 0x3c030000 R_MIPS_HI16 >> 0x24633a5c R_MIPS_LO16 >> >> Mine were: >> >> 0x3c010000 >> 0x24339444 >> >> Your second one doesn't have bit 15 set, so I think this pair won't >> break the code. >> Try to hunt for R_MIPS_LO16 that have this bit set, i.e. they have >> '8', '9', 'a', 'b', 'c', 'd' or 'e' as their [15:12]. > > I don't think any of my R_MIPS_LO16 in that file have that bit set. > See attached. Well, seems like you got lucky here. You can try to check for other modules if any of them have R_MIPS_LO16 with bit 15 set. Also, you can try to enable: CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y CONFIG_TRIM_UNUSED_KSYMS=y to alter the build process. These options didn't change anything in terms of relocs for me though. > -- > Thanks, > ~Nick Desaulniers Thanks, Al ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin 2021-01-09 17:50 ` Nick Desaulniers @ 2021-05-07 7:47 ` Nathan Chancellor 2021-05-08 16:29 ` Alexander Lobakin 1 sibling, 1 reply; 11+ messages in thread From: Nathan Chancellor @ 2021-05-07 7:47 UTC (permalink / raw) To: Alexander Lobakin Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen, Ralf Baechle, linux-kernel, linux-arch On Sat, Jan 09, 2021 at 05:11:18PM +0000, Alexander Lobakin wrote: > Machine: MIPS32 R2 Big Endian (interAptiv (multi)) > > While testing MIPS with LLVM, I found a weird and very rare bug with > MIPS relocs that LLVM emits into kernel modules. It happens on both > 11.0.0 and latest git snapshot and applies, as I can see, only to > references to static symbols. > > When the kernel loads the module, it allocates a space for every > section and then manually apply the relocations relative to the > new address. > > Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. > It's static and referenced only in phy_register_driver(), where it's > used to fill callback pointer in a structure. > > The real function address after module loading is 0xc06c1444, that > is observed in its ELF st_value field. > There are two relocs related to this usage in phy_register_driver(): > > R_MIPS_HI16 refers to 0x3c010000 > R_MIPS_LO16 refers to 0x24339444 > > The address of .text is 0xc06b8000. So the destination is calculated > as follows: > > 0x00000000 from hi16; > 0xffff9444 from lo16 (sign extend as it's always treated as signed); > 0xc06b8000 from base. > > = 0xc06b1444. The value is lower than the real phy_probe() address > (0xc06c1444) by 0x10000 and is lower than the base address of > module's .text, so it's 100% incorrect. > > This results in: > > [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual > address c06b1444, epc == c06b1444, ra == 803f1090 > > The correct instructions should be: > > R_MIPS_HI16 0x3c010001 > R_MIPS_LO16 0x24339444 > > so there'll be 0x00010000 from hi16. > > I tried to catch those bugs in arch/mips/kernel/module.c (by checking > if the destination is lower than the base address, which should never > happen), and seems like I have only 3 such places in libphy.ko (and > one in nf_tables.ko). > I don't think it should be handled somehow in mentioned source code > as it would look rather ugly and may break kernels build with GNU > stack, which seems to not produce such bad codes. > > If I should report this to any other resources, please let me know. > I chose clang-built-linux and LKML as it may not happen with userland > (didn't tried to catch). > > Thanks, > Al > Hi Alexander, Doubling back around to this as I was browsing through the LLVM 12.0.1 blockers on LLVM's bug tracker and I noticed a commit that could resolve this? It refers to the same relocations that you reference here. https://bugs.llvm.org/show_bug.cgi?id=49821 http://github.com/llvm/llvm-project/commit/7e83a7f1fdfcc2edde61f0a535f9d7a56f531db9 I think that Debian's apt.llvm.org repository should have a build available with that commit in it. Otherwise, building it from source is not too complicated with my script: https://github.com/ClangBuiltLinux/tc-build $ ./build-llvm.py --build-stage1-only --install-stage1-only --projects "clang;lld" --targets "Mips;X86" would get you a working toolchain relatively quickly. Cheers, Nathan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM 2021-05-07 7:47 ` Nathan Chancellor @ 2021-05-08 16:29 ` Alexander Lobakin 0 siblings, 0 replies; 11+ messages in thread From: Alexander Lobakin @ 2021-05-08 16:29 UTC (permalink / raw) To: Nathan Chancellor Cc: clang-built-linux, linux-mips, Thomas Bogendoerfer, Kees Cook, Nathan Chancellor, Nick Desaulniers, Fangrui Song, Sami Tolvanen, Ralf Baechle, linux-kernel, linux-arch From: Nathan Chancellor <nathan@kernel.org> Date: Fri, 7 May 2021 00:47:14 -0700 > On Sat, Jan 09, 2021 at 05:11:18PM +0000, Alexander Lobakin wrote: > > Machine: MIPS32 R2 Big Endian (interAptiv (multi)) > > > > While testing MIPS with LLVM, I found a weird and very rare bug with > > MIPS relocs that LLVM emits into kernel modules. It happens on both > > 11.0.0 and latest git snapshot and applies, as I can see, only to > > references to static symbols. > > > > When the kernel loads the module, it allocates a space for every > > section and then manually apply the relocations relative to the > > new address. > > > > Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko. > > It's static and referenced only in phy_register_driver(), where it's > > used to fill callback pointer in a structure. > > > > The real function address after module loading is 0xc06c1444, that > > is observed in its ELF st_value field. > > There are two relocs related to this usage in phy_register_driver(): > > > > R_MIPS_HI16 refers to 0x3c010000 > > R_MIPS_LO16 refers to 0x24339444 > > > > The address of .text is 0xc06b8000. So the destination is calculated > > as follows: > > > > 0x00000000 from hi16; > > 0xffff9444 from lo16 (sign extend as it's always treated as signed); > > 0xc06b8000 from base. > > > > =3D 0xc06b1444. The value is lower than the real phy_probe() address > > (0xc06c1444) by 0x10000 and is lower than the base address of > > module's .text, so it's 100% incorrect. > > > > This results in: > > > > [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual > > address c06b1444, epc =3D=3D c06b1444, ra =3D=3D 803f1090 > > > > The correct instructions should be: > > > > R_MIPS_HI16 0x3c010001 > > R_MIPS_LO16 0x24339444 > > > > so there'll be 0x00010000 from hi16. > > > > I tried to catch those bugs in arch/mips/kernel/module.c (by checking > > if the destination is lower than the base address, which should never > > happen), and seems like I have only 3 such places in libphy.ko (and > > one in nf_tables.ko). > > I don't think it should be handled somehow in mentioned source code > > as it would look rather ugly and may break kernels build with GNU > > stack, which seems to not produce such bad codes. > > > > If I should report this to any other resources, please let me know. > > I chose clang-built-linux and LKML as it may not happen with userland > > (didn't tried to catch). > > > > Thanks, > > Al > > > > Hi Alexander, Hi! > Doubling back around to this as I was browsing through the LLVM 12.0.1 > blockers on LLVM's bug tracker and I noticed a commit that could resolve > this? It refers to the same relocations that you reference here. > > https://bugs.llvm.org/show_bug.cgi?id=3D49821 > > http://github.com/llvm/llvm-project/commit/7e83a7f1fdfcc2edde61f0a535f9d7a= > 56f531db9 This really seems very related to the bug I encountered. Currently I don't have a MIPS setup to try since I've moved to another country, but I should "deploy" it again soon. So I'll definitely take a look a bit later, thanks for pointing on this commit! > I think that Debian's apt.llvm.org repository should have a build > available with that commit in it. Otherwise, building it from source is > not too complicated with my script: > > https://github.com/ClangBuiltLinux/tc-build > > $ ./build-llvm.py --build-stage1-only --install-stage1-only --projects "cl= > ang;lld" --targets "Mips;X86" > > would get you a working toolchain relatively quickly. I could just build llvm-git from Arch Linux User Repository :) I did that last time when was checking if the latest snapshot also suffers from the bug, and I think it didn't take much time to build. > Cheers, > Nathan Thanks, Al ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-05-08 16:30 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-01-09 17:11 [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM Alexander Lobakin 2021-01-09 17:50 ` Nick Desaulniers 2021-01-09 19:15 ` Alexander Lobakin 2021-01-09 23:29 ` Alexander Lobakin 2021-01-10 1:08 ` Alexander Lobakin 2021-01-11 20:03 ` Nick Desaulniers 2021-01-11 20:50 ` Alexander Lobakin 2021-01-12 22:14 ` Nick Desaulniers 2021-01-13 10:16 ` Alexander Lobakin 2021-05-07 7:47 ` Nathan Chancellor 2021-05-08 16:29 ` Alexander Lobakin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).