* [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module
@ 2020-02-19  7:28 Vincent Chen
  2020-02-19  7:28 ` [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits Vincent Chen
  2020-02-19  7:28 ` [PATCH 2/2] riscv: Change code model of module to medany to improve data accessing Vincent Chen
  0 siblings, 2 replies; 10+ messages in thread
From: Vincent Chen @ 2020-02-19  7:28 UTC (permalink / raw)
  To: paul.walmsley, palmer; +Cc: Vincent Chen, linux-riscv, deanbo422

The compiler uses PC-relative addressing, not the GOT, to access static
variables when the code model is PIC. The access range from the
instruction to the symbol address is therefore limited to +-2GB.
Under this constraint, the kernel cannot load a kernel module that has
static per-CPU symbols declared by DEFINE_PER_CPU(). The reason is that
the kernel relocates the .data..percpu section of the module to the end
of the kernel's .data..percpu, so the distance between the per-CPU
symbols and the instructions that access them exceeds the 2GB limit.
To solve this problem, the kernel should place loaded modules in the
memory area [&_end-2G, VMALLOC_END].

Because the loaded module is located in the region [&_end-2G,VMALLOC_END]
at runtime, the distance from the module start to the end of the kernel
image does not exceed 2GB. Hence, the second patch changes the code model
of the kernel module from PIC to medany to improve the performance of data
access.
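
As an illustration only (not part of the patches), the range argument
above can be sketched in user-space C. The helper names pfn_align(),
module_area_start() and pcrel_reachable() are hypothetical, the 4 KiB
page size is an assumption, and the +-2GB reach is approximated as a
signed 32-bit offset; the real code uses PFN_ALIGN(), max() and
__vmalloc_node_range() as shown in patch 1.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2G     0x80000000ULL
#define PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Round an address up to the next page boundary, like PFN_ALIGN(). */
static uint64_t pfn_align(uint64_t addr)
{
	return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/*
 * Lower bound of the module area:
 * max(PFN_ALIGN(&_end - SZ_2G), VMALLOC_START).
 */
static uint64_t module_area_start(uint64_t kernel_end, uint64_t vmalloc_start)
{
	uint64_t lo = pfn_align(kernel_end - SZ_2G);

	return lo > vmalloc_start ? lo : vmalloc_start;
}

/*
 * True if the signed distance from pc to target fits in 32 bits,
 * which approximates the +-2GB reach of PC-relative addressing.
 */
static bool pcrel_reachable(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	return off >= INT32_MIN && off <= INT32_MAX;
}
```

With made-up addresses, _end = 0xffffffe001000000 and VMALLOC_START =
0xffffffd000000000 give a module area starting at 0xffffffdf81000000.
Code at that address can reach a symbol at 0xffffffe000f00000, whereas
code placed at VMALLOC_START cannot.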

Vincent Chen (2):
  riscv: avoid the PIC offset of static percpu data in module beyond 2G
    limits
  riscv: Replace PIC with medany to improve data accessing in module

 arch/riscv/Makefile        |  6 ++++--
 arch/riscv/kernel/module.c | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

-- 
2.7.4




* [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-19  7:28 [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module Vincent Chen
@ 2020-02-19  7:28 ` Vincent Chen
  2020-02-19 17:52   ` Alexandre Ghiti
  2020-02-19  7:28 ` [PATCH 2/2] riscv: Change code model of module to medany to improve data accessing Vincent Chen
  1 sibling, 1 reply; 10+ messages in thread
From: Vincent Chen @ 2020-02-19  7:28 UTC (permalink / raw)
  To: paul.walmsley, palmer; +Cc: Vincent Chen, linux-riscv, deanbo422

The compiler uses PC-relative addressing, not the GOT, to access static
variables when the code model is PIC. The access range from the
instruction to the symbol address is therefore limited to +-2GB.
Under this constraint, the kernel cannot load a kernel module that has
static per-CPU symbols declared by DEFINE_PER_CPU(). The reason is that
the kernel relocates the .data..percpu section of the module to the end
of the kernel's .data..percpu, so the distance between the per-CPU
symbols and the instructions that access them exceeds the 2GB limit.
To solve this problem, the kernel should place loaded modules in the
memory area [&_end-2G, VMALLOC_END].

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Suggested-by: Alex Ghiti <alex@ghiti.fr>
Suggested-by: Anup Patel <anup@brainfault.org>

---
 arch/riscv/kernel/module.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index b7401858d872..c498beb82369 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -8,6 +8,10 @@
 #include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+#include <linux/sizes.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
 
 static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
 {
@@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 
 	return 0;
 }
+#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
+#ifdef CONFIG_MAXPHYSMEM_2GB
+#define VMALLOC_MODULE_START \
+	max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
+#else
+#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
+#endif
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
+	VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
+	__builtin_return_address(0));
+}
+#endif
-- 
2.7.4




* [PATCH 2/2] riscv: Change code model of module to medany to improve data accessing
  2020-02-19  7:28 [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module Vincent Chen
  2020-02-19  7:28 ` [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits Vincent Chen
@ 2020-02-19  7:28 ` Vincent Chen
  1 sibling, 0 replies; 10+ messages in thread
From: Vincent Chen @ 2020-02-19  7:28 UTC (permalink / raw)
  To: paul.walmsley, palmer; +Cc: Vincent Chen, linux-riscv, deanbo422

All loaded modules are located in the region [&_end-2G,VMALLOC_END] at
runtime, so the distance from the module start to the end of the kernel
image does not exceed 2GB. Hence, the code model of the kernel module can
be changed to medany to improve the performance of data access.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
---
 arch/riscv/Makefile | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 33a1d7cbf775..a6abe5847e42 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -13,8 +13,10 @@ LDFLAGS_vmlinux :=
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
 	LDFLAGS_vmlinux := --no-relax
 endif
-KBUILD_AFLAGS_MODULE += -fPIC
-KBUILD_CFLAGS_MODULE += -fPIC
+
+ifeq ($(CONFIG_64BIT)$(CONFIG_CMODEL_MEDLOW),yy)
+KBUILD_CFLAGS_MODULE += -mcmodel=medany
+endif
 
 export BITS
 ifeq ($(CONFIG_ARCH_RV64I),y)
-- 
2.7.4




* Re: [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-19  7:28 ` [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits Vincent Chen
@ 2020-02-19 17:52   ` Alexandre Ghiti
  2020-02-19 22:43     ` Carlos Eduardo de Paula
  2020-02-20  2:29     ` Vincent Chen
  0 siblings, 2 replies; 10+ messages in thread
From: Alexandre Ghiti @ 2020-02-19 17:52 UTC (permalink / raw)
  To: Vincent Chen, paul.walmsley, palmer; +Cc: linux-riscv, deanbo422

Hi Vincent,

On 2/19/20 8:28 AM, Vincent Chen wrote:
> The compiler uses the PIC-relative method to access static variables
> instead of GOT when the code model is PIC. Therefore, the limitation of
> the access range from the instruction to the symbol address is +-2GB.
> Under this circumstance, the kernel cannot load a kernel module if this
> module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
> is that kernel relocates the .data..percpu section of the kernel module to
> the end of kernel's .data..percpu. Hence, the distance between the per-CPU
> symbols and the instruction will exceed the 2GB limits. To solve this
> problem, the kernel should place the loaded module in the memory area
> [&_end-2G, VMALLOC_END].
> 
> Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
> Suggested-by: Alex Ghiti <alex@ghiti.fr>
> Suggested-by: Anup Patel <anup@brainfault.org>
> 
> ---
>   arch/riscv/kernel/module.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> index b7401858d872..c498beb82369 100644
> --- a/arch/riscv/kernel/module.c
> +++ b/arch/riscv/kernel/module.c
> @@ -8,6 +8,10 @@
>   #include <linux/err.h>
>   #include <linux/errno.h>
>   #include <linux/moduleloader.h>
> +#include <linux/vmalloc.h>
> +#include <linux/sizes.h>
> +#include <asm/pgtable.h>
> +#include <asm/sections.h>
>   
>   static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
>   {
> @@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>   
>   	return 0;
>   }
> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> +#ifdef CONFIG_MAXPHYSMEM_2GB
> +#define VMALLOC_MODULE_START \
> +	max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> +#else
> +#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
> +#endif

I would use the same definition for both cases:

#define VMALLOC_MODULE_START \
	max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)

as it avoids ifdefs and amounts to the same. And maybe you can avoid the 
definition of VMALLOC_MODULE_START at the same time.

> +void *module_alloc(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> +	VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> +	__builtin_return_address(0));
> +}
> +#endif
> 

It's weird checkpatch does not complain about the alignment of those lines.

Otherwise, I have just tested it and it works, so you can add:

Tested-by: Alexandre Ghiti <alex@ghiti.fr>

Thanks,

Alex



* Re: [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-19 17:52   ` Alexandre Ghiti
@ 2020-02-19 22:43     ` Carlos Eduardo de Paula
  2020-02-20  2:31       ` Vincent Chen
  2020-02-20  2:29     ` Vincent Chen
  1 sibling, 1 reply; 10+ messages in thread
From: Carlos Eduardo de Paula @ 2020-02-19 22:43 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Vincent Chen, linux-riscv, deanbo422, Palmer Dabbelt, Paul Walmsley

On Wed, Feb 19, 2020 at 2:53 PM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Vincent,
>
> On 2/19/20 8:28 AM, Vincent Chen wrote:
> > The compiler uses the PIC-relative method to access static variables
> > instead of GOT when the code model is PIC. Therefore, the limitation of
> > the access range from the instruction to the symbol address is +-2GB.
> > Under this circumstance, the kernel cannot load a kernel module if this
> > module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
> > is that kernel relocates the .data..percpu section of the kernel module to
> > the end of kernel's .data..percpu. Hence, the distance between the per-CPU
> > symbols and the instruction will exceed the 2GB limits. To solve this
> > problem, the kernel should place the loaded module in the memory area
> > [&_end-2G, VMALLOC_END].
> >
> > Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
> > Suggested-by: Alex Ghiti <alex@ghiti.fr>
> > Suggested-by: Anup Patel <anup@brainfault.org>
> >
> > ---
> >   arch/riscv/kernel/module.c | 18 ++++++++++++++++++
> >   1 file changed, 18 insertions(+)
> >
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index b7401858d872..c498beb82369 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -8,6 +8,10 @@
> >   #include <linux/err.h>
> >   #include <linux/errno.h>
> >   #include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/sizes.h>
> > +#include <asm/pgtable.h>
> > +#include <asm/sections.h>
> >
> >   static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
> >   {
> > @@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >
> >       return 0;
> >   }
> > +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > +#ifdef CONFIG_MAXPHYSMEM_2GB
> > +#define VMALLOC_MODULE_START \
> > +     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> > +#else
> > +#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
> > +#endif
>
> I would use the same definition for both cases:
>
> #define VMALLOC_MODULE_START \
>         max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>
> as it avoids ifdefs and amounts to the same. And maybe you can avoid the
> definition of VMALLOC_MODULE_START at the same time.
>
> > +void *module_alloc(unsigned long size)
> > +{
> > +     return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> > +     VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > +     __builtin_return_address(0));
> > +}
> > +#endif
> >
>
> It's weird checkpatch does not complain about the alignment of those lines.
>
> Otherwise, I have just tested it and it works, so you can add:
>
> Tested-by: Alexandre Ghiti <alex@ghiti.fr>
>
> Thanks,
>
> Alex
>

Thanks for the patch, applied on v5.5.0 and v5.6.0-rc2. Worked fine on
Qemu and Unleashed:

root@debian10-riscv64:~# sudo modprobe openvswitch
[  124.257220] openvswitch: Open vSwitch switching datapath

root@debian10-riscv64:~# modprobe br_netfilter
[  193.168269] Bridge firewalling registered

root@debian10-riscv64:~# lsmod
Module                  Size  Used by
br_netfilter           23054  0
bridge                217063  2 br_netfilter
stp                     2891  1 bridge
llc                     5968  2 bridge,stp
openvswitch           197057  0
nsh                     3501  1 openvswitch
nf_conncount           11362  1 openvswitch
nf_nat                 39088  1 openvswitch
nf_conntrack          143270  3 nf_nat,openvswitch,nf_conncount
nf_defrag_ipv6         10091  2 nf_conntrack,openvswitch
nf_defrag_ipv4          2410  1 nf_conntrack
ip_tables              16409  0

If desired, add:

Tested-by: Carlos de Paula <me@carlosedp.com>



* Re: [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-19 17:52   ` Alexandre Ghiti
  2020-02-19 22:43     ` Carlos Eduardo de Paula
@ 2020-02-20  2:29     ` Vincent Chen
  2020-02-20  5:53       ` Alex Ghiti
  1 sibling, 1 reply; 10+ messages in thread
From: Vincent Chen @ 2020-02-20  2:29 UTC (permalink / raw)
  To: Alexandre Ghiti; +Cc: linux-riscv, deanbo422, Palmer Dabbelt, Paul Walmsley

On Thu, Feb 20, 2020 at 1:52 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Vincent,
>
> On 2/19/20 8:28 AM, Vincent Chen wrote:
> > The compiler uses the PIC-relative method to access static variables
> > instead of GOT when the code model is PIC. Therefore, the limitation of
> > the access range from the instruction to the symbol address is +-2GB.
> > Under this circumstance, the kernel cannot load a kernel module if this
> > module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
> > is that kernel relocates the .data..percpu section of the kernel module to
> > the end of kernel's .data..percpu. Hence, the distance between the per-CPU
> > symbols and the instruction will exceed the 2GB limits. To solve this
> > problem, the kernel should place the loaded module in the memory area
> > [&_end-2G, VMALLOC_END].
> >
> > Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
> > Suggested-by: Alex Ghiti <alex@ghiti.fr>
> > Suggested-by: Anup Patel <anup@brainfault.org>
> >
> > ---
> >   arch/riscv/kernel/module.c | 18 ++++++++++++++++++
> >   1 file changed, 18 insertions(+)
> >
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index b7401858d872..c498beb82369 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -8,6 +8,10 @@
> >   #include <linux/err.h>
> >   #include <linux/errno.h>
> >   #include <linux/moduleloader.h>
> > +#include <linux/vmalloc.h>
> > +#include <linux/sizes.h>
> > +#include <asm/pgtable.h>
> > +#include <asm/sections.h>
> >
> >   static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
> >   {
> > @@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >
> >       return 0;
> >   }
> > +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > +#ifdef CONFIG_MAXPHYSMEM_2GB
> > +#define VMALLOC_MODULE_START \
> > +     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> > +#else
> > +#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
> > +#endif
>
> I would use the same definition for both cases:
>
> #define VMALLOC_MODULE_START \
>         max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>
> as it avoids ifdefs and amounts to the same. And maybe you can avoid the
> definition of VMALLOC_MODULE_START at the same time.
>
Thanks for your comments. I will follow your suggestion and use the
same definition for both cases. As for the definition of
VMALLOC_MODULE_START, I would prefer to keep it, because I think it is
more readable than passing the max() expression directly to
__vmalloc_node_range(). I am afraid that I may have misunderstood what
you meant. If possible, could you give me an example? Thank you.

> > +void *module_alloc(unsigned long size)
> > +{
> > +     return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> > +     VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > +     __builtin_return_address(0));
> > +}
> > +#endif
> >
>
> It's weird checkpatch does not complain about the alignment of those lines.
>
I will modify it.
> Otherwise, I have just tested it and it works, so you can add:
>
> Tested-by: Alexandre Ghiti <alex@ghiti.fr>
>
> Thanks,
>
> Alex

Thank you for testing this patch, I will add it.



* Re: [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-19 22:43     ` Carlos Eduardo de Paula
@ 2020-02-20  2:31       ` Vincent Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Chen @ 2020-02-20  2:31 UTC (permalink / raw)
  To: Carlos Eduardo de Paula
  Cc: linux-riscv, deanbo422, Palmer Dabbelt, Alexandre Ghiti, Paul Walmsley

On Thu, Feb 20, 2020 at 6:43 AM Carlos Eduardo de Paula
<me@carlosedp.com> wrote:
>
> On Wed, Feb 19, 2020 at 2:53 PM Alexandre Ghiti <alex@ghiti.fr> wrote:
> >
> > Hi Vincent,
> >
> > On 2/19/20 8:28 AM, Vincent Chen wrote:
> > > The compiler uses the PIC-relative method to access static variables
> > > instead of GOT when the code model is PIC. Therefore, the limitation of
> > > the access range from the instruction to the symbol address is +-2GB.
> > > Under this circumstance, the kernel cannot load a kernel module if this
> > > module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
> > > is that kernel relocates the .data..percpu section of the kernel module to
> > > the end of kernel's .data..percpu. Hence, the distance between the per-CPU
> > > symbols and the instruction will exceed the 2GB limits. To solve this
> > > problem, the kernel should place the loaded module in the memory area
> > > [&_end-2G, VMALLOC_END].
> > >
> > > Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
> > > Suggested-by: Alex Ghiti <alex@ghiti.fr>
> > > Suggested-by: Anup Patel <anup@brainfault.org>
> > >
> > > ---
> > >   arch/riscv/kernel/module.c | 18 ++++++++++++++++++
> > >   1 file changed, 18 insertions(+)
> > >
> > > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > > index b7401858d872..c498beb82369 100644
> > > --- a/arch/riscv/kernel/module.c
> > > +++ b/arch/riscv/kernel/module.c
> > > @@ -8,6 +8,10 @@
> > >   #include <linux/err.h>
> > >   #include <linux/errno.h>
> > >   #include <linux/moduleloader.h>
> > > +#include <linux/vmalloc.h>
> > > +#include <linux/sizes.h>
> > > +#include <asm/pgtable.h>
> > > +#include <asm/sections.h>
> > >
> > >   static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
> > >   {
> > > @@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> > >
> > >       return 0;
> > >   }
> > > +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > > +#ifdef CONFIG_MAXPHYSMEM_2GB
> > > +#define VMALLOC_MODULE_START \
> > > +     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> > > +#else
> > > +#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
> > > +#endif
> >
> > I would use the same definition for both cases:
> >
> > #define VMALLOC_MODULE_START \
> >         max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> >
> > as it avoids ifdefs and amounts to the same. And maybe you can avoid the
> > definition of VMALLOC_MODULE_START at the same time.
> >
> > > +void *module_alloc(unsigned long size)
> > > +{
> > > +     return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> > > +     VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > > +     __builtin_return_address(0));
> > > +}
> > > +#endif
> > >
> >
> > It's weird checkpatch does not complain about the alignment of those lines.
> >
> > Otherwise, I have just tested it and it works, so you can add:
> >
> > Tested-by: Alexandre Ghiti <alex@ghiti.fr>
> >
> > Thanks,
> >
> > Alex
> >
>
> Thanks for the patch, applied on v5.5.0 and v5.6.0-rc2. Worked fine on
> Qemu and Unleashed:
>
> root@debian10-riscv64:~# sudo modprobe openvswitch
> [  124.257220] openvswitch: Open vSwitch switching datapath
>
> root@debian10-riscv64:~# modprobe br_netfilter
> [  193.168269] Bridge firewalling registered
>
> root@debian10-riscv64:~# lsmod
> Module                  Size  Used by
> br_netfilter           23054  0
> bridge                217063  2 br_netfilter
> stp                     2891  1 bridge
> llc                     5968  2 bridge,stp
> openvswitch           197057  0
> nsh                     3501  1 openvswitch
> nf_conncount           11362  1 openvswitch
> nf_nat                 39088  1 openvswitch
> nf_conntrack          143270  3 nf_nat,openvswitch,nf_conncount
> nf_defrag_ipv6         10091  2 nf_conntrack,openvswitch
> nf_defrag_ipv4          2410  1 nf_conntrack
> ip_tables              16409  0
>
> If desired, add:
>
> Tested-by: Carlos de Paula <me@carlosedp.com>

Thank you for testing this patch, I will add it.

Vincent Chen



* Re: [PATCH 1/2] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits
  2020-02-20  2:29     ` Vincent Chen
@ 2020-02-20  5:53       ` Alex Ghiti
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Ghiti @ 2020-02-20  5:53 UTC (permalink / raw)
  To: Vincent Chen; +Cc: linux-riscv, deanbo422, Palmer Dabbelt, Paul Walmsley

Hi Vincent,

On 2/19/20 9:29 PM, Vincent Chen wrote:
> On Thu, Feb 20, 2020 at 1:52 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>>
>> Hi Vincent,
>>
>> On 2/19/20 8:28 AM, Vincent Chen wrote:
>>> The compiler uses the PIC-relative method to access static variables
>>> instead of GOT when the code model is PIC. Therefore, the limitation of
>>> the access range from the instruction to the symbol address is +-2GB.
>>> Under this circumstance, the kernel cannot load a kernel module if this
>>> module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
>>> is that kernel relocates the .data..percpu section of the kernel module to
>>> the end of kernel's .data..percpu. Hence, the distance between the per-CPU
>>> symbols and the instruction will exceed the 2GB limits. To solve this
>>> problem, the kernel should place the loaded module in the memory area
>>> [&_end-2G, VMALLOC_END].
>>>
>>> Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
>>> Suggested-by: Alex Ghiti <alex@ghiti.fr>
>>> Suggested-by: Anup Patel <anup@brainfault.org>
>>>
>>> ---
>>>    arch/riscv/kernel/module.c | 18 ++++++++++++++++++
>>>    1 file changed, 18 insertions(+)
>>>
>>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>>> index b7401858d872..c498beb82369 100644
>>> --- a/arch/riscv/kernel/module.c
>>> +++ b/arch/riscv/kernel/module.c
>>> @@ -8,6 +8,10 @@
>>>    #include <linux/err.h>
>>>    #include <linux/errno.h>
>>>    #include <linux/moduleloader.h>
>>> +#include <linux/vmalloc.h>
>>> +#include <linux/sizes.h>
>>> +#include <asm/pgtable.h>
>>> +#include <asm/sections.h>
>>>
>>>    static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
>>>    {
>>> @@ -386,3 +390,17 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>>>
>>>        return 0;
>>>    }
>>> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>>> +#ifdef CONFIG_MAXPHYSMEM_2GB
>>> +#define VMALLOC_MODULE_START \
>>> +     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>> +#else
>>> +#define VMALLOC_MODULE_START PFN_ALIGN((unsigned long)&_end - SZ_2G)
>>> +#endif
>>
>> I would use the same definition for both cases:
>>
>> #define VMALLOC_MODULE_START \
>>          max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>
>> as it avoids ifdefs and amounts to the same. And maybe you can avoid the
>> definition of VMALLOC_MODULE_START at the same time.
>>
> Thanks for your comments. I will follow your suggestion to use the
> same definition for both cases. For the definition of
> VMALLOC_MODULE_START, I may prefer to keep it , because I think it may
> be more readable than directly passing the max() function to the
> __vmalloc_node_range(). I am afriad that I misunderstood what you
> meant. If possible, could you give me an example? Thank you.
> 

I meant you could get rid of the VMALLOC_MODULE_START definition if there
was only one, but I don't mind; you can keep it if you prefer.

>>> +void *module_alloc(unsigned long size)
>>> +{
>>> +     return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>>> +     VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>> +     __builtin_return_address(0));
>>> +}
>>> +#endif
>>>
>>
>> It's weird checkpatch does not complain about the alignment of those lines.
>>
> I will modify it.
>> Otherwise, I have just tested it and it works, so you can add:
>>
>> Tested-by: Alexandre Ghiti <alex@ghiti.fr>
>>
>> Thanks,
>>
>> Alex
> 
> Thank you for testing this patch, I will add it.
> 

Thanks again,

Alex



* Re: [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module
  2020-02-21  2:47 [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module Vincent Chen
@ 2020-02-27 22:30 ` Palmer Dabbelt
  0 siblings, 0 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2020-02-27 22:30 UTC (permalink / raw)
  To: vincent.chen; +Cc: vincent.chen, linux-riscv, deanbo422, Paul Walmsley

On Thu, 20 Feb 2020 18:47:53 PST (-0800), vincent.chen@sifive.com wrote:
> The compiler uses the PIC-relative method to access static variables
> instead of GOT when the code model is PIC. Therefore, the limitation of
> the access range from the instruction to the symbol address is +-2GB.
> Under this circumstance, the kernel cannot load a kernel module if this
> module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason
> is that kernel relocates the .data..percpu section of the kernel module to
> the end of kernel's .data..percpu. Hence, the distance between the per-CPU
> symbols and the instruction will exceed the 2GB limits. To solve this
> problem, the kernel should place the loaded module in the memory area
> [&_end-2G, VMALLOC_END].
>
> Because the loaded module locates in the region [&_end-2G,VMALLOC_END]
> at runtime, the distance from the module start to the end of the kernel
> image does not exceed 2GB. Hence, the second patch changes the code model
> of the kernel module from PIC to medany to improve the performance of data
> access.
>
> Changes from v1->v2
> 1. Unify the definition of VMALLOC_MODULE_START
> 2. Modify the indent
>
> Vincent Chen (2):
>   riscv: avoid the PIC offset of static percpu data in module beyond 2G
>     limits
>   riscv: Replace PIC with medany to improve data accessing in module
>
>  arch/riscv/Makefile        |  6 ++++--
>  arch/riscv/kernel/module.c | 18 ++++++++++++++++++
>  2 files changed, 22 insertions(+), 2 deletions(-)

Looking at this again, I think this is actually a good candidate for fixes.
Unless there's any opposition I'll target it for rc5.  It's on fixes.

Thanks!



* [PATCH 0/2] solve static percpu symbol issue in module and refine code model of module
@ 2020-02-21  2:47 Vincent Chen
  2020-02-27 22:30 ` Palmer Dabbelt
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Chen @ 2020-02-21  2:47 UTC (permalink / raw)
  To: paul.walmsley, palmer; +Cc: Vincent Chen, linux-riscv, deanbo422

The compiler uses PC-relative addressing, not the GOT, to access static
variables when the code model is PIC. The access range from the
instruction to the symbol address is therefore limited to +-2GB.
Under this constraint, the kernel cannot load a kernel module that has
static per-CPU symbols declared by DEFINE_PER_CPU(). The reason is that
the kernel relocates the .data..percpu section of the module to the end
of the kernel's .data..percpu, so the distance between the per-CPU
symbols and the instructions that access them exceeds the 2GB limit.
To solve this problem, the kernel should place loaded modules in the
memory area [&_end-2G, VMALLOC_END].

Because the loaded module is located in the region [&_end-2G,VMALLOC_END]
at runtime, the distance from the module start to the end of the kernel
image does not exceed 2GB. Hence, the second patch changes the code model
of the kernel module from PIC to medany to improve the performance of data
access.

Changes from v1->v2
1. Unify the definition of VMALLOC_MODULE_START
2. Fix the indentation

Vincent Chen (2):
  riscv: avoid the PIC offset of static percpu data in module beyond 2G
    limits
  riscv: Replace PIC with medany to improve data accessing in module

 arch/riscv/Makefile        |  6 ++++--
 arch/riscv/kernel/module.c | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

-- 
2.7.4




