* [PATCH v1 0/4] Kill the time spent in patch_instruction()
@ 2022-03-22 15:40 ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev

This series reduces by 70% the time required to activate
ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.

The measurement is performed in ftrace_replace_code() using mftb()
around the loop.
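
For reference, the measurement was a simple timebase read around the
existing loop, along the lines of the sketch below (a local debug hack,
not part of the series; the variable names and the pr_info() are
illustrative only):

	/* rough benchmark inside ftrace_replace_code() */
	unsigned long start, end;

	start = mftb();

	/* ... existing loop patching every ftrace call site ... */

	end = mftb();
	pr_info("ftrace patching took %lu TB ticks\n", end - start);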

With the series,
- Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
- With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.

Before this series,
- Without CONFIG_STRICT_KERNEL_RWX, 427000 TB ticks are measured.
- With CONFIG_STRICT_KERNEL_RWX, 1744000 TB ticks are measured.

Before the series, CONFIG_STRICT_KERNEL_RWX multiplies the time
required for ftrace activation by more than a factor of 4
(1744000 vs 427000 TB ticks).

With the series, CONFIG_STRICT_KERNEL_RWX increases the time
required for ftrace activation by only about 30%
(546000 vs 416000 TB ticks).

Christophe Leroy (4):
  powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without
    CONFIG_MODULES
  powerpc/code-patching: Speed up page mapping/unmapping
  powerpc/code-patching: Use jump_label for testing freed initmem
  powerpc/code-patching: Use jump_label to check if poking_init() is
    done

 arch/powerpc/include/asm/code-patching.h |  2 ++
 arch/powerpc/lib/code-patching.c         | 37 +++++++++++++++---------
 arch/powerpc/mm/mem.c                    |  2 ++
 3 files changed, 28 insertions(+), 13 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v1 1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-03-22 15:40   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev

If CONFIG_MODULES is not set, there is no point in checking
whether the text is in the module area.

This reduces the time needed to activate/deactivate ftrace
by more than 10% on an 8xx.
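
As background, IS_ENABLED() expands to a compile-time constant, so with
CONFIG_MODULES unset the condition folds to 0 and the compiler drops
the is_vmalloc_or_module_addr() call altogether; what is left is
effectively (a sketch of the folded code, not literal source):

	/*
	 * Effective code when CONFIG_MODULES is not set: the vmalloc
	 * branch and the function call are eliminated at build time.
	 */
	pfn = __pa_symbol(addr) >> PAGE_SHIFT;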

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 00c68e7fb11e..f970f189875b 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -97,7 +97,7 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
 {
 	unsigned long pfn;
 
-	if (is_vmalloc_or_module_addr(addr))
+	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
 		pfn = vmalloc_to_pfn(addr);
 	else
 		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 2/4] powerpc/code-patching: Speed up page mapping/unmapping
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-03-22 15:40   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev

Since commit 591b4b268435 ("powerpc/code-patching: Pre-map patch area")
the patch area is premapped so intermediate page tables are already
allocated.

Use __set_pte_at() directly instead of the heavy map_kernel_page(),
and for unmapping just do a pte_clear() followed by a flush.

__set_pte_at() can be used directly without the filters in
set_pte_at() because we are mapping a normal, non-executable page.

Make sure gcc knows text_poke_area is page aligned in order to
optimise the flush.

This change reduces by 66% the time needed to activate ftrace on
an 8xx (588000 TB ticks instead of 1744000).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f970f189875b..62692c6031bc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -90,17 +90,20 @@ void __init poking_init(void)
 		text_area_cpu_down));
 }
 
+static unsigned long get_patch_pfn(void *addr)
+{
+	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
+		return vmalloc_to_pfn(addr);
+	else
+		return __pa_symbol(addr) >> PAGE_SHIFT;
+}
+
 /*
  * This can be called for kernel text or a module.
  */
 static int map_patch_area(void *addr, unsigned long text_poke_addr)
 {
-	unsigned long pfn;
-
-	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
-		pfn = vmalloc_to_pfn(addr);
-	else
-		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
+	unsigned long pfn = get_patch_pfn(addr);
 
 	return map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
 }
@@ -145,17 +148,19 @@ static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	int err;
 	u32 *patch_addr;
 	unsigned long text_poke_addr;
+	pte_t *pte;
+	unsigned long pfn = get_patch_pfn(addr);
 
-	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
+	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr & PAGE_MASK;
 	patch_addr = (u32 *)(text_poke_addr + offset_in_page(addr));
 
-	err = map_patch_area(addr, text_poke_addr);
-	if (err)
-		return err;
+	pte = virt_to_kpte(text_poke_addr);
+	__set_pte_at(&init_mm, text_poke_addr, pte, pfn_pte(pfn, PAGE_KERNEL), 0);
 
 	err = __patch_instruction(addr, instr, patch_addr);
 
-	unmap_patch_area(text_poke_addr);
+	pte_clear(&init_mm, text_poke_addr, pte);
+	flush_tlb_kernel_range(text_poke_addr, text_poke_addr + PAGE_SIZE);
 
 	return err;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-03-22 15:40   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev

Once init is done, initmem is freed forever, so there is no need to
test system_state at every call to patch_instruction().

Use a jump_label instead.

This reduces by 2% the time needed to activate ftrace on an 8xx.
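
For background, a static key turns the check into a branch that is
patched in place when the key is flipped, instead of a load and compare
of system_state on every call. The general pattern, mirroring the hunks
below, is (a generic sketch; my_key and the helpers are placeholders):

	DEFINE_STATIC_KEY_FALSE(my_key);

	void consumer(void)
	{
		if (static_branch_likely(&my_key))	/* patched branch, no load */
			handle_now_common_case();
	}

	void producer(void)
	{
		static_branch_enable(&my_key);	/* rewrites every use site, once */
	}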

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/code-patching.h | 2 ++
 arch/powerpc/lib/code-patching.c         | 5 ++++-
 arch/powerpc/mm/mem.c                    | 2 ++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 409483b2d0ce..bccc3a538b9f 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -22,6 +22,8 @@
 #define BRANCH_SET_LINK	0x1
 #define BRANCH_ABSOLUTE	0x2
 
+DECLARE_STATIC_KEY_FALSE(init_mem_is_free);
+
 bool is_offset_in_branch_range(long offset);
 bool is_offset_in_cond_branch_range(long offset);
 int create_branch(ppc_inst_t *instr, const u32 *addr,
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 62692c6031bc..ab434c3853c9 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -8,6 +8,7 @@
 #include <linux/init.h>
 #include <linux/cpuhotplug.h>
 #include <linux/uaccess.h>
+#include <linux/jump_label.h>
 
 #include <asm/tlbflush.h>
 #include <asm/page.h>
@@ -193,10 +194,12 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 
 #endif /* CONFIG_STRICT_KERNEL_RWX */
 
+__ro_after_init DEFINE_STATIC_KEY_FALSE(init_mem_is_free);
+
 int patch_instruction(u32 *addr, ppc_inst_t instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (system_state >= SYSTEM_FREEING_INITMEM && init_section_contains(addr, 4))
+	if (static_branch_likely(&init_mem_is_free) && init_section_contains(addr, 4))
 		return 0;
 
 	return do_patch_instruction(addr, instr);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8e301cd8925b..9710d4e0bf08 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -22,6 +22,7 @@
 #include <asm/kasan.h>
 #include <asm/svm.h>
 #include <asm/mmzone.h>
+#include <asm/code-patching.h>
 
 #include <mm/mmu_decl.h>
 
@@ -311,6 +312,7 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
+	static_branch_enable(&init_mem_is_free);
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-03-22 15:40   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev

It is only during early startup that poking_init() has not yet run,
for instance when ftrace_init() is called.

Once poking_init() has been called there must be a poking area, so
there is no need to check for it every time patch_instruction() is
called.

This change reduces ftrace activation time by 7% on an 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index ab434c3853c9..8bd74bbe8b8d 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -79,6 +79,8 @@ static int text_area_cpu_down(unsigned int cpu)
 	return 0;
 }
 
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(poking_init_done);
+
 /*
  * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
  * we judge it as being preferable to a kernel that will crash later when
@@ -89,6 +91,7 @@ void __init poking_init(void)
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
 		text_area_cpu_down));
+	static_branch_enable(&poking_init_done);
 }
 
 static unsigned long get_patch_pfn(void *addr)
@@ -176,7 +179,7 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	 * when text_poke_area is not ready, but we still need
 	 * to allow patching. We just do the plain old patching
 	 */
-	if (!this_cpu_read(text_poke_area))
+	if (!static_branch_likely(&poking_init_done))
 		return raw_patch_instruction(addr, instr);
 
 	local_irq_save(flags);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-03-22 15:40 ` Christophe Leroy
                   ` (6 preceding siblings ...)
@ 2022-05-15 10:28 ` Michael Ellerman
  2022-05-17  6:44   ` Christophe Leroy
  -1 siblings, 1 reply; 47+ messages in thread
From: Michael Ellerman @ 2022-05-15 10:28 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Paul Mackerras,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel

On Tue, 22 Mar 2022 16:40:17 +0100, Christophe Leroy wrote:
> This series reduces by 70% the time required to activate
> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
> 
> Measure is performed in function ftrace_replace_code() using mftb()
> around the loop.
> 
> With the series,
> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
> 
> [...]

Patches 1, 3 and 4 applied to powerpc/next.

[1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
      https://git.kernel.org/powerpc/c/cb3ac45214c03852430979a43180371a44b74596
[3/4] powerpc/code-patching: Use jump_label for testing freed initmem
      https://git.kernel.org/powerpc/c/b033767848c4115e486b1a51946de3bee2ac0fa6
[4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
      https://git.kernel.org/powerpc/c/1751289268ef959db68b0b6f798d904d6403309a

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-05-15 10:28 ` [PATCH v1 0/4] Kill the time spent in patch_instruction() Michael Ellerman
@ 2022-05-17  6:44   ` Christophe Leroy
  2022-05-17 12:37     ` Michael Ellerman
  0 siblings, 1 reply; 47+ messages in thread
From: Christophe Leroy @ 2022-05-17  6:44 UTC (permalink / raw)
  To: Michael Ellerman, Michael Ellerman, Paul Mackerras,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel



On 15/05/2022 at 12:28, Michael Ellerman wrote:
> On Tue, 22 Mar 2022 16:40:17 +0100, Christophe Leroy wrote:
>> This series reduces by 70% the time required to activate
>> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
>>
>> Measure is performed in function ftrace_replace_code() using mftb()
>> around the loop.
>>
>> With the series,
>> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
>> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
>>
>> [...]
> 
> Patches 1, 3 and 4 applied to powerpc/next.
> 
> [1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
>        https://git.kernel.org/powerpc/c/cb3ac45214c03852430979a43180371a44b74596
> [3/4] powerpc/code-patching: Use jump_label for testing freed initmem
>        https://git.kernel.org/powerpc/c/b033767848c4115e486b1a51946de3bee2ac0fa6
> [4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
>        https://git.kernel.org/powerpc/c/1751289268ef959db68b0b6f798d904d6403309a
> 

Patch 2 was the keystone of this series. What happened to it?

Christophe

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-05-17  6:44   ` Christophe Leroy
@ 2022-05-17 12:37     ` Michael Ellerman
  2022-05-31  6:24       ` Christophe Leroy
  0 siblings, 1 reply; 47+ messages in thread
From: Michael Ellerman @ 2022-05-17 12:37 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Paul Mackerras,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 15/05/2022 at 12:28, Michael Ellerman wrote:
>> On Tue, 22 Mar 2022 16:40:17 +0100, Christophe Leroy wrote:
>>> This series reduces by 70% the time required to activate
>>> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
>>>
>>> Measure is performed in function ftrace_replace_code() using mftb()
>>> around the loop.
>>>
>>> With the series,
>>> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
>>> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
>>>
>>> [...]
>> 
>> Patches 1, 3 and 4 applied to powerpc/next.
>> 
>> [1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
>>        https://git.kernel.org/powerpc/c/cb3ac45214c03852430979a43180371a44b74596
>> [3/4] powerpc/code-patching: Use jump_label for testing freed initmem
>>        https://git.kernel.org/powerpc/c/b033767848c4115e486b1a51946de3bee2ac0fa6
>> [4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
>>        https://git.kernel.org/powerpc/c/1751289268ef959db68b0b6f798d904d6403309a
>> 
>
> Patch 2 was the keystone of this series. What happened to it ?

It broke on 64-bit. I think I know why but I haven't had time to test
it. Will try and get it fixed in the next day or two.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
  2022-03-22 15:40   ` Christophe Leroy
@ 2022-05-19  2:17     ` Guenter Roeck
  -1 siblings, 0 replies; 47+ messages in thread
From: Guenter Roeck @ 2022-05-19  2:17 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	linux-kernel, linuxppc-dev

On Tue, Mar 22, 2022 at 04:40:20PM +0100, Christophe Leroy wrote:
> Once init is done, initmem is freed forever so no need to
> test system_state at every call to patch_instruction().
> 
> Use jump_label.
> 
> This reduces by 2% the time needed to activate ftrace on an 8xx.
> 

It also causes the qemu mpc8544ds emulation to crash.

BUG: Unable to handle kernel data access on write at 0xc122eb34
Faulting instruction address: 0xc001b580
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MPC8544 DS
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0-rc7-next-20220518 #1
NIP:  c001b580 LR: c001b560 CTR: 00000003
REGS: c5107dd0 TRAP: 0300   Not tainted  (5.18.0-rc7-next-20220518)
MSR:  00009000 <EE,ME>  CR: 24000882  XER: 00000000
DEAR: c122eb34 ESR: 00800000
GPR00: c001b560 c5107ec0 c5120020 10000000 00000000 00000078 0c000000 cfffffff
GPR08: c001e9ec 00000001 00000007 00000000 44000882 00000000 c0005178 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c1230000
NIP [c001b580] free_initmem+0x48/0xa8
LR [c001b560] free_initmem+0x28/0xa8
Call Trace:
[c5107ec0] [c001b560] free_initmem+0x28/0xa8 (unreliable)
[c5107ee0] [c00051b0] kernel_init+0x38/0x150
[c5107f00] [c001626c] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
3fe0c123 912a00dc 90010024 48000665 3d20c218 8929fa65 2c090000 41820058
813feb34 2c090000 4082003c 39200001 <913feb34> 80010024 3cc0c114 83e1001c

Reverting this patch fixes the problem.

Guenter

---
# bad: [736ee37e2e8eed7fe48d0a37ee5a709514d478b3] Add linux-next specific files for 20220518
# good: [42226c989789d8da4af1de0c31070c96726d990c] Linux 5.18-rc7
git bisect start 'HEAD' 'v5.18-rc7'
# bad: [555b5fa93f08980ccb6bc8e196226046fe047901] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 555b5fa93f08980ccb6bc8e196226046fe047901
# bad: [8f5ef5e622d3f217d6542779723566099f370c31] Merge branch 'for-next' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git
git bisect bad 8f5ef5e622d3f217d6542779723566099f370c31
# good: [2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb] soc: document merges
git bisect good 2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb
# good: [4964f9250fbf76cb0b9c1124d5b9ab65de9bfd0e] Merge branch 'clk-next' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git
git bisect good 4964f9250fbf76cb0b9c1124d5b9ab65de9bfd0e
# bad: [18fae10a22071ccd0a2c44df2749ff482132774e] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
git bisect bad 18fae10a22071ccd0a2c44df2749ff482132774e
# bad: [b4a5aaaa51e4ab7f03eec509d3710d50e52e87a6] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git
git bisect bad b4a5aaaa51e4ab7f03eec509d3710d50e52e87a6
# bad: [b6b1c3ce06ca438eb24e0f45bf0e63ecad0369f5] powerpc/rtas: Keep MSR[RI] set when calling RTAS
git bisect bad b6b1c3ce06ca438eb24e0f45bf0e63ecad0369f5
# good: [87ccc6684d3b57e3073f77cf28396b3037154193] powerpc/book3e: Fix sparse report in mm/nohash/fsl_book3e.c
git bisect good 87ccc6684d3b57e3073f77cf28396b3037154193
# good: [f31c618373f2051a32e30002d8eacad7bbbd3885] powerpc: Sort and de-dup primary opcodes in ppc-opcode.h
git bisect good f31c618373f2051a32e30002d8eacad7bbbd3885
# good: [9290c379d19774d8de6e2b895d756004dbad9ce5] powerpc/8xx: Simplify flush_tlb_kernel_range()
git bisect good 9290c379d19774d8de6e2b895d756004dbad9ce5
# bad: [d8d2af70b98109418bb16ff6638d7c1c4336f7fe] cxl/ocxl: Prepare cleanup of powerpc's asm/prom.h
git bisect bad d8d2af70b98109418bb16ff6638d7c1c4336f7fe
# bad: [b033767848c4115e486b1a51946de3bee2ac0fa6] powerpc/code-patching: Use jump_label for testing freed initmem
git bisect bad b033767848c4115e486b1a51946de3bee2ac0fa6
# good: [cb3ac45214c03852430979a43180371a44b74596] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
git bisect good cb3ac45214c03852430979a43180371a44b74596
# first bad commit: [b033767848c4115e486b1a51946de3bee2ac0fa6] powerpc/code-patching: Use jump_label for testing freed initmem

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
  2022-05-19  2:17     ` Guenter Roeck
@ 2022-05-19  6:27       ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-05-19  6:27 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	linux-kernel, linuxppc-dev



On 19/05/2022 at 04:17, Guenter Roeck wrote:
> On Tue, Mar 22, 2022 at 04:40:20PM +0100, Christophe Leroy wrote:
>> Once init is done, initmem is freed forever so no need to
>> test system_state at every call to patch_instruction().
>>
>> Use jump_label.
>>
>> This reduces by 2% the time needed to activate ftrace on an 8xx.
>>
> 
> It also causes the qemu mpc8544ds emulation to crash.
> 
> BUG: Unable to handle kernel data access on write at 0xc122eb34
> Faulting instruction address: 0xc001b580
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MPC8544 DS
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0-rc7-next-20220518 #1
> NIP:  c001b580 LR: c001b560 CTR: 00000003
> REGS: c5107dd0 TRAP: 0300   Not tainted  (5.18.0-rc7-next-20220518)
> MSR:  00009000 <EE,ME>  CR: 24000882  XER: 00000000
> DEAR: c122eb34 ESR: 00800000
> GPR00: c001b560 c5107ec0 c5120020 10000000 00000000 00000078 0c000000 cfffffff
> GPR08: c001e9ec 00000001 00000007 00000000 44000882 00000000 c0005178 00000000
> GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c1230000
> NIP [c001b580] free_initmem+0x48/0xa8
> LR [c001b560] free_initmem+0x28/0xa8
> Call Trace:
> [c5107ec0] [c001b560] free_initmem+0x28/0xa8 (unreliable)
> [c5107ee0] [c00051b0] kernel_init+0x38/0x150
> [c5107f00] [c001626c] ret_from_kernel_thread+0x5c/0x64
> Instruction dump:
> 3fe0c123 912a00dc 90010024 48000665 3d20c218 8929fa65 2c090000 41820058
> 813feb34 2c090000 4082003c 39200001 <913feb34> 80010024 3cc0c114 83e1001c
> 
> Reverting this patch fixes the problem.
> 

That's strange.

I was able to reproduce the problem.

Removing the __ro_after_init in front of 
DEFINE_STATIC_KEY_FALSE(init_mem_is_free) fixes the problem.

I can't understand why, since mark_readonly() is called after free_initmem().

Christophe

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
  2022-05-19  6:27       ` Christophe Leroy
@ 2022-05-19  6:53         ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-05-19  6:53 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel



On 19/05/2022 at 08:27, Christophe Leroy wrote:
> 
> 
> On 19/05/2022 at 04:17, Guenter Roeck wrote:
>> On Tue, Mar 22, 2022 at 04:40:20PM +0100, Christophe Leroy wrote:
>>> Once init is done, initmem is freed forever so no need to
>>> test system_state at every call to patch_instruction().
>>>
>>> Use jump_label.
>>>
>>> This reduces by 2% the time needed to activate ftrace on an 8xx.
>>>
>>
>> It also causes the qemu mpc8544ds emulation to crash.
>>
>> BUG: Unable to handle kernel data access on write at 0xc122eb34
>> Faulting instruction address: 0xc001b580
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> BE PAGE_SIZE=4K MPC8544 DS
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0-rc7-next-20220518 #1
>> NIP:  c001b580 LR: c001b560 CTR: 00000003
>> REGS: c5107dd0 TRAP: 0300   Not tainted  (5.18.0-rc7-next-20220518)
>> MSR:  00009000 <EE,ME>  CR: 24000882  XER: 00000000
>> DEAR: c122eb34 ESR: 00800000
>> GPR00: c001b560 c5107ec0 c5120020 10000000 00000000 00000078 0c000000 cfffffff
>> GPR08: c001e9ec 00000001 00000007 00000000 44000882 00000000 c0005178 00000000
>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c1230000
>> NIP [c001b580] free_initmem+0x48/0xa8
>> LR [c001b560] free_initmem+0x28/0xa8
>> Call Trace:
>> [c5107ec0] [c001b560] free_initmem+0x28/0xa8 (unreliable)
>> [c5107ee0] [c00051b0] kernel_init+0x38/0x150
>> [c5107f00] [c001626c] ret_from_kernel_thread+0x5c/0x64
>> Instruction dump:
>> 3fe0c123 912a00dc 90010024 48000665 3d20c218 8929fa65 2c090000 41820058
>> 813feb34 2c090000 4082003c 39200001 <913feb34> 80010024 3cc0c114 83e1001c
>>
>> Reverting this patch fixes the problem.
>>
> 
> That's strange.
> 
> I was able to reproduce the problem.
> 
> Removing the __ro_after_init in front of
> DEFINE_STATIC_KEY_FALSE(init_mem_is_free) fixes the problem.
> 
> I can't understand why, mark_readonly() is called after free_initmem().
> 

Moving static_branch_enable(&init_mem_is_free) before mark_initmem_nx() 
also solves the problem.

There must be something wrong with mark_initmem_nx().
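
For clarity, the reordering described above amounts to the following in
free_initmem() (a sketch based on the hunk in patch 3, not a posted
patch):

	void free_initmem(void)
	{
		ppc_md.progress = ppc_printk_progress;
		/* flip the key before mark_initmem_nx(), as described above */
		static_branch_enable(&init_mem_is_free);
		mark_initmem_nx();
		free_initmem_default(POISON_FREE_INITMEM);
	}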

Christophe

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
  2022-05-19  6:53         ` Christophe Leroy
@ 2022-05-19 17:27           ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-05-19 17:27 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel



On 19/05/2022 at 08:53, Christophe Leroy wrote:
> 
> 
> On 19/05/2022 at 08:27, Christophe Leroy wrote:
>>
>>
>> On 19/05/2022 at 04:17, Guenter Roeck wrote:
>>> On Tue, Mar 22, 2022 at 04:40:20PM +0100, Christophe Leroy wrote:
>>>> Once init is done, initmem is freed forever so no need to
>>>> test system_state at every call to patch_instruction().
>>>>
>>>> Use jump_label.
>>>>
>>>> This reduces by 2% the time needed to activate ftrace on an 8xx.
>>>>
>>>
>>> It also causes the qemu mpc8544ds emulation to crash.
>>>
>>> BUG: Unable to handle kernel data access on write at 0xc122eb34
>>> Faulting instruction address: 0xc001b580
>>> Oops: Kernel access of bad area, sig: 11 [#1]
>>> BE PAGE_SIZE=4K MPC8544 DS
>>> Modules linked in:
>>> CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0-rc7-next-20220518 #1
>>> NIP:  c001b580 LR: c001b560 CTR: 00000003
>>> REGS: c5107dd0 TRAP: 0300   Not tainted  (5.18.0-rc7-next-20220518)
>>> MSR:  00009000 <EE,ME>  CR: 24000882  XER: 00000000
>>> DEAR: c122eb34 ESR: 00800000
>>> GPR00: c001b560 c5107ec0 c5120020 10000000 00000000 00000078 0c000000 
>>> cfffffff
>>> GPR08: c001e9ec 00000001 00000007 00000000 44000882 00000000 c0005178 
>>> 00000000
>>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
>>> 00000000
>>> GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
>>> c1230000
>>> NIP [c001b580] free_initmem+0x48/0xa8
>>> LR [c001b560] free_initmem+0x28/0xa8
>>> Call Trace:
>>> [c5107ec0] [c001b560] free_initmem+0x28/0xa8 (unreliable)
>>> [c5107ee0] [c00051b0] kernel_init+0x38/0x150
>>> [c5107f00] [c001626c] ret_from_kernel_thread+0x5c/0x64
>>> Instruction dump:
>>> 3fe0c123 912a00dc 90010024 48000665 3d20c218 8929fa65 2c090000 41820058
>>> 813feb34 2c090000 4082003c 39200001 <913feb34> 80010024 3cc0c114 
>>> 83e1001c
>>>
>>> Reverting this patch fixes the problem.
>>>
>>
>> That's strange.
>>
>> I was able to reproduce the problem.
>>
>> Removing the __ro_after_init in front of
>> DEFINE_STATIC_KEY_FALSE(init_mem_is_free) fixes the problem.
>>
>> I can't understand why, mark_readonly() is called after free_initmem().
>>
> 
> Moving static_branch_enable(&init_mem_is_free) before mark_initmem_nx() 
> also solves the problem.
> 
> There must be something wrong with mark_initmem_nx().
> 


Fixing patch sent, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/2e35f0fd649c83c5add17a99514ac040767be93a.1652981047.git.christophe.leroy@csgroup.eu/

Christophe

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-05-17 12:37     ` Michael Ellerman
@ 2022-05-31  6:24       ` Christophe Leroy
  2022-06-24  7:06         ` Christophe Leroy
  0 siblings, 1 reply; 47+ messages in thread
From: Christophe Leroy @ 2022-05-31  6:24 UTC (permalink / raw)
  To: Michael Ellerman, Michael Ellerman, Paul Mackerras,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel



On 17/05/2022 at 14:37, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 15/05/2022 at 12:28, Michael Ellerman wrote:
>>> On Tue, 22 Mar 2022 16:40:17 +0100, Christophe Leroy wrote:
>>>> This series reduces by 70% the time required to activate
>>>> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
>>>>
>>>> Measure is performed in function ftrace_replace_code() using mftb()
>>>> around the loop.
>>>>
>>>> With the series,
>>>> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
>>>> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
>>>>
>>>> [...]
>>>
>>> Patches 1, 3 and 4 applied to powerpc/next.
>>>
>>> [1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
>>>         https://git.kernel.org/powerpc/c/cb3ac45214c03852430979a43180371a44b74596
>>> [3/4] powerpc/code-patching: Use jump_label for testing freed initmem
>>>         https://git.kernel.org/powerpc/c/b033767848c4115e486b1a51946de3bee2ac0fa6
>>> [4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
>>>         https://git.kernel.org/powerpc/c/1751289268ef959db68b0b6f798d904d6403309a
>>>
>>
>> Patch 2 was the keystone of this series. What happened to it ?
> 
> It broke on 64-bit. I think I know why but I haven't had time to test
> it. Will try and get it fixed in the next day or two.
> 

Did you not find a solution in the end, or did you not have time?

What was the problem exactly? I made a quick try on QEMU and it was
working as expected.

Christophe

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-05-31  6:24       ` Christophe Leroy
@ 2022-06-24  7:06         ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-06-24  7:06 UTC (permalink / raw)
  To: Michael Ellerman, Michael Ellerman, Paul Mackerras,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel

Michael?

On 31/05/2022 at 08:24, Christophe Leroy wrote:
> 
> 
>> On 17/05/2022 at 14:37, Michael Ellerman wrote:
>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>> On 15/05/2022 at 12:28, Michael Ellerman wrote:
>>>> On Tue, 22 Mar 2022 16:40:17 +0100, Christophe Leroy wrote:
>>>>> This series reduces by 70% the time required to activate
>>>>> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
>>>>>
>>>>> Measure is performed in function ftrace_replace_code() using mftb()
>>>>> around the loop.
>>>>>
>>>>> With the series,
>>>>> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
>>>>> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
>>>>>
>>>>> [...]
>>>>
>>>> Patches 1, 3 and 4 applied to powerpc/next.
>>>>
>>>> [1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() 
>>>> without CONFIG_MODULES
>>>>         
>>>> https://git.kernel.org/powerpc/c/cb3ac45214c03852430979a43180371a44b74596 
>>>>
>>>> [3/4] powerpc/code-patching: Use jump_label for testing freed initmem
>>>>         
>>>> https://git.kernel.org/powerpc/c/b033767848c4115e486b1a51946de3bee2ac0fa6 
>>>>
>>>> [4/4] powerpc/code-patching: Use jump_label to check if 
>>>> poking_init() is done
>>>>         
>>>> https://git.kernel.org/powerpc/c/1751289268ef959db68b0b6f798d904d6403309a 
>>>>
>>>>
>>>
>>> Patch 2 was the keystone of this series. What happened to it ?
>>
>> It broke on 64-bit. I think I know why but I haven't had time to test
>> it. Will try and get it fixed in the next day or two.
>>
> 
> You didn't find any solution at the end, or didn't have time ?
> 
> What was the problem exactly ? I made a quick try on QEMU and it was 
> working as expected.
> 

Should I make it a ppc32-only change?

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v1 0/4] Kill the time spent in patch_instruction()
@ 2022-03-22 15:40 ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

This series reduces by 70% the time required to activate
ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.

The measurement is performed in ftrace_replace_code() using mftb()
around the loop.

With the series,
- Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
- With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.

Before this series,
- Without CONFIG_STRICT_KERNEL_RWX, 427000 TB ticks are measured.
- With CONFIG_STRICT_KERNEL_RWX, 1744000 TB ticks are measured.

Before the series, CONFIG_STRICT_KERNEL_RWX multiplies the time
required for ftrace activation by more than a factor of 4
(1744000 vs 427000 TB ticks).

With the series, CONFIG_STRICT_KERNEL_RWX increases the time
required for ftrace activation by only about 30%
(546000 vs 416000 TB ticks).

Christophe Leroy (4):
  powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without
    CONFIG_MODULES
  powerpc/code-patching: Speed up page mapping/unmapping
  powerpc/code-patching: Use jump_label for testing freed initmem
  powerpc/code-patching: Use jump_label to check if poking_init() is
    done

 arch/powerpc/include/asm/code-patching.h |  2 ++
 arch/powerpc/lib/code-patching.c         | 37 +++++++++++++++---------
 arch/powerpc/mm/mem.c                    |  2 ++
 3 files changed, 28 insertions(+), 13 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v1 1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
@ 2022-03-22 15:40   ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

If CONFIG_MODULES is not set, there is no point in checking
whether the text is in the module area.

This reduces the time needed to activate/deactivate ftrace
by more than 10% on an 8xx.
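
For reference, IS_ENABLED() expands to a compile-time constant 0 or 1,
so the module check still compiles in every configuration but is folded
away when CONFIG_MODULES is off; schematically (same shape as the hunk
below, shown only to illustrate the mechanism):

	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
		pfn = vmalloc_to_pfn(addr);	/* module or vmalloc text */
	else
		pfn = __pa_symbol(addr) >> PAGE_SHIFT;	/* core kernel text */

	/*
	 * With CONFIG_MODULES=n the condition is constant false, so the
	 * compiler drops the is_vmalloc_or_module_addr() call entirely.
	 */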

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 00c68e7fb11e..f970f189875b 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -97,7 +97,7 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
 {
 	unsigned long pfn;
 
-	if (is_vmalloc_or_module_addr(addr))
+	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
 		pfn = vmalloc_to_pfn(addr);
 	else
 		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 1/6] powerpc/code-patching: Use pte_offset_kernel() instead of virt_to_kpte()
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-09-27 14:33   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

virt_to_kpte() checks pmd_none() and returns NULL in that case.

__do_patch_instruction() doesn't expect the pmd to be none and
doesn't handle the case anyway.

So avoid the pmd_none() check by using pte_offset_kernel()
directly.

It improves ftrace activation by approx 1% on an 8xx.
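
For context, virt_to_kpte() is roughly the following wrapper (paraphrased
from include/linux/pgtable.h, shown only to illustrate what the extra
check costs on this path):

static inline pte_t *virt_to_kpte(unsigned long vaddr)
{
	pmd_t *pmd = pmd_off_k(vaddr);

	return pmd_none(*pmd) ? NULL : pte_offset_kernel(pmd, vaddr);
}

Because text_poke_area is pre-mapped, the PTE page is known to exist,
so the pmd_none() test and the NULL case are dead weight here.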

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index ad0cf3108dd0..0f3acb0534b6 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -158,7 +158,7 @@ static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr & PAGE_MASK;
 	patch_addr = (u32 *)(text_poke_addr + offset_in_page(addr));
 
-	pte = virt_to_kpte(text_poke_addr);
+	pte = pte_offset_kernel(pmd_off_k(text_poke_addr), text_poke_addr);
 	__set_pte_at(&init_mm, text_poke_addr, pte, pfn_pte(pfn, PAGE_KERNEL), 0);
 	/* See ptesync comment in radix__set_pte_at() */
 	if (radix_enabled())
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 2/6] powerpc/code-patching: Remove #ifdef CONFIG_STRICT_KERNEL_RWX
  2022-09-27 14:33   ` Christophe Leroy
@ 2022-09-27 14:33     ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

No need to have one implementation of do_patch_instruction() for
CONFIG_STRICT_KERNEL_RWX and one for !CONFIG_STRICT_KERNEL_RWX.

In do_patch_instruction(), call raw_patch_instruction() when
!CONFIG_STRICT_KERNEL_RWX.

In poking_init(), bail out immediately; it is then equivalent
to the weak default implementation.

Everything else is declared static and will be discarded by
GCC when !CONFIG_STRICT_KERNEL_RWX.
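
For reference, the generic kernel provides a weak no-op that the
architecture overrides; returning early keeps a single override that
degrades to the same behaviour. A sketch of the pattern (not the exact
generic code):

/* generic side: weak default, overridden by the architecture */
void __init __weak poking_init(void)
{
}

/* powerpc side: one override for both configurations */
void __init poking_init(void)
{
	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
		return;		/* same effect as the weak no-op */

	/* ... register the text_poke_area CPU hotplug callbacks ... */
}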

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 0f3acb0534b6..647a0bb35848 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -41,7 +41,6 @@ int raw_patch_instruction(u32 *addr, ppc_inst_t instr)
 	return __patch_instruction(addr, instr, addr);
 }
 
-#ifdef CONFIG_STRICT_KERNEL_RWX
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 
 static int map_patch_area(void *addr, unsigned long text_poke_addr);
@@ -88,6 +87,9 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(poking_init_done);
  */
 void __init poking_init(void)
 {
+	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
+		return;
+
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
 		text_area_cpu_down));
@@ -182,7 +184,8 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	 * when text_poke_area is not ready, but we still need
 	 * to allow patching. We just do the plain old patching
 	 */
-	if (!static_branch_likely(&poking_init_done))
+	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) ||
+	    !static_branch_likely(&poking_init_done))
 		return raw_patch_instruction(addr, instr);
 
 	local_irq_save(flags);
@@ -191,14 +194,6 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 
 	return err;
 }
-#else /* !CONFIG_STRICT_KERNEL_RWX */
-
-static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
-{
-	return raw_patch_instruction(addr, instr);
-}
-
-#endif /* CONFIG_STRICT_KERNEL_RWX */
 
 __ro_after_init DEFINE_STATIC_KEY_FALSE(init_mem_is_free);
 
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 2/4] powerpc/code-patching: Speed up page mapping/unmapping
@ 2022-03-22 15:40   ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

Since commit 591b4b268435 ("powerpc/code-patching: Pre-map patch area")
the patch area is premapped, so intermediate page tables are already
allocated.

Use __set_pte_at() directly instead of the heavy map_kernel_page(),
and for unmapping just do a pte_clear() followed by a flush.

__set_pte_at() can be used directly without the filters in
set_pte_at() because we are mapping a normal, non-executable page.

Make sure gcc knows text_poke_area is page aligned in order to
optimise the flush.

This change reduces by 66% the time needed to activate ftrace on
an 8xx (588000 TB ticks instead of 1744000).
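
In outline, the hot path after this change becomes the following
sequence (condensed from the diff below, without error handling):

	pte = virt_to_kpte(text_poke_addr);
	__set_pte_at(&init_mm, text_poke_addr, pte,
		     pfn_pte(pfn, PAGE_KERNEL), 0);		/* map */

	err = __patch_instruction(addr, instr, patch_addr);	/* poke */

	pte_clear(&init_mm, text_poke_addr, pte);		/* unmap */
	flush_tlb_kernel_range(text_poke_addr,
			       text_poke_addr + PAGE_SIZE);	/* flush */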

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f970f189875b..62692c6031bc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -90,17 +90,20 @@ void __init poking_init(void)
 		text_area_cpu_down));
 }
 
+static unsigned long get_patch_pfn(void *addr)
+{
+	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
+		return vmalloc_to_pfn(addr);
+	else
+		return __pa_symbol(addr) >> PAGE_SHIFT;
+}
+
 /*
  * This can be called for kernel text or a module.
  */
 static int map_patch_area(void *addr, unsigned long text_poke_addr)
 {
-	unsigned long pfn;
-
-	if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
-		pfn = vmalloc_to_pfn(addr);
-	else
-		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
+	unsigned long pfn = get_patch_pfn(addr);
 
 	return map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
 }
@@ -145,17 +148,19 @@ static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	int err;
 	u32 *patch_addr;
 	unsigned long text_poke_addr;
+	pte_t *pte;
+	unsigned long pfn = get_patch_pfn(addr);
 
-	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
+	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr & PAGE_MASK;
 	patch_addr = (u32 *)(text_poke_addr + offset_in_page(addr));
 
-	err = map_patch_area(addr, text_poke_addr);
-	if (err)
-		return err;
+	pte = virt_to_kpte(text_poke_addr);
+	__set_pte_at(&init_mm, text_poke_addr, pte, pfn_pte(pfn, PAGE_KERNEL), 0);
 
 	err = __patch_instruction(addr, instr, patch_addr);
 
-	unmap_patch_area(text_poke_addr);
+	pte_clear(&init_mm, text_poke_addr, pte);
+	flush_tlb_kernel_range(text_poke_addr, text_poke_addr + PAGE_SIZE);
 
 	return err;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem
@ 2022-03-22 15:40   ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

Once init is done, initmem is freed forever, so there is no need to
test system_state at every call to patch_instruction().

Use a jump_label instead.

This reduces by 2% the time needed to activate ftrace on an 8xx.
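
For background, a static key turns the test into a branch instruction
that is itself rewritten when the key is flipped, instead of a load of
system_state and a compare on every call. A minimal illustration of the
jump_label API (hypothetical key name, not the one added below):

#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE(example_key);

static int example_fast_path(void)
{
	/* With CONFIG_JUMP_LABEL this is a patched jump/nop, not a load. */
	if (static_branch_likely(&example_key))
		return 1;

	return 0;
}

static void example_flip_once(void)
{
	/* Rewrites every static_branch_*() site that uses example_key. */
	static_branch_enable(&example_key);
}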

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/code-patching.h | 2 ++
 arch/powerpc/lib/code-patching.c         | 5 ++++-
 arch/powerpc/mm/mem.c                    | 2 ++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 409483b2d0ce..bccc3a538b9f 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -22,6 +22,8 @@
 #define BRANCH_SET_LINK	0x1
 #define BRANCH_ABSOLUTE	0x2
 
+DECLARE_STATIC_KEY_FALSE(init_mem_is_free);
+
 bool is_offset_in_branch_range(long offset);
 bool is_offset_in_cond_branch_range(long offset);
 int create_branch(ppc_inst_t *instr, const u32 *addr,
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 62692c6031bc..ab434c3853c9 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -8,6 +8,7 @@
 #include <linux/init.h>
 #include <linux/cpuhotplug.h>
 #include <linux/uaccess.h>
+#include <linux/jump_label.h>
 
 #include <asm/tlbflush.h>
 #include <asm/page.h>
@@ -193,10 +194,12 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 
 #endif /* CONFIG_STRICT_KERNEL_RWX */
 
+__ro_after_init DEFINE_STATIC_KEY_FALSE(init_mem_is_free);
+
 int patch_instruction(u32 *addr, ppc_inst_t instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (system_state >= SYSTEM_FREEING_INITMEM && init_section_contains(addr, 4))
+	if (static_branch_likely(&init_mem_is_free) && init_section_contains(addr, 4))
 		return 0;
 
 	return do_patch_instruction(addr, instr);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8e301cd8925b..9710d4e0bf08 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -22,6 +22,7 @@
 #include <asm/kasan.h>
 #include <asm/svm.h>
 #include <asm/mmzone.h>
+#include <asm/code-patching.h>
 
 #include <mm/mmu_decl.h>
 
@@ -311,6 +312,7 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
+	static_branch_enable(&init_mem_is_free);
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 3/6] powerpc/feature-fixups: Refactor entry fixups patching
  2022-09-27 14:33   ` Christophe Leroy
@ 2022-09-27 14:33     ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

Several functions have the same loop for patching instructions.

Introduce function do_patch_entry_fixups() to refactor those loops.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/feature-fixups.c | 84 ++++++++++++-------------------
 1 file changed, 32 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 993d3f31832a..6767a6c3106f 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -118,9 +118,33 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
+static int do_patch_entry_fixups(long *start, long *end, unsigned int *instrs,
+				 bool do_fallback, void *fallback)
+{
+	int i;
+
+	for (i = 0; start < end; start++, i++) {
+		unsigned int *dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		// See comment in do_entry_flush_fixups() RE order of patching
+		if (do_fallback) {
+			patch_instruction(dest, ppc_inst(instrs[0]));
+			patch_instruction(dest + 2, ppc_inst(instrs[2]));
+			patch_branch(dest + 1, (unsigned long)fallback, BRANCH_SET_LINK);
+		} else {
+			patch_instruction(dest + 1, ppc_inst(instrs[1]));
+			patch_instruction(dest + 2, ppc_inst(instrs[2]));
+			patch_instruction(dest, ppc_inst(instrs[0]));
+		}
+	}
+	return i;
+}
+
 static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
 {
-	unsigned int instrs[3], *dest;
+	unsigned int instrs[3];
 	long *start, *end;
 	int i;
 
@@ -144,23 +168,8 @@ static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
 		instrs[i++] = PPC_RAW_ORI(_R31, _R31, 0); /* speculation barrier */
 	}
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		// See comment in do_entry_flush_fixups() RE order of patching
-		if (types & STF_BARRIER_FALLBACK) {
-			patch_instruction(dest, ppc_inst(instrs[0]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_branch(dest + 1,
-				     (unsigned long)&stf_barrier_fallback, BRANCH_SET_LINK);
-		} else {
-			patch_instruction(dest + 1, ppc_inst(instrs[1]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_instruction(dest, ppc_inst(instrs[0]));
-		}
-	}
+	i = do_patch_entry_fixups(start, end, instrs, types & STF_BARRIER_FALLBACK,
+				  &stf_barrier_fallback);
 
 	printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i,
 		(types == STF_BARRIER_NONE)                  ? "no" :
@@ -325,7 +334,7 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
 static int __do_entry_flush_fixups(void *data)
 {
 	enum l1d_flush_type types = *(enum l1d_flush_type *)data;
-	unsigned int instrs[3], *dest;
+	unsigned int instrs[3];
 	long *start, *end;
 	int i;
 
@@ -375,42 +384,13 @@ static int __do_entry_flush_fixups(void *data)
 
 	start = PTRRELOC(&__start___entry_flush_fixup);
 	end = PTRRELOC(&__stop___entry_flush_fixup);
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		if (types == L1D_FLUSH_FALLBACK) {
-			patch_instruction(dest, ppc_inst(instrs[0]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_branch(dest + 1,
-				     (unsigned long)&entry_flush_fallback, BRANCH_SET_LINK);
-		} else {
-			patch_instruction(dest + 1, ppc_inst(instrs[1]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_instruction(dest, ppc_inst(instrs[0]));
-		}
-	}
+	i = do_patch_entry_fixups(start, end, instrs, types == L1D_FLUSH_FALLBACK,
+				  &entry_flush_fallback);
 
 	start = PTRRELOC(&__start___scv_entry_flush_fixup);
 	end = PTRRELOC(&__stop___scv_entry_flush_fixup);
-	for (; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		if (types == L1D_FLUSH_FALLBACK) {
-			patch_instruction(dest, ppc_inst(instrs[0]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_branch(dest + 1,
-				     (unsigned long)&scv_entry_flush_fallback, BRANCH_SET_LINK);
-		} else {
-			patch_instruction(dest + 1, ppc_inst(instrs[1]));
-			patch_instruction(dest + 2, ppc_inst(instrs[2]));
-			patch_instruction(dest, ppc_inst(instrs[0]));
-		}
-	}
-
+	i += do_patch_entry_fixups(start, end, instrs, types == L1D_FLUSH_FALLBACK,
+				   &scv_entry_flush_fallback);
 
 	printk(KERN_DEBUG "entry-flush: patched %d locations (%s flush)\n", i,
 		(types == L1D_FLUSH_NONE)       ? "no" :
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done
@ 2022-03-22 15:40   ` Christophe Leroy
  0 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-03-22 15:40 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

It's only during early startup that poking_init() is not done yet,
for instance when calling ftrace_init().

Once poking_init() has been called there must be a poking area, so there
is no need to check for it every time patch_instruction() is called.

ftrace activation time is reduced by 7% with the change on an 8xx.
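
The saving comes from replacing a per-CPU pointer load on every call
with a patched branch; schematically (both forms taken from the diff
below, shown for comparison):

	/* before: per-CPU load and test on every patch_instruction() */
	if (!this_cpu_read(text_poke_area))
		return raw_patch_instruction(addr, instr);

	/* after: a branch patched once, at the end of poking_init() */
	if (!static_branch_likely(&poking_init_done))
		return raw_patch_instruction(addr, instr);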

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index ab434c3853c9..8bd74bbe8b8d 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -79,6 +79,8 @@ static int text_area_cpu_down(unsigned int cpu)
 	return 0;
 }
 
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(poking_init_done);
+
 /*
  * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
  * we judge it as being preferable to a kernel that will crash later when
@@ -89,6 +91,7 @@ void __init poking_init(void)
 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
 		text_area_cpu_down));
+	static_branch_enable(&poking_init_done);
 }
 
 static unsigned long get_patch_pfn(void *addr)
@@ -176,7 +179,7 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	 * when text_poke_area is not ready, but we still need
 	 * to allow patching. We just do the plain old patching
 	 */
-	if (!this_cpu_read(text_poke_area))
+	if (!static_branch_likely(&poking_init_done))
 		return raw_patch_instruction(addr, instr);
 
 	local_irq_save(flags);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 4/6] powerpc/feature-fixups: Refactor other fixups patching
  2022-09-27 14:33   ` Christophe Leroy
@ 2022-09-27 14:33     ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

Several functions have the same loop for patching instructions.

Introduce function do_patch_fixups() to refactor those loops.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/feature-fixups.c | 77 +++++++++++--------------------
 1 file changed, 28 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 6767a6c3106f..a03ed9931224 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -117,6 +117,24 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 	}
 }
 
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+static int do_patch_fixups(long *start, long *end, unsigned int *instrs, int num)
+{
+	int i;
+
+	for (i = 0; start < end; start++, i++) {
+		int j;
+		unsigned int *dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		for (j = 0; j < num; j++)
+			patch_instruction(dest + j, ppc_inst(instrs[j]));
+	}
+	return i;
+}
+#endif
+
 #ifdef CONFIG_PPC_BOOK3S_64
 static int do_patch_entry_fixups(long *start, long *end, unsigned int *instrs,
 				 bool do_fallback, void *fallback)
@@ -181,7 +199,7 @@ static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
 
 static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
 {
-	unsigned int instrs[6], *dest;
+	unsigned int instrs[6];
 	long *start, *end;
 	int i;
 
@@ -215,18 +233,8 @@ static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
 		instrs[i++] = PPC_RAW_EIEIO() | 0x02000000; /* eieio + bit 6 hint */
 	}
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
+	i = do_patch_fixups(start, end, instrs, ARRAY_SIZE(instrs));
 
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		patch_instruction(dest, ppc_inst(instrs[0]));
-		patch_instruction(dest + 1, ppc_inst(instrs[1]));
-		patch_instruction(dest + 2, ppc_inst(instrs[2]));
-		patch_instruction(dest + 3, ppc_inst(instrs[3]));
-		patch_instruction(dest + 4, ppc_inst(instrs[4]));
-		patch_instruction(dest + 5, ppc_inst(instrs[5]));
-	}
 	printk(KERN_DEBUG "stf-barrier: patched %d exit locations (%s barrier)\n", i,
 		(types == STF_BARRIER_NONE)                  ? "no" :
 		(types == STF_BARRIER_FALLBACK)              ? "fallback" :
@@ -283,7 +291,7 @@ void do_stf_barrier_fixups(enum stf_barrier_type types)
 
 void do_uaccess_flush_fixups(enum l1d_flush_type types)
 {
-	unsigned int instrs[4], *dest;
+	unsigned int instrs[4];
 	long *start, *end;
 	int i;
 
@@ -309,17 +317,7 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
 	if (types & L1D_FLUSH_MTTRIG)
 		instrs[i++] = PPC_RAW_MTSPR(SPRN_TRIG2, _R0);
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		patch_instruction(dest, ppc_inst(instrs[0]));
-
-		patch_instruction(dest + 1, ppc_inst(instrs[1]));
-		patch_instruction(dest + 2, ppc_inst(instrs[2]));
-		patch_instruction(dest + 3, ppc_inst(instrs[3]));
-	}
+	i = do_patch_fixups(start, end, instrs, ARRAY_SIZE(instrs));
 
 	printk(KERN_DEBUG "uaccess-flush: patched %d locations (%s flush)\n", i,
 		(types == L1D_FLUSH_NONE)       ? "no" :
@@ -418,7 +416,7 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
 static int __do_rfi_flush_fixups(void *data)
 {
 	enum l1d_flush_type types = *(enum l1d_flush_type *)data;
-	unsigned int instrs[3], *dest;
+	unsigned int instrs[3];
 	long *start, *end;
 	int i;
 
@@ -442,15 +440,7 @@ static int __do_rfi_flush_fixups(void *data)
 	if (types & L1D_FLUSH_MTTRIG)
 		instrs[i++] = PPC_RAW_MTSPR(SPRN_TRIG2, _R0);
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-
-		patch_instruction(dest, ppc_inst(instrs[0]));
-		patch_instruction(dest + 1, ppc_inst(instrs[1]));
-		patch_instruction(dest + 2, ppc_inst(instrs[2]));
-	}
+	i = do_patch_fixups(start, end, instrs, ARRAY_SIZE(instrs));
 
 	printk(KERN_DEBUG "rfi-flush: patched %d locations (%s flush)\n", i,
 		(types == L1D_FLUSH_NONE)       ? "no" :
@@ -492,7 +482,7 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
 
 void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
 {
-	unsigned int instr, *dest;
+	unsigned int instr;
 	long *start, *end;
 	int i;
 
@@ -506,12 +496,7 @@ void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_
 		instr = PPC_RAW_ORI(_R31, _R31, 0); /* speculation barrier */
 	}
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-		patch_instruction(dest, ppc_inst(instr));
-	}
+	i = do_patch_fixups(start, end, &instr, 1);
 
 	printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
 }
@@ -533,7 +518,7 @@ void do_barrier_nospec_fixups(bool enable)
 #ifdef CONFIG_PPC_FSL_BOOK3E
 void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
 {
-	unsigned int instr[2], *dest;
+	unsigned int instr[2];
 	long *start, *end;
 	int i;
 
@@ -549,13 +534,7 @@ void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_
 		instr[1] = PPC_RAW_SYNC();
 	}
 
-	for (i = 0; start < end; start++, i++) {
-		dest = (void *)start + *start;
-
-		pr_devel("patching dest %lx\n", (unsigned long)dest);
-		patch_instruction(dest, ppc_inst(instr[0]));
-		patch_instruction(dest + 1, ppc_inst(instr[1]));
-	}
+	i = do_patch_fixups(start, end, instr, ARRAY_SIZE(instr));
 
 	printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
 }
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 5/6] powerpc/feature-fixups: Do not patch init section after init
  2022-09-27 14:33   ` Christophe Leroy
@ 2022-09-27 14:33     ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

Once the init section is freed, attempting to patch init code
ends up in the weeds.

Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections")
protected patch_instruction() against that, but it is the responsibility
of the caller to ensure that the patched memory is valid.

In the same spirit as jump_label with its jump_label_can_update()
function, add an is_fixup_addr_valid() helper to skip patching of
freed init sections.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/feature-fixups.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index a03ed9931224..1d4342dc4b8d 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -118,6 +118,12 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 }
 
 #ifdef CONFIG_PPC_BARRIER_NOSPEC
+static bool is_fixup_addr_valid(void *dest, size_t size)
+{
+	return system_state < SYSTEM_FREEING_INITMEM ||
+	       !init_section_contains(dest, size);
+}
+
 static int do_patch_fixups(long *start, long *end, unsigned int *instrs, int num)
 {
 	int i;
@@ -126,6 +132,9 @@ static int do_patch_fixups(long *start, long *end, unsigned int *instrs, int num
 		int j;
 		unsigned int *dest = (void *)start + *start;
 
+		if (!is_fixup_addr_valid(dest, sizeof(*instrs) * num))
+			continue;
+
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
 		for (j = 0; j < num; j++)
@@ -144,6 +153,9 @@ static int do_patch_entry_fixups(long *start, long *end, unsigned int *instrs,
 	for (i = 0; start < end; start++, i++) {
 		unsigned int *dest = (void *)start + *start;
 
+		if (!is_fixup_addr_valid(dest, sizeof(*instrs) * 3))
+			continue;
+
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
 		// See comment in do_entry_flush_fixups() RE order of patching
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 6/6] powerpc/code-patching: Remove protection against patching init addresses after init
  2022-09-27 14:33   ` Christophe Leroy
@ 2022-09-27 14:33     ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:33 UTC (permalink / raw)
  To: Benjamin Gray, Michael Ellerman, Nicholas Piggin
  Cc: linuxppc-dev, linux-kernel

Once the init section is freed, attempting to patch init code
ends up in the weeds.

Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections")
protected patch_instruction() against that, but it is the responsibility
of the caller to ensure that the patched memory is valid.

All callers have now been verified and fixed so the check
can be removed.

This improves ftrace activation by about 2% on 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/code-patching.h |  2 --
 arch/powerpc/lib/code-patching.c         | 13 +------------
 arch/powerpc/mm/mem.c                    |  1 -
 3 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index 1c6316ec4b74..3f881548fb61 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -22,8 +22,6 @@
 #define BRANCH_SET_LINK	0x1
 #define BRANCH_ABSOLUTE	0x2
 
-DECLARE_STATIC_KEY_FALSE(init_mem_is_free);
-
 /*
  * Powerpc branch instruction is :
  *
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 647a0bb35848..125c55e3e148 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -174,7 +174,7 @@ static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
 	return err;
 }
 
-static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
+int patch_instruction(u32 *addr, ppc_inst_t instr)
 {
 	int err;
 	unsigned long flags;
@@ -194,17 +194,6 @@ static int do_patch_instruction(u32 *addr, ppc_inst_t instr)
 
 	return err;
 }
-
-__ro_after_init DEFINE_STATIC_KEY_FALSE(init_mem_is_free);
-
-int patch_instruction(u32 *addr, ppc_inst_t instr)
-{
-	/* Make sure we aren't patching a freed init section */
-	if (static_branch_likely(&init_mem_is_free) && init_section_contains(addr, 4))
-		return 0;
-
-	return do_patch_instruction(addr, instr);
-}
 NOKPROBE_SYMBOL(patch_instruction);
 
 int patch_branch(u32 *addr, unsigned long target, int flags)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 01772e79fd93..fb89883e97bd 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -344,7 +344,6 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
-	static_branch_enable(&init_mem_is_free);
 	free_initmem_default(POISON_FREE_INITMEM);
 	ftrace_free_init_tramp();
 }
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/4] Kill the time spent in patch_instruction()
  2022-03-22 15:40 ` Christophe Leroy
@ 2022-09-27 14:38   ` Christophe Leroy
  -1 siblings, 0 replies; 47+ messages in thread
From: Christophe Leroy @ 2022-09-27 14:38 UTC (permalink / raw)
  To: Benjamin Gray, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

Argh !! Looks like I sent an old, already-applied series nested in the
new one.

Ignore the x/4 patches; only look at the x/6 ones.


On 27/09/2022 at 16:33, Christophe Leroy wrote:
> This series reduces by 70% the time required to activate
> ftrace on an 8xx with CONFIG_STRICT_KERNEL_RWX.
> 
> Measure is performed in function ftrace_replace_code() using mftb()
> around the loop.
> 
> With the series,
> - Without CONFIG_STRICT_KERNEL_RWX, 416000 TB ticks are measured.
> - With CONFIG_STRICT_KERNEL_RWX, 546000 TB ticks are measured.
> 
> Before this series,
> - Without CONFIG_STRICT_KERNEL_RWX, 427000 TB ticks are measured.
> - With CONFIG_STRICT_KERNEL_RWX, 1744000 TB ticks are measured.
> 
> Before the series, CONFIG_STRICT_KERNEL_RWX multiplies the time
> required for ftrace activation by more than 4.
> 
> With the series, CONFIG_STRICT_KERNEL_RWX increases the time
> required for ftrace activation by only 30%
> 
> Christophe Leroy (4):
>    powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without
>      CONFIG_MODULES
>    powerpc/code-patching: Speed up page mapping/unmapping
>    powerpc/code-patching: Use jump_label for testing freed initmem
>    powerpc/code-patching: Use jump_label to check if poking_init() is
>      done
> 
>   arch/powerpc/include/asm/code-patching.h |  2 ++
>   arch/powerpc/lib/code-patching.c         | 37 +++++++++++++++---------
>   arch/powerpc/mm/mem.c                    |  2 ++
>   3 files changed, 28 insertions(+), 13 deletions(-)
> 

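The quoted measurement boils down to something like the following
sketch: mftb() reads around the patching loop in ftrace_replace_code()
(assumed instrumentation added for benchmarking, not code that is part
of the series):

	unsigned long t0, t1;

	t0 = mftb();			/* timebase before the loop */
	/* ... existing ftrace_replace_code() patching loop ... */
	t1 = mftb();			/* timebase after the loop */
	pr_info("ftrace activation: %lu TB ticks\n", t1 - t0);
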
^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2022-09-27 14:44 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-22 15:40 [PATCH v1 0/4] Kill the time spent in patch_instruction() Christophe Leroy
2022-09-27 14:33 ` Christophe Leroy
2022-09-27 14:33 ` Christophe Leroy
2022-03-22 15:40 ` Christophe Leroy
2022-03-22 15:40 ` [PATCH v1 1/4] powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-03-22 15:40   ` Christophe Leroy
2022-03-22 15:40 ` [PATCH v1 2/4] powerpc/code-patching: Speed up page mapping/unmapping Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-03-22 15:40   ` Christophe Leroy
2022-03-22 15:40 ` [PATCH v1 3/4] powerpc/code-patching: Use jump_label for testing freed initmem Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-03-22 15:40   ` Christophe Leroy
2022-05-19  2:17   ` Guenter Roeck
2022-05-19  2:17     ` Guenter Roeck
2022-05-19  6:27     ` Christophe Leroy
2022-05-19  6:27       ` Christophe Leroy
2022-05-19  6:53       ` Christophe Leroy
2022-05-19  6:53         ` Christophe Leroy
2022-05-19 17:27         ` Christophe Leroy
2022-05-19 17:27           ` Christophe Leroy
2022-03-22 15:40 ` [PATCH v1 4/4] powerpc/code-patching: Use jump_label to check if poking_init() is done Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-03-22 15:40   ` Christophe Leroy
2022-05-15 10:28 ` [PATCH v1 0/4] Kill the time spent in patch_instruction() Michael Ellerman
2022-05-17  6:44   ` Christophe Leroy
2022-05-17 12:37     ` Michael Ellerman
2022-05-31  6:24       ` Christophe Leroy
2022-06-24  7:06         ` Christophe Leroy
2022-09-27 14:33 ` [PATCH v1 1/6] powerpc/code-patching: Use pte_offset_kernel() instead of virt_to_kpte() Christophe Leroy
2022-09-27 14:33   ` Christophe Leroy
2022-09-27 14:33   ` [PATCH v1 2/6] powerpc/code-patching: Remove #ifdef CONFIG_STRICT_KERNEL_RWX Christophe Leroy
2022-09-27 14:33     ` Christophe Leroy
2022-09-27 14:33   ` [PATCH v1 3/6] powerpc/feature-fixups: Refactor entry fixups patching Christophe Leroy
2022-09-27 14:33     ` Christophe Leroy
2022-09-27 14:33   ` [PATCH v1 4/6] powerpc/feature-fixups: Refactor other " Christophe Leroy
2022-09-27 14:33     ` Christophe Leroy
2022-09-27 14:33   ` [PATCH v1 5/6] powerpc/feature-fixups: Do not patch init section after init Christophe Leroy
2022-09-27 14:33     ` Christophe Leroy
2022-09-27 14:33   ` [PATCH v1 6/6] powerpc/code-patching: Remove protection against patching init addresses " Christophe Leroy
2022-09-27 14:33     ` Christophe Leroy
2022-09-27 14:38 ` [PATCH v1 0/4] Kill the time spent in patch_instruction() Christophe Leroy
2022-09-27 14:38   ` Christophe Leroy
