linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] Alloc kfence_pool after system startup
@ 2022-03-03  3:15 Tianchen Ding
  2022-03-03  3:15 ` [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE " Tianchen Ding
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Tianchen Ding @ 2022-03-03  3:15 UTC (permalink / raw)
  To: Alexander Potapenko, Marco Elver, Dmitry Vyukov, Andrew Morton
  Cc: kasan-dev, linux-mm, linux-kernel

KFENCE aims at production environments, but it does not allow enabling
after system startup because kfence_pool only alloc pages from memblock.
Consider the following production scene:
At first, for performance considerations, production machines do not
enable KFENCE.
However, after running for a while, the kernel is suspected to have
memory errors. (e.g., a sibling machine crashed.)
So other production machines need to enable KFENCE, but it's hard for
them to reboot.

The 1st patch allows re-enabling KFENCE if the pool is already
allocated from memblock.

The 2nd patch applies the main part.

Tianchen Ding (2):
  kfence: Allow re-enabling KFENCE after system startup
  kfence: Alloc kfence_pool after system startup

 mm/kfence/core.c | 106 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 87 insertions(+), 19 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE after system startup
  2022-03-03  3:15 [RFC PATCH 0/2] Alloc kfence_pool after system startup Tianchen Ding
@ 2022-03-03  3:15 ` Tianchen Ding
  2022-03-04 18:13   ` Marco Elver
  2022-03-03  3:15 ` [RFC PATCH 2/2] kfence: Alloc kfence_pool " Tianchen Ding
       [not found] ` <CAG_fn=Wd5GMFojbvdZkysBQ5Auy5YYRdmZfjSVMq8gpDMRZ_3w@mail.gmail.com>
  2 siblings, 1 reply; 11+ messages in thread
From: Tianchen Ding @ 2022-03-03  3:15 UTC (permalink / raw)
  To: Alexander Potapenko, Marco Elver, Dmitry Vyukov, Andrew Morton
  Cc: kasan-dev, linux-mm, linux-kernel

If once KFENCE is disabled by:
echo 0 > /sys/module/kfence/parameters/sample_interval
KFENCE could never be re-enabled until next rebooting.

Allow re-enabling it by writing a positive num to sample_interval.

Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
---
 mm/kfence/core.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 13128fa13062..19eb123c0bba 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -55,6 +55,7 @@ EXPORT_SYMBOL_GPL(kfence_sample_interval); /* Export for test modules. */
 #endif
 #define MODULE_PARAM_PREFIX "kfence."
 
+static int kfence_enable_late(void);
 static int param_set_sample_interval(const char *val, const struct kernel_param *kp)
 {
 	unsigned long num;
@@ -65,10 +66,11 @@ static int param_set_sample_interval(const char *val, const struct kernel_param
 
 	if (!num) /* Using 0 to indicate KFENCE is disabled. */
 		WRITE_ONCE(kfence_enabled, false);
-	else if (!READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
-		return -EINVAL; /* Cannot (re-)enable KFENCE on-the-fly. */
 
 	*((unsigned long *)kp->arg) = num;
+
+	if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
+		return kfence_enable_late();
 	return 0;
 }
 
@@ -787,6 +789,16 @@ void __init kfence_init(void)
 		(void *)(__kfence_pool + KFENCE_POOL_SIZE));
 }
 
+static int kfence_enable_late(void)
+{
+	if (!__kfence_pool)
+		return -EINVAL;
+
+	WRITE_ONCE(kfence_enabled, true);
+	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
+	return 0;
+}
+
 void kfence_shutdown_cache(struct kmem_cache *s)
 {
 	unsigned long flags;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/2] kfence: Alloc kfence_pool after system startup
  2022-03-03  3:15 [RFC PATCH 0/2] Alloc kfence_pool after system startup Tianchen Ding
  2022-03-03  3:15 ` [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE " Tianchen Ding
@ 2022-03-03  3:15 ` Tianchen Ding
  2022-03-04 18:14   ` Marco Elver
       [not found] ` <CAG_fn=Wd5GMFojbvdZkysBQ5Auy5YYRdmZfjSVMq8gpDMRZ_3w@mail.gmail.com>
  2 siblings, 1 reply; 11+ messages in thread
From: Tianchen Ding @ 2022-03-03  3:15 UTC (permalink / raw)
  To: Alexander Potapenko, Marco Elver, Dmitry Vyukov, Andrew Morton
  Cc: kasan-dev, linux-mm, linux-kernel

KFENCE aims at production environments, but it does not allow enabling
after system startup because kfence_pool only alloc pages from memblock.
Consider the following production scene:
At first, for performance considerations, production machines do not
enable KFENCE.
However, after running for a while, the kernel is suspected to have
memory errors. (e.g., a sibling machine crashed.)
So other production machines need to enable KFENCE, but it's hard for
them to reboot.

Allow enabling KFENCE by alloc pages after system startup, even if
KFENCE is not enabled during booting.

Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
---
This patch is similar to what the KFENCE(early version) do on ARM64.
Instead of alloc_pages(), we'd prefer alloc_contig_pages() to get exact
number of pages.
I'm not sure about the impact of breaking __ro_after_init. I've tested
with hackbench, and it seems no performance regression.
Or any problem about security?
---
 mm/kfence/core.c | 96 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 76 insertions(+), 20 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 19eb123c0bba..ae69b2a113a4 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -93,7 +93,7 @@ static unsigned long kfence_skip_covered_thresh __read_mostly = 75;
 module_param_named(skip_covered_thresh, kfence_skip_covered_thresh, ulong, 0644);
 
 /* The pool of pages used for guard pages and objects. */
-char *__kfence_pool __ro_after_init;
+char *__kfence_pool __read_mostly;
 EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */
 
 /*
@@ -534,17 +534,18 @@ static void rcu_guarded_free(struct rcu_head *h)
 	kfence_guarded_free((void *)meta->addr, meta, false);
 }
 
-static bool __init kfence_init_pool(void)
+/*
+ * The main part of init kfence pool.
+ * Return 0 if succeed. Otherwise return the address where error occurs.
+ */
+static unsigned long __kfence_init_pool(void)
 {
 	unsigned long addr = (unsigned long)__kfence_pool;
 	struct page *pages;
 	int i;
 
-	if (!__kfence_pool)
-		return false;
-
 	if (!arch_kfence_init_pool())
-		goto err;
+		return addr;
 
 	pages = virt_to_page(addr);
 
@@ -562,7 +563,7 @@ static bool __init kfence_init_pool(void)
 
 		/* Verify we do not have a compound head page. */
 		if (WARN_ON(compound_head(&pages[i]) != &pages[i]))
-			goto err;
+			return addr;
 
 		__SetPageSlab(&pages[i]);
 	}
@@ -575,7 +576,7 @@ static bool __init kfence_init_pool(void)
 	 */
 	for (i = 0; i < 2; i++) {
 		if (unlikely(!kfence_protect(addr)))
-			goto err;
+			return addr;
 
 		addr += PAGE_SIZE;
 	}
@@ -592,7 +593,7 @@ static bool __init kfence_init_pool(void)
 
 		/* Protect the right redzone. */
 		if (unlikely(!kfence_protect(addr + PAGE_SIZE)))
-			goto err;
+			return addr;
 
 		addr += 2 * PAGE_SIZE;
 	}
@@ -605,9 +606,21 @@ static bool __init kfence_init_pool(void)
 	 */
 	kmemleak_free(__kfence_pool);
 
-	return true;
+	return 0;
+}
+
+static bool __init kfence_init_pool(void)
+{
+	unsigned long addr;
+
+	if (!__kfence_pool)
+		return false;
+
+	addr = __kfence_init_pool();
+
+	if (!addr)
+		return true;
 
-err:
 	/*
 	 * Only release unprotected pages, and do not try to go back and change
 	 * page attributes due to risk of failing to do so as well. If changing
@@ -620,6 +633,22 @@ static bool __init kfence_init_pool(void)
 	return false;
 }
 
+static bool kfence_init_pool_late(void)
+{
+	unsigned long addr, free_pages;
+
+	addr = __kfence_init_pool();
+
+	if (!addr)
+		return true;
+
+	/* Same as above. */
+	free_pages = (KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool)) / PAGE_SIZE;
+	free_contig_range(page_to_pfn(virt_to_page(addr)), free_pages);
+	__kfence_pool = NULL;
+	return false;
+}
+
 /* === DebugFS Interface ==================================================== */
 
 static int stats_show(struct seq_file *seq, void *v)
@@ -768,31 +797,58 @@ void __init kfence_alloc_pool(void)
 		pr_err("failed to allocate pool\n");
 }
 
+static inline void __kfence_init(void)
+{
+	if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
+		static_branch_enable(&kfence_allocation_key);
+	WRITE_ONCE(kfence_enabled, true);
+	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
+	pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
+		CONFIG_KFENCE_NUM_OBJECTS, (void *)__kfence_pool,
+		(void *)(__kfence_pool + KFENCE_POOL_SIZE));
+}
+
 void __init kfence_init(void)
 {
+	stack_hash_seed = (u32)random_get_entropy();
+
 	/* Setting kfence_sample_interval to 0 on boot disables KFENCE. */
 	if (!kfence_sample_interval)
 		return;
 
-	stack_hash_seed = (u32)random_get_entropy();
 	if (!kfence_init_pool()) {
 		pr_err("%s failed\n", __func__);
 		return;
 	}
 
-	if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
-		static_branch_enable(&kfence_allocation_key);
-	WRITE_ONCE(kfence_enabled, true);
-	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
-	pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
-		CONFIG_KFENCE_NUM_OBJECTS, (void *)__kfence_pool,
-		(void *)(__kfence_pool + KFENCE_POOL_SIZE));
+	__kfence_init();
+}
+
+static int kfence_init_late(void)
+{
+	struct page *pages;
+	const unsigned long nr_pages = KFENCE_POOL_SIZE / PAGE_SIZE;
+
+	pages = alloc_contig_pages(nr_pages, GFP_KERNEL, first_online_node, NULL);
+
+	if (!pages)
+		return -ENOMEM;
+
+	__kfence_pool = page_to_virt(pages);
+
+	if (!kfence_init_pool_late()) {
+		pr_err("%s failed\n", __func__);
+		return -EBUSY;
+	}
+
+	__kfence_init();
+	return 0;
 }
 
 static int kfence_enable_late(void)
 {
 	if (!__kfence_pool)
-		return -EINVAL;
+		return kfence_init_late();
 
 	WRITE_ONCE(kfence_enabled, true);
 	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/2] Alloc kfence_pool after system startup
       [not found] ` <CAG_fn=Wd5GMFojbvdZkysBQ5Auy5YYRdmZfjSVMq8gpDMRZ_3w@mail.gmail.com>
@ 2022-03-03  9:30   ` Marco Elver
  2022-03-04  2:24     ` Tianchen Ding
  0 siblings, 1 reply; 11+ messages in thread
From: Marco Elver @ 2022-03-03  9:30 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Tianchen Ding, Dmitry Vyukov, Andrew Morton, kasan-dev,
	Linux Memory Management List, LKML

On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko <glider@google.com> wrote:

I share Alex's concerns.

> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>>
>> KFENCE aims at production environments, but it does not allow enabling
>> after system startup because kfence_pool only alloc pages from memblock.
>> Consider the following production scene:
>> At first, for performance considerations, production machines do not
>> enable KFENCE.
>
> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?

Indeed, what is wrong with simply starting up KFENCE with a sample
interval of 10000? However, I very much doubt that you'll notice any
performance issues above 500ms.

Do let us know what performance issues you have seen. It may be
related to an earlier version of KFENCE but has since been fixed (see
log).

>> However, after running for a while, the kernel is suspected to have
>> memory errors. (e.g., a sibling machine crashed.)
>
> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>  - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>  - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under radar for a very long time.
>
>>
>> So other production machines need to enable KFENCE, but it's hard for
>> them to reboot.
>>
>> The 1st patch allows re-enabling KFENCE if the pool is already
>> allocated from memblock.

Patch 1/2 might be ok by itself, but I still don't see the point
because you should just leave KFENCE enabled. There should be no
reason to have to turn it off. If anything, you can increase the
sample interval to something very large if needed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/2] Alloc kfence_pool after system startup
  2022-03-03  9:30   ` [RFC PATCH 0/2] " Marco Elver
@ 2022-03-04  2:24     ` Tianchen Ding
  2022-03-04 18:14       ` Marco Elver
  0 siblings, 1 reply; 11+ messages in thread
From: Tianchen Ding @ 2022-03-04  2:24 UTC (permalink / raw)
  To: Marco Elver, Alexander Potapenko
  Cc: Dmitry Vyukov, Andrew Morton, kasan-dev,
	Linux Memory Management List, LKML

On 2022/3/3 17:30, Marco Elver wrote:

Thanks for your replies.
I do see setting a large sample_interval means almost disabling KFENCE.
In fact, my point is to provide a more “flexible” way. Since some Ops 
may be glad to use something like on/off switch than 10000ms interval. :-)

> On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko <glider@google.com> wrote:
> 
> I share Alex's concerns.
> 
>> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>>>
>>> KFENCE aims at production environments, but it does not allow enabling
>>> after system startup because kfence_pool only alloc pages from memblock.
>>> Consider the following production scene:
>>> At first, for performance considerations, production machines do not
>>> enable KFENCE.
>>
>> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?
> 
> Indeed, what is wrong with simply starting up KFENCE with a sample
> interval of 10000? However, I very much doubt that you'll notice any
> performance issues above 500ms.
> 
> Do let us know what performance issues you have seen. It may be
> related to an earlier version of KFENCE but has since been fixed (see
> log).
> 
>>> However, after running for a while, the kernel is suspected to have
>>> memory errors. (e.g., a sibling machine crashed.)
>>
>> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
>> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>>   - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>>   - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under radar for a very long time.
>>
>>>
>>> So other production machines need to enable KFENCE, but it's hard for
>>> them to reboot.
>>>
>>> The 1st patch allows re-enabling KFENCE if the pool is already
>>> allocated from memblock.
> 
> Patch 1/2 might be ok by itself, but I still don't see the point
> because you should just leave KFENCE enabled. There should be no
> reason to have to turn it off. If anything, you can increase the
> sample interval to something very large if needed.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE after system startup
  2022-03-03  3:15 ` [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE " Tianchen Ding
@ 2022-03-04 18:13   ` Marco Elver
  2022-03-05  5:26     ` Tianchen Ding
  0 siblings, 1 reply; 11+ messages in thread
From: Marco Elver @ 2022-03-04 18:13 UTC (permalink / raw)
  To: Tianchen Ding
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	linux-mm, linux-kernel

On Thu, 3 Mar 2022 at 04:15, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>
> If once KFENCE is disabled by:
> echo 0 > /sys/module/kfence/parameters/sample_interval
> KFENCE could never be re-enabled until next rebooting.
>
> Allow re-enabling it by writing a positive num to sample_interval.
>
> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>

The only problem I see with this is if KFENCE was disabled because of
a KFENCE_WARN_ON(). See below.

> ---
>  mm/kfence/core.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 13128fa13062..19eb123c0bba 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -55,6 +55,7 @@ EXPORT_SYMBOL_GPL(kfence_sample_interval); /* Export for test modules. */
>  #endif
>  #define MODULE_PARAM_PREFIX "kfence."
>
> +static int kfence_enable_late(void);
>  static int param_set_sample_interval(const char *val, const struct kernel_param *kp)
>  {
>         unsigned long num;
> @@ -65,10 +66,11 @@ static int param_set_sample_interval(const char *val, const struct kernel_param
>
>         if (!num) /* Using 0 to indicate KFENCE is disabled. */
>                 WRITE_ONCE(kfence_enabled, false);
> -       else if (!READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
> -               return -EINVAL; /* Cannot (re-)enable KFENCE on-the-fly. */
>
>         *((unsigned long *)kp->arg) = num;
> +
> +       if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)

Should probably have an 'old_sample_interval = *((unsigned long
*)kp->arg)' somewhere before, and add a '&& !old_sample_interval',
because if old_sample_interval!=0 then KFENCE was disabled due to a
KFENCE_WARN_ON(). Also in this case, it should return -EINVAL. So you
want a flow like this:

old_sample_interval = ...;
...
if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
  return old_sample_interval ? -EINVAL : kfence_enable_late();
...

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] kfence: Alloc kfence_pool after system startup
  2022-03-03  3:15 ` [RFC PATCH 2/2] kfence: Alloc kfence_pool " Tianchen Ding
@ 2022-03-04 18:14   ` Marco Elver
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Elver @ 2022-03-04 18:14 UTC (permalink / raw)
  To: Tianchen Ding
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	linux-mm, linux-kernel

On Thu, 3 Mar 2022 at 04:15, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>
> KFENCE aims at production environments, but it does not allow enabling
> after system startup because kfence_pool only alloc pages from memblock.
> Consider the following production scene:
> At first, for performance considerations, production machines do not
> enable KFENCE.
> However, after running for a while, the kernel is suspected to have
> memory errors. (e.g., a sibling machine crashed.)
> So other production machines need to enable KFENCE, but it's hard for
> them to reboot.

I think having this flexibility isn't bad, but your usecase just
doesn't make sense (to us at least, based on our experience).

So I would simply remove the above as it will give folks the wrong
impression. The below paragraph can be improved a little, but should
be enough.

> Allow enabling KFENCE by alloc pages after system startup, even if
> KFENCE is not enabled during booting.

The above doesn't parse very well -- my suggestion:
  "Allow enabling KFENCE after system startup by allocating its pool
via the page allocator. This provides the flexibility to enable KFENCE
even if it wasn't enabled at boot time."

> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> ---
> This patch is similar to what the KFENCE(early version) do on ARM64.
> Instead of alloc_pages(), we'd prefer alloc_contig_pages() to get exact
> number of pages.
> I'm not sure about the impact of breaking __ro_after_init. I've tested
> with hackbench, and it seems no performance regression.
> Or any problem about security?

Performance would be the main consideration. However, I think
__read_mostly should be as good as __ro_after_init in terms of
performance.

> ---
>  mm/kfence/core.c | 96 ++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 76 insertions(+), 20 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 19eb123c0bba..ae69b2a113a4 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -93,7 +93,7 @@ static unsigned long kfence_skip_covered_thresh __read_mostly = 75;
>  module_param_named(skip_covered_thresh, kfence_skip_covered_thresh, ulong, 0644);
>
>  /* The pool of pages used for guard pages and objects. */
> -char *__kfence_pool __ro_after_init;
> +char *__kfence_pool __read_mostly;
>  EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */
>
>  /*
> @@ -534,17 +534,18 @@ static void rcu_guarded_free(struct rcu_head *h)
>         kfence_guarded_free((void *)meta->addr, meta, false);
>  }
>
> -static bool __init kfence_init_pool(void)
> +/*
> + * The main part of init kfence pool.

"Initialization of the KFENCE pool after its allocation."

> + * Return 0 if succeed. Otherwise return the address where error occurs.

"Return 0 on success; otherwise returns the address up to which
partial initialization succeeded."

> + */
> +static unsigned long __kfence_init_pool(void)

Keep this function simply named 'kfence_init_pool()' - it's a static
function, and we can be more descriptive with the other function
names.

>  {
>         unsigned long addr = (unsigned long)__kfence_pool;
>         struct page *pages;
>         int i;
>
> -       if (!__kfence_pool)
> -               return false;
> -
>         if (!arch_kfence_init_pool())
> -               goto err;
> +               return addr;
>
>         pages = virt_to_page(addr);
>
> @@ -562,7 +563,7 @@ static bool __init kfence_init_pool(void)
>
>                 /* Verify we do not have a compound head page. */
>                 if (WARN_ON(compound_head(&pages[i]) != &pages[i]))
> -                       goto err;
> +                       return addr;
>
>                 __SetPageSlab(&pages[i]);
>         }
> @@ -575,7 +576,7 @@ static bool __init kfence_init_pool(void)
>          */
>         for (i = 0; i < 2; i++) {
>                 if (unlikely(!kfence_protect(addr)))
> -                       goto err;
> +                       return addr;
>
>                 addr += PAGE_SIZE;
>         }
> @@ -592,7 +593,7 @@ static bool __init kfence_init_pool(void)
>
>                 /* Protect the right redzone. */
>                 if (unlikely(!kfence_protect(addr + PAGE_SIZE)))
> -                       goto err;
> +                       return addr;
>
>                 addr += 2 * PAGE_SIZE;
>         }
> @@ -605,9 +606,21 @@ static bool __init kfence_init_pool(void)
>          */
>         kmemleak_free(__kfence_pool);
>
> -       return true;
> +       return 0;
> +}
> +
> +static bool __init kfence_init_pool(void)

Just call this kfence_init_pool_early().

> +{
> +       unsigned long addr;
> +
> +       if (!__kfence_pool)
> +               return false;
> +
> +       addr = __kfence_init_pool();
> +
> +       if (!addr)
> +               return true;
>
> -err:
>         /*
>          * Only release unprotected pages, and do not try to go back and change
>          * page attributes due to risk of failing to do so as well. If changing
> @@ -620,6 +633,22 @@ static bool __init kfence_init_pool(void)
>         return false;
>  }
>
> +static bool kfence_init_pool_late(void)
> +{
> +       unsigned long addr, free_pages;
> +
> +       addr = __kfence_init_pool();
> +
> +       if (!addr)
> +               return true;
> +
> +       /* Same as above. */
> +       free_pages = (KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool)) / PAGE_SIZE;
> +       free_contig_range(page_to_pfn(virt_to_page(addr)), free_pages);
> +       __kfence_pool = NULL;
> +       return false;
> +}
> +
>  /* === DebugFS Interface ==================================================== */
>
>  static int stats_show(struct seq_file *seq, void *v)
> @@ -768,31 +797,58 @@ void __init kfence_alloc_pool(void)
>                 pr_err("failed to allocate pool\n");
>  }
>
> +static inline void __kfence_init(void)

Don't make this 'inline', I see no reason for it. If the compiler
thinks it's really worth inlining, it'll do it anyway.

Also, just call it 'kfence_init_enable()' (sprinkling '__' everywhere
really doesn't improve readability if we can avoid it).

> +{
> +       if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
> +               static_branch_enable(&kfence_allocation_key);
> +       WRITE_ONCE(kfence_enabled, true);
> +       queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
> +       pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
> +               CONFIG_KFENCE_NUM_OBJECTS, (void *)__kfence_pool,
> +               (void *)(__kfence_pool + KFENCE_POOL_SIZE));
> +}
> +
>  void __init kfence_init(void)
>  {
> +       stack_hash_seed = (u32)random_get_entropy();
> +
>         /* Setting kfence_sample_interval to 0 on boot disables KFENCE. */
>         if (!kfence_sample_interval)
>                 return;
>
> -       stack_hash_seed = (u32)random_get_entropy();
>         if (!kfence_init_pool()) {
>                 pr_err("%s failed\n", __func__);
>                 return;
>         }
>
> -       if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
> -               static_branch_enable(&kfence_allocation_key);
> -       WRITE_ONCE(kfence_enabled, true);
> -       queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
> -       pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
> -               CONFIG_KFENCE_NUM_OBJECTS, (void *)__kfence_pool,
> -               (void *)(__kfence_pool + KFENCE_POOL_SIZE));
> +       __kfence_init();
> +}
> +
> +static int kfence_init_late(void)
> +{
> +       struct page *pages;
> +       const unsigned long nr_pages = KFENCE_POOL_SIZE / PAGE_SIZE;

Order 'nr_pages' above 'pages' (reverse xmas-tree).


> +       pages = alloc_contig_pages(nr_pages, GFP_KERNEL, first_online_node, NULL);
> +
> +       if (!pages)
> +               return -ENOMEM;
> +
> +       __kfence_pool = page_to_virt(pages);
> +
> +       if (!kfence_init_pool_late()) {
> +               pr_err("%s failed\n", __func__);
> +               return -EBUSY;
> +       }
> +
> +       __kfence_init();
> +       return 0;
>  }
>
>  static int kfence_enable_late(void)
>  {
>         if (!__kfence_pool)
> -               return -EINVAL;
> +               return kfence_init_late();
>
>         WRITE_ONCE(kfence_enabled, true);
>         queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
> --
> 2.27.0
>
> --
> You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20220303031505.28495-3-dtcccc%40linux.alibaba.com.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/2] Alloc kfence_pool after system startup
  2022-03-04  2:24     ` Tianchen Ding
@ 2022-03-04 18:14       ` Marco Elver
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Elver @ 2022-03-04 18:14 UTC (permalink / raw)
  To: Tianchen Ding
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	Linux Memory Management List, LKML

On Fri, 4 Mar 2022 at 03:25, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>
> On 2022/3/3 17:30, Marco Elver wrote:
>
> Thanks for your replies.
> I do see setting a large sample_interval means almost disabling KFENCE.
> In fact, my point is to provide a more “flexible” way. Since some Ops
> may be glad to use something like on/off switch than 10000ms interval. :-)

Have you already successfully caught bugs by turning KFENCE on _in
reaction_ to some suspected issues? We really do not think that
switching on KFENCE _after_ having observed a bug, especially on a
completely different machine, is at all reliable.

While your patches are appreciated, I think your usecase doesn't make
sense to us (based on our experience). I think this flexibility is
nice-to-have, so I think the justification just needs changing, to
avoid misleading other folks. Please see comments on the other
patches.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE after system startup
  2022-03-04 18:13   ` Marco Elver
@ 2022-03-05  5:26     ` Tianchen Ding
  2022-03-05  6:06       ` Tianchen Ding
  0 siblings, 1 reply; 11+ messages in thread
From: Tianchen Ding @ 2022-03-05  5:26 UTC (permalink / raw)
  To: Marco Elver
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	linux-mm, linux-kernel

On 2022/3/5 02:13, Marco Elver wrote:
> On Thu, 3 Mar 2022 at 04:15, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>>
>> If once KFENCE is disabled by:
>> echo 0 > /sys/module/kfence/parameters/sample_interval
>> KFENCE could never be re-enabled until next rebooting.
>>
>> Allow re-enabling it by writing a positive num to sample_interval.
>>
>> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> 
> The only problem I see with this is if KFENCE was disabled because of
> a KFENCE_WARN_ON(). See below.
> 
>> ---
>>   mm/kfence/core.c | 16 ++++++++++++++--
>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>> index 13128fa13062..19eb123c0bba 100644
>> --- a/mm/kfence/core.c
>> +++ b/mm/kfence/core.c
>> @@ -55,6 +55,7 @@ EXPORT_SYMBOL_GPL(kfence_sample_interval); /* Export for test modules. */
>>   #endif
>>   #define MODULE_PARAM_PREFIX "kfence."
>>
>> +static int kfence_enable_late(void);
>>   static int param_set_sample_interval(const char *val, const struct kernel_param *kp)
>>   {
>>          unsigned long num;
>> @@ -65,10 +66,11 @@ static int param_set_sample_interval(const char *val, const struct kernel_param
>>
>>          if (!num) /* Using 0 to indicate KFENCE is disabled. */
>>                  WRITE_ONCE(kfence_enabled, false);
>> -       else if (!READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
>> -               return -EINVAL; /* Cannot (re-)enable KFENCE on-the-fly. */
>>
>>          *((unsigned long *)kp->arg) = num;
>> +
>> +       if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
> 
> Should probably have an 'old_sample_interval = *((unsigned long
> *)kp->arg)' somewhere before, and add a '&& !old_sample_interval',
> because if old_sample_interval!=0 then KFENCE was disabled due to a
> KFENCE_WARN_ON(). Also in this case, it should return -EINVAL. So you
> want a flow like this:
> 
> old_sample_interval = ...;
> ...
> if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
>    return old_sample_interval ? -EINVAL : kfence_enable_late();
> ...
> 

Because sample_interval will used by delayed_work, we must put setting 
sample_interval before enabling KFENCE.
So the order would be:

old_sample_interval = sample_interval;
sample_interval = num;
if (...) kfence_enable_late();

This may be bypassed after KFENCE_WARN_ON() happens, if we first write 
0, and then write 100 to it.

How about this one:

	if (ret < 0)
		return ret;

+	/* Cannot set sample_interval after KFENCE_WARN_ON(). */
+	if (unlikely(*((unsigned long *)kp->arg) && !READ_ONCE(kfence_enabled)))
+		return -EINVAL;
+
	if (!num) /* Using 0 to indicate KFENCE is disabled. */
		WRITE_ONCE(kfence_enabled, false);

> Thanks,
> -- Marco


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE after system startup
  2022-03-05  5:26     ` Tianchen Ding
@ 2022-03-05  6:06       ` Tianchen Ding
  2022-03-05  9:36         ` Marco Elver
  0 siblings, 1 reply; 11+ messages in thread
From: Tianchen Ding @ 2022-03-05  6:06 UTC (permalink / raw)
  To: Marco Elver
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	linux-mm, linux-kernel

On 2022/3/5 13:26, Tianchen Ding wrote:
> On 2022/3/5 02:13, Marco Elver wrote:
>> On Thu, 3 Mar 2022 at 04:15, Tianchen Ding <dtcccc@linux.alibaba.com> 
>> wrote:
>>>
>>> If once KFENCE is disabled by:
>>> echo 0 > /sys/module/kfence/parameters/sample_interval
>>> KFENCE could never be re-enabled until next rebooting.
>>>
>>> Allow re-enabling it by writing a positive num to sample_interval.
>>>
>>> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
>>
>> The only problem I see with this is if KFENCE was disabled because of
>> a KFENCE_WARN_ON(). See below.
>>
>>> ---
>>>   mm/kfence/core.c | 16 ++++++++++++++--
>>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>>> index 13128fa13062..19eb123c0bba 100644
>>> --- a/mm/kfence/core.c
>>> +++ b/mm/kfence/core.c
>>> @@ -55,6 +55,7 @@ EXPORT_SYMBOL_GPL(kfence_sample_interval); /* 
>>> Export for test modules. */
>>>   #endif
>>>   #define MODULE_PARAM_PREFIX "kfence."
>>>
>>> +static int kfence_enable_late(void);
>>>   static int param_set_sample_interval(const char *val, const struct 
>>> kernel_param *kp)
>>>   {
>>>          unsigned long num;
>>> @@ -65,10 +66,11 @@ static int param_set_sample_interval(const char 
>>> *val, const struct kernel_param
>>>
>>>          if (!num) /* Using 0 to indicate KFENCE is disabled. */
>>>                  WRITE_ONCE(kfence_enabled, false);
>>> -       else if (!READ_ONCE(kfence_enabled) && system_state != 
>>> SYSTEM_BOOTING)
>>> -               return -EINVAL; /* Cannot (re-)enable KFENCE 
>>> on-the-fly. */
>>>
>>>          *((unsigned long *)kp->arg) = num;
>>> +
>>> +       if (num && !READ_ONCE(kfence_enabled) && system_state != 
>>> SYSTEM_BOOTING)
>>
>> Should probably have an 'old_sample_interval = *((unsigned long
>> *)kp->arg)' somewhere before, and add a '&& !old_sample_interval',
>> because if old_sample_interval!=0 then KFENCE was disabled due to a
>> KFENCE_WARN_ON(). Also in this case, it should return -EINVAL. So you
>> want a flow like this:
>>
>> old_sample_interval = ...;
>> ...
>> if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
>>    return old_sample_interval ? -EINVAL : kfence_enable_late();
>> ...
>>
> 
> Because sample_interval will used by delayed_work, we must put setting 
> sample_interval before enabling KFENCE.
> So the order would be:
> 
> old_sample_interval = sample_interval;
> sample_interval = num;
> if (...) kfence_enable_late();
> 
> This may be bypassed after KFENCE_WARN_ON() happens, if we first write 
> 0, and then write 100 to it.
> 
> How about this one:
> 
>      if (ret < 0)
>          return ret;
> 
> +    /* Cannot set sample_interval after KFENCE_WARN_ON(). */
> +    if (unlikely(*((unsigned long *)kp->arg) && 
> !READ_ONCE(kfence_enabled)))
> +        return -EINVAL;
> +
>      if (!num) /* Using 0 to indicate KFENCE is disabled. */
>          WRITE_ONCE(kfence_enabled, false);
> 

Hmm...
I found KFENCE_WARN_ON() may be called when sample_interval==0. (e.g., 
kfence_guarded_free())
So it's better to add a bool.

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index ae69b2a113a4..c729be0207e8 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -38,14 +38,17 @@
  #define KFENCE_WARN_ON(cond) 
          \
  	({                                                                     \
  		const bool __cond = WARN_ON(cond);                             \
-		if (unlikely(__cond))                                          \
+		if (unlikely(__cond)) {                                        \
  			WRITE_ONCE(kfence_enabled, false);                     \
+			disabled_by_warn = true;                               \
+		}                                                              \
  		__cond;                                                        \
  	})

  /* === Data 
================================================================= */

  static bool kfence_enabled __read_mostly;
+static bool disabled_by_warn __read_mostly;

  unsigned long kfence_sample_interval __read_mostly = 
CONFIG_KFENCE_SAMPLE_INTERVAL;
  EXPORT_SYMBOL_GPL(kfence_sample_interval); /* Export for test modules. */
@@ -70,7 +73,7 @@ static int param_set_sample_interval(const char *val, 
const struct kernel_param
  	*((unsigned long *)kp->arg) = num;

  	if (num && !READ_ONCE(kfence_enabled) && system_state != SYSTEM_BOOTING)
-		return kfence_enable_late();
+		return disabled_by_warn ? -EINVAL : kfence_enable_late();
  	return 0;
  }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE after system startup
  2022-03-05  6:06       ` Tianchen Ding
@ 2022-03-05  9:36         ` Marco Elver
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Elver @ 2022-03-05  9:36 UTC (permalink / raw)
  To: Tianchen Ding
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrew Morton, kasan-dev,
	linux-mm, linux-kernel

On Sat, 5 Mar 2022 at 07:06, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
[...]
> Hmm...
> I found KFENCE_WARN_ON() may be called when sample_interval==0. (e.g.,
> kfence_guarded_free())
> So it's better to add a bool.

Yes, that's probably safer and easier.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-05  9:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-03  3:15 [RFC PATCH 0/2] Alloc kfence_pool after system startup Tianchen Ding
2022-03-03  3:15 ` [RFC PATCH 1/2] kfence: Allow re-enabling KFENCE " Tianchen Ding
2022-03-04 18:13   ` Marco Elver
2022-03-05  5:26     ` Tianchen Ding
2022-03-05  6:06       ` Tianchen Ding
2022-03-05  9:36         ` Marco Elver
2022-03-03  3:15 ` [RFC PATCH 2/2] kfence: Alloc kfence_pool " Tianchen Ding
2022-03-04 18:14   ` Marco Elver
     [not found] ` <CAG_fn=Wd5GMFojbvdZkysBQ5Auy5YYRdmZfjSVMq8gpDMRZ_3w@mail.gmail.com>
2022-03-03  9:30   ` [RFC PATCH 0/2] " Marco Elver
2022-03-04  2:24     ` Tianchen Ding
2022-03-04 18:14       ` Marco Elver

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).