Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-16 21:33 Cong Wang
  2019-04-16 21:33 ` [PATCH v2 1/2] " Cong Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 21:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-edac, Cong Wang, Tony Luck, Borislav Petkov, Thomas Gleixner

ce_arr.array[] is always within the range [0, ce_arr.n-1].
However, the binary search code in __find_elem() uses ce_arr.n
as the maximum index, which could lead to an off-by-one
out-of-bound access right after the while loop. In this case,
we should not even read it, just return -ENOKEY instead.

Note, this could cause a kernel crash if ce_arr.n is exactly
MAX_ELEMS.

Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 drivers/ras/cec.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 2d9ec378a8bc..a4ff54e50673 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -204,10 +204,11 @@ static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
 	if (to)
 		*to = min;
 
-	this_pfn = PFN(ca->array[min]);
-
-	if (this_pfn == pfn)
-		return min;
+	if (min < ca->n) {
+		this_pfn = PFN(ca->array[min]);
+		if (this_pfn == pfn)
+			return min;
+	}
 
 	return -ENOKEY;
 }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-16 21:33 [v2,1/2] ras: fix an off-by-one error in __find_elem() Cong Wang
@ 2019-04-16 21:33 ` " Cong Wang
  2019-04-16 21:33 ` [v2,2/2] ras: close the race condition with timer Cong Wang
  2019-04-16 21:46 ` [v2,1/2] ras: fix an off-by-one error in __find_elem() Borislav Petkov
  2 siblings, 0 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 21:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-edac, Cong Wang, Tony Luck, Borislav Petkov, Thomas Gleixner

ce_arr.array[] is always within the range [0, ce_arr.n-1].
However, the binary search code in __find_elem() uses ce_arr.n
as the maximum index, which could lead to an off-by-one
out-of-bound access right after the while loop. In this case,
we should not even read it, just return -ENOKEY instead.

Note, this could cause a kernel crash if ce_arr.n is exactly
MAX_ELEMS.

Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 drivers/ras/cec.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 2d9ec378a8bc..a4ff54e50673 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -204,10 +204,11 @@ static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
 	if (to)
 		*to = min;
 
-	this_pfn = PFN(ca->array[min]);
-
-	if (this_pfn == pfn)
-		return min;
+	if (min < ca->n) {
+		this_pfn = PFN(ca->array[min]);
+		if (this_pfn == pfn)
+			return min;
+	}
 
 	return -ENOKEY;
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,2/2] ras: close the race condition with timer
@ 2019-04-16 21:33 ` Cong Wang
  2019-04-16 21:33   ` [PATCH v2 2/2] " Cong Wang
  2019-06-08 15:28   ` [tip:ras/urgent] RAS/CEC: Convert the timer callback to a workqueue tip-bot for Cong Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 21:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-edac, Cong Wang, Tony Luck, Borislav Petkov, Thomas Gleixner

cec_timer_fn() is a timer callback which reads ce_arr.array[]
and updates its decay values. Elements could be added to or
removed from this global array in parallel, although the array
itself will not grow or shrink. del_lru_elem_unlocked() uses
FULL_COUNT() as a key to find a right element to remove,
which could be affected by the parallel timer.

Fix this by converting the timer to a delayed work as suggested
by Borislav, to avoid using spinlock.

Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 drivers/ras/cec.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index a4ff54e50673..5c2040a7389d 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -2,6 +2,7 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/kernel.h>
+#include <linux/workqueue.h>
 
 #include <asm/mce.h>
 
@@ -131,7 +132,7 @@ static unsigned int count_threshold = COUNT_MASK;
 #define CEC_TIMER_DEFAULT_INTERVAL	24 * 60 * 60	/* 24 hrs */
 #define CEC_TIMER_MIN_INTERVAL		 1 * 60 * 60	/* 1h */
 #define CEC_TIMER_MAX_INTERVAL	   30 *	24 * 60 * 60	/* one month */
-static struct timer_list cec_timer;
+static struct delayed_work cec_work;
 static u64 timer_interval = CEC_TIMER_DEFAULT_INTERVAL;
 
 /*
@@ -160,20 +161,21 @@ static void do_spring_cleaning(struct ce_array *ca)
 /*
  * @interval in seconds
  */
-static void cec_mod_timer(struct timer_list *t, unsigned long interval)
+static void cec_mod_work(unsigned long interval)
 {
 	unsigned long iv;
 
-	iv = interval * HZ + jiffies;
-
-	mod_timer(t, round_jiffies(iv));
+	iv = interval * HZ;
+	mod_delayed_work(system_wq, &cec_work, round_jiffies(iv));
 }
 
-static void cec_timer_fn(struct timer_list *unused)
+static void cec_work_fn(struct work_struct *work)
 {
+	mutex_lock(&ce_mutex);
 	do_spring_cleaning(&ce_arr);
+	mutex_unlock(&ce_mutex);
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(timer_interval);
 }
 
 /*
@@ -383,7 +385,7 @@ static int decay_interval_set(void *data, u64 val)
 
 	timer_interval = val;
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(timer_interval);
 	return 0;
 }
 DEFINE_DEBUGFS_ATTRIBUTE(decay_interval_ops, u64_get, decay_interval_set, "%lld\n");
@@ -509,8 +511,8 @@ void __init cec_init(void)
 	if (create_debugfs_nodes())
 		return;
 
-	timer_setup(&cec_timer, cec_timer_fn, 0);
-	cec_mod_timer(&cec_timer, CEC_TIMER_DEFAULT_INTERVAL);
+	INIT_DELAYED_WORK(&cec_work, cec_work_fn);
+	schedule_delayed_work(&cec_work, CEC_TIMER_DEFAULT_INTERVAL);
 
 	pr_info("Correctable Errors collector initialized.\n");
 }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 2/2] ras: close the race condition with timer
  2019-04-16 21:33 ` [v2,2/2] ras: close the race condition with timer Cong Wang
@ 2019-04-16 21:33   ` " Cong Wang
  2019-06-08 15:28   ` [tip:ras/urgent] RAS/CEC: Convert the timer callback to a workqueue tip-bot for Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 21:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-edac, Cong Wang, Tony Luck, Borislav Petkov, Thomas Gleixner

cec_timer_fn() is a timer callback which reads ce_arr.array[]
and updates its decay values. Elements could be added to or
removed from this global array in parallel, although the array
itself will not grow or shrink. del_lru_elem_unlocked() uses
FULL_COUNT() as a key to find a right element to remove,
which could be affected by the parallel timer.

Fix this by converting the timer to a delayed work as suggested
by Borislav, to avoid using spinlock.

Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 drivers/ras/cec.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index a4ff54e50673..5c2040a7389d 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -2,6 +2,7 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/kernel.h>
+#include <linux/workqueue.h>
 
 #include <asm/mce.h>
 
@@ -131,7 +132,7 @@ static unsigned int count_threshold = COUNT_MASK;
 #define CEC_TIMER_DEFAULT_INTERVAL	24 * 60 * 60	/* 24 hrs */
 #define CEC_TIMER_MIN_INTERVAL		 1 * 60 * 60	/* 1h */
 #define CEC_TIMER_MAX_INTERVAL	   30 *	24 * 60 * 60	/* one month */
-static struct timer_list cec_timer;
+static struct delayed_work cec_work;
 static u64 timer_interval = CEC_TIMER_DEFAULT_INTERVAL;
 
 /*
@@ -160,20 +161,21 @@ static void do_spring_cleaning(struct ce_array *ca)
 /*
  * @interval in seconds
  */
-static void cec_mod_timer(struct timer_list *t, unsigned long interval)
+static void cec_mod_work(unsigned long interval)
 {
 	unsigned long iv;
 
-	iv = interval * HZ + jiffies;
-
-	mod_timer(t, round_jiffies(iv));
+	iv = interval * HZ;
+	mod_delayed_work(system_wq, &cec_work, round_jiffies(iv));
 }
 
-static void cec_timer_fn(struct timer_list *unused)
+static void cec_work_fn(struct work_struct *work)
 {
+	mutex_lock(&ce_mutex);
 	do_spring_cleaning(&ce_arr);
+	mutex_unlock(&ce_mutex);
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(timer_interval);
 }
 
 /*
@@ -383,7 +385,7 @@ static int decay_interval_set(void *data, u64 val)
 
 	timer_interval = val;
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(timer_interval);
 	return 0;
 }
 DEFINE_DEBUGFS_ATTRIBUTE(decay_interval_ops, u64_get, decay_interval_set, "%lld\n");
@@ -509,8 +511,8 @@ void __init cec_init(void)
 	if (create_debugfs_nodes())
 		return;
 
-	timer_setup(&cec_timer, cec_timer_fn, 0);
-	cec_mod_timer(&cec_timer, CEC_TIMER_DEFAULT_INTERVAL);
+	INIT_DELAYED_WORK(&cec_work, cec_work_fn);
+	schedule_delayed_work(&cec_work, CEC_TIMER_DEFAULT_INTERVAL);
 
 	pr_info("Correctable Errors collector initialized.\n");
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-16 21:46 ` Borislav Petkov
  2019-04-16 21:46   ` [PATCH v2 1/2] " Borislav Petkov
  2019-04-16 23:16   ` [v2,1/2] " Cong Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-16 21:46 UTC (permalink / raw)
  To: Cong Wang; +Cc: linux-kernel, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 02:33:50PM -0700, Cong Wang wrote:
> ce_arr.array[] is always within the range [0, ce_arr.n-1].
> However, the binary search code in __find_elem() uses ce_arr.n
> as the maximum index, which could lead to an off-by-one
> out-of-bound access right after the while loop. In this case,
> we should not even read it, just return -ENOKEY instead.
> 
> Note, this could cause a kernel crash if ce_arr.n is exactly
> MAX_ELEMS.

"Could cause"?

I'm still waiting for a demonstration. You can build a case through
writing values in the debugfs nodes I pointed you at or even with a
patch ontop preparing the exact conditions for it to crash. And then
give me that "recipe" to trigger it here in a VM.

Thx.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-16 21:46 ` [v2,1/2] ras: fix an off-by-one error in __find_elem() Borislav Petkov
@ 2019-04-16 21:46   ` " Borislav Petkov
  2019-04-16 23:16   ` [v2,1/2] " Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-16 21:46 UTC (permalink / raw)
  To: Cong Wang; +Cc: linux-kernel, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 02:33:50PM -0700, Cong Wang wrote:
> ce_arr.array[] is always within the range [0, ce_arr.n-1].
> However, the binary search code in __find_elem() uses ce_arr.n
> as the maximum index, which could lead to an off-by-one
> out-of-bound access right after the while loop. In this case,
> we should not even read it, just return -ENOKEY instead.
> 
> Note, this could cause a kernel crash if ce_arr.n is exactly
> MAX_ELEMS.

"Could cause"?

I'm still waiting for a demonstration. You can build a case through
writing values in the debugfs nodes I pointed you at or even with a
patch ontop preparing the exact conditions for it to crash. And then
give me that "recipe" to trigger it here in a VM.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-16 23:16   ` " Cong Wang
  2019-04-16 23:16     ` [PATCH v2 1/2] " Cong Wang
  2019-04-20 11:34     ` [v2,1/2] " Borislav Petkov
  0 siblings, 2 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 23:16 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 2:46 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 16, 2019 at 02:33:50PM -0700, Cong Wang wrote:
> > ce_arr.array[] is always within the range [0, ce_arr.n-1].
> > However, the binary search code in __find_elem() uses ce_arr.n
> > as the maximum index, which could lead to an off-by-one
> > out-of-bound access right after the while loop. In this case,
> > we should not even read it, just return -ENOKEY instead.
> >
> > Note, this could cause a kernel crash if ce_arr.n is exactly
> > MAX_ELEMS.
>
> "Could cause"?
>
> I'm still waiting for a demonstration. You can build a case through
> writing values in the debugfs nodes I pointed you at or even with a
> patch ontop preparing the exact conditions for it to crash. And then
> give me that "recipe" to trigger it here in a VM.

It is actually fairly easy:

1) Fill the whole page with PFN's:
for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done

2) Set thresh to 1 in order to trigger the deletion:
echo 1 > /sys/kernel/debug/ras/cec/count_threshold

3) Repeatedly add and remove the last element:
echo 512 >> /sys/kernel/debug/ras/cec/pfn
(until you get a crash.)

In case you still don't get it, here it is:

[   57.732593] BUG: unable to handle kernel paging request at ffff9c667bca0000
[   57.734994] #PF error: [PROT] [WRITE]
[   57.735891] PGD 75601067 P4D 75601067 PUD 75605067 PMD 7bca1063 PTE
800000007bca0061
[   57.737702] Oops: 0003 [#1] SMP PTI
[   57.738533] CPU: 0 PID: 649 Comm: bash Not tainted 5.1.0-rc5+ #561
[   57.739965] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29
04/01/2014
[   57.742892] RIP: 0010:__memmove+0x57/0x1a0
[   57.743853] Code: 00 72 05 40 38 fe 74 3b 48 83 ea 20 48 83 ea 20
4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c
89 57 08 <4c> 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2
00 00
[   57.748150] RSP: 0018:ffffbe2ec0c8bdf8 EFLAGS: 00010206
[   57.749371] RAX: ffff9c667a5c1ff0 RBX: 0000000000000001 RCX: 0000000000000ff8
[   57.751018] RDX: 00000007fe921fb8 RSI: ffff9c667bca0018 RDI: ffff9c667bc9fff0
[   57.752674] RBP: 0000000000000200 R08: 0000000000000000 R09: 0000015c00000000
[   57.754325] R10: 0000000000040001 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000004
[   57.755976] R13: ffff9c6671787778 R14: ffff9c6671787728 R15: ffff9c6671787750
[   57.757631] FS:  00007f33ca294740(0000) GS:ffff9c667d800000(0000)
knlGS:0000000000000000
[   57.759689] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.761023] CR2: ffff9c667bca0000 CR3: 000000007061e000 CR4: 00000000000406f0
[   57.762681] Call Trace:
[   57.763275]  del_elem.constprop.1+0x39/0x40
[   57.764260]  cec_add_elem+0x1e4/0x211
[   57.765129]  simple_attr_write+0xa2/0xc3
[   57.766057]  debugfs_attr_write+0x45/0x5c
[   57.767005]  full_proxy_write+0x4b/0x65
[   57.767911]  ? full_proxy_poll+0x50/0x50
[   57.768844]  vfs_write+0xb8/0xf5
[   57.769613]  ksys_write+0x6b/0xb8
[   57.770407]  do_syscall_64+0x57/0x65
[   57.771249]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

I will leave it as a homework for explaining why the crash is inside
memmove(). ;)

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-16 23:16   ` [v2,1/2] " Cong Wang
@ 2019-04-16 23:16     ` " Cong Wang
  2019-04-20 11:34     ` [v2,1/2] " Borislav Petkov
  1 sibling, 0 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-16 23:16 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 2:46 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 16, 2019 at 02:33:50PM -0700, Cong Wang wrote:
> > ce_arr.array[] is always within the range [0, ce_arr.n-1].
> > However, the binary search code in __find_elem() uses ce_arr.n
> > as the maximum index, which could lead to an off-by-one
> > out-of-bound access right after the while loop. In this case,
> > we should not even read it, just return -ENOKEY instead.
> >
> > Note, this could cause a kernel crash if ce_arr.n is exactly
> > MAX_ELEMS.
>
> "Could cause"?
>
> I'm still waiting for a demonstration. You can build a case through
> writing values in the debugfs nodes I pointed you at or even with a
> patch ontop preparing the exact conditions for it to crash. And then
> give me that "recipe" to trigger it here in a VM.

It is actually fairly easy:

1) Fill the whole page with PFN's:
for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done

2) Set thresh to 1 in order to trigger the deletion:
echo 1 > /sys/kernel/debug/ras/cec/count_threshold

3) Repeatedly add and remove the last element:
echo 512 >> /sys/kernel/debug/ras/cec/pfn
(until you get a crash.)

In case you still don't get it, here it is:

[   57.732593] BUG: unable to handle kernel paging request at ffff9c667bca0000
[   57.734994] #PF error: [PROT] [WRITE]
[   57.735891] PGD 75601067 P4D 75601067 PUD 75605067 PMD 7bca1063 PTE
800000007bca0061
[   57.737702] Oops: 0003 [#1] SMP PTI
[   57.738533] CPU: 0 PID: 649 Comm: bash Not tainted 5.1.0-rc5+ #561
[   57.739965] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29
04/01/2014
[   57.742892] RIP: 0010:__memmove+0x57/0x1a0
[   57.743853] Code: 00 72 05 40 38 fe 74 3b 48 83 ea 20 48 83 ea 20
4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c
89 57 08 <4c> 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2
00 00
[   57.748150] RSP: 0018:ffffbe2ec0c8bdf8 EFLAGS: 00010206
[   57.749371] RAX: ffff9c667a5c1ff0 RBX: 0000000000000001 RCX: 0000000000000ff8
[   57.751018] RDX: 00000007fe921fb8 RSI: ffff9c667bca0018 RDI: ffff9c667bc9fff0
[   57.752674] RBP: 0000000000000200 R08: 0000000000000000 R09: 0000015c00000000
[   57.754325] R10: 0000000000040001 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000004
[   57.755976] R13: ffff9c6671787778 R14: ffff9c6671787728 R15: ffff9c6671787750
[   57.757631] FS:  00007f33ca294740(0000) GS:ffff9c667d800000(0000)
knlGS:0000000000000000
[   57.759689] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.761023] CR2: ffff9c667bca0000 CR3: 000000007061e000 CR4: 00000000000406f0
[   57.762681] Call Trace:
[   57.763275]  del_elem.constprop.1+0x39/0x40
[   57.764260]  cec_add_elem+0x1e4/0x211
[   57.765129]  simple_attr_write+0xa2/0xc3
[   57.766057]  debugfs_attr_write+0x45/0x5c
[   57.767005]  full_proxy_write+0x4b/0x65
[   57.767911]  ? full_proxy_poll+0x50/0x50
[   57.768844]  vfs_write+0xb8/0xf5
[   57.769613]  ksys_write+0x6b/0xb8
[   57.770407]  do_syscall_64+0x57/0x65
[   57.771249]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

I will leave it as a homework for explaining why the crash is inside
memmove(). ;)

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-20 11:34     ` " Borislav Petkov
  2019-04-20 11:34       ` [PATCH v2 1/2] " Borislav Petkov
  2019-04-20 18:25       ` [v2,1/2] " Cong Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-20 11:34 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 04:16:08PM -0700, Cong Wang wrote:
> It is actually fairly easy:
> 
> 1) Fill the whole page with PFN's:
> for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done
> 
> 2) Set thresh to 1 in order to trigger the deletion:
> echo 1 > /sys/kernel/debug/ras/cec/count_threshold
> 
> 3) Repeatedly add and remove the last element:
> echo 512 >> /sys/kernel/debug/ras/cec/pfn
> (until you get a crash.)

Thanks, I was able to reproduce with that. Below is a conglomerate patch
which converts __find_elem() to using Donald Knuth's binary search
version. I don't know what I was thinking then and why I didn't use
it from the get-go. The textbook even says that one can easily get it
wrong...

Anyway, see below. It has some debug output for easier debugging, that
will be removed in the final version, of course. With it, the injection
pattern above works as expected and I'll continue hammering on it to see
if there are more funky issues.

For easier experimenting, the whole branch is also here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-ras-core-cec

Thx.
---
From: Borislav Petkov <bp@suse.de>
Date: Sat, 20 Apr 2019 13:27:51 +0200
Subject: [PATCH] WIP

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/ras/cec.c | 83 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 3e9f62b84378..946e52751ae2 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -185,31 +185,42 @@ static void cec_work_fn(struct work_struct *work)
  */
 static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
 {
+	int min = 0, max = ca->n - 1;
 	u64 this_pfn;
-	int min = 0, max = ca->n;
 
-	while (min < max) {
-		int tmp = (max + min) >> 1;
+	pr_info("%s: entry %llu in [%d:%d]?\n", __func__, pfn, min, max);
 
-		this_pfn = PFN(ca->array[tmp]);
+	while (min <= max) {
+		int i = (min + max) >> 1;
+
+		this_pfn = PFN(ca->array[i]);
+
+		pr_info("%s: ... this_pfn: %3llu, [%3d:%3d:%3d]\n",
+			__func__, this_pfn, min, i, max);
 
 		if (this_pfn < pfn)
-			min = tmp + 1;
+			min = i + 1;
 		else if (this_pfn > pfn)
-			max = tmp;
-		else {
-			min = tmp;
-			break;
+			max = i - 1;
+		else if (this_pfn == pfn) {
+			pr_info("%s: .. FOUND at %d\n", __func__, i);
+
+			if (to)
+				*to = i;
+
+			return i;
 		}
 	}
 
-	if (to)
-		*to = min;
-
-	this_pfn = PFN(ca->array[min]);
+	/*
+	 * When the loop terminates without finding @pfn, min has the index of
+	 * the element slot where the new @pfn should be inserted.
+	 */
 
-	if (this_pfn == pfn)
-		return min;
+	if (to) {
+		*to = min;
+		pr_info("%s: [%d:%d]\n", __func__, min, max);
+	}
 
 	return -ENOKEY;
 }
@@ -234,6 +245,8 @@ static void del_elem(struct ce_array *ca, int idx)
 			(ca->n - (idx + 1)) * sizeof(u64));
 
 	ca->n--;
+
+	pr_info("%s: idx: %d, ca->n: %d\n", __func__, idx, ca->n);
 }
 
 static u64 del_lru_elem_unlocked(struct ce_array *ca)
@@ -274,13 +287,43 @@ static u64 __maybe_unused del_lru_elem(void)
 	return pfn;
 }
 
+static bool sanity_check(struct ce_array *ca)
+{
+	bool ret = false;
+	u64 prev = 0;
+	int i;
+
+	for (i = 0; i < ca->n; i++) {
+		u64 this = PFN(ca->array[i]);
+
+		if (WARN(prev > this, "prev: 0x%016llx <-> this: 0x%016llx\n", prev, this))
+			ret = true;
+
+		prev = this;
+	}
+
+	if (!ret)
+		return ret;
+
+	pr_info("Sanity check:\n{ n: %d\n", ca->n);
+	for (i = 0; i < ca->n; i++) {
+		u64 this = PFN(ca->array[i]);
+		pr_info(" %03d: [%016llx|%03llx]\n", i, this, FULL_COUNT(ca->array[i]));
+	}
+	pr_info("}\n");
+
+	return ret;
+}
 
 int cec_add_elem(u64 pfn)
 {
 	struct ce_array *ca = &ce_arr;
-	unsigned int to;
+	unsigned int to = 0;
 	int count, ret = 0;
 
+	pr_info("===============================================\n");
+	pr_info("%s: entry, pfn: %lld, ca->n: %d\n", __func__, pfn, ca->n);
+
 	/*
 	 * We can be called very early on the identify_cpu() path where we are
 	 * not initialized yet. We ignore the error for simplicity.
@@ -296,6 +339,7 @@ int cec_add_elem(u64 pfn)
 		WARN_ON(!del_lru_elem_unlocked(ca));
 
 	ret = find_elem(ca, pfn, &to);
+	pr_info("%s: find_elem: ret: %d: to: %d\n", __func__, ret, to);
 	if (ret < 0) {
 		/*
 		 * Shift range [to-end] to make room for one more element.
@@ -350,6 +394,8 @@ int cec_add_elem(u64 pfn)
 	if (ca->decay_count >= CLEAN_ELEMS)
 		do_spring_cleaning(ca);
 
+	WARN_ON_ONCE(sanity_check(ca));
+
 unlock:
 	mutex_unlock(&ce_mutex);
 
@@ -407,7 +453,6 @@ DEFINE_DEBUGFS_ATTRIBUTE(count_threshold_ops, pfn_get, count_threshold_set, "%ll
 static int array_dump(struct seq_file *m, void *v)
 {
 	struct ce_array *ca = &ce_arr;
-	u64 prev = 0;
 	int i;
 
 	mutex_lock(&ce_mutex);
@@ -417,10 +462,6 @@ static int array_dump(struct seq_file *m, void *v)
 		u64 this = PFN(ca->array[i]);
 
 		seq_printf(m, " %03d: [%016llx|%03llx]\n", i, this, FULL_COUNT(ca->array[i]));
-
-		WARN_ON(prev > this);
-
-		prev = this;
 	}
 
 	seq_printf(m, "}\n");

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-20 11:34     ` [v2,1/2] " Borislav Petkov
@ 2019-04-20 11:34       ` " Borislav Petkov
  2019-04-20 18:25       ` [v2,1/2] " Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-20 11:34 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Tue, Apr 16, 2019 at 04:16:08PM -0700, Cong Wang wrote:
> It is actually fairly easy:
> 
> 1) Fill the whole page with PFN's:
> for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done
> 
> 2) Set thresh to 1 in order to trigger the deletion:
> echo 1 > /sys/kernel/debug/ras/cec/count_threshold
> 
> 3) Repeatedly add and remove the last element:
> echo 512 >> /sys/kernel/debug/ras/cec/pfn
> (until you get a crash.)

Thanks, I was able to reproduce with that. Below is a conglomerate patch
which converts __find_elem() to using Donald Knuth's binary search
version. I don't know what I was thinking then and why I didn't use
it from the get-go. The textbook even says that one can easily get it
wrong...

Anyway, see below. It has some debug output for easier debugging, that
will be removed in the final version, of course. With it, the injection
pattern above works as expected and I'll continue hammering on it to see
if there are more funky issues.

For easier experimenting, the whole branch is also here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-ras-core-cec

Thx.

---
From: Borislav Petkov <bp@suse.de>
Date: Sat, 20 Apr 2019 13:27:51 +0200
Subject: [PATCH] WIP

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/ras/cec.c | 83 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 3e9f62b84378..946e52751ae2 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -185,31 +185,42 @@ static void cec_work_fn(struct work_struct *work)
  */
 static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
 {
+	int min = 0, max = ca->n - 1;
 	u64 this_pfn;
-	int min = 0, max = ca->n;
 
-	while (min < max) {
-		int tmp = (max + min) >> 1;
+	pr_info("%s: entry %llu in [%d:%d]?\n", __func__, pfn, min, max);
 
-		this_pfn = PFN(ca->array[tmp]);
+	while (min <= max) {
+		int i = (min + max) >> 1;
+
+		this_pfn = PFN(ca->array[i]);
+
+		pr_info("%s: ... this_pfn: %3llu, [%3d:%3d:%3d]\n",
+			__func__, this_pfn, min, i, max);
 
 		if (this_pfn < pfn)
-			min = tmp + 1;
+			min = i + 1;
 		else if (this_pfn > pfn)
-			max = tmp;
-		else {
-			min = tmp;
-			break;
+			max = i - 1;
+		else if (this_pfn == pfn) {
+			pr_info("%s: .. FOUND at %d\n", __func__, i);
+
+			if (to)
+				*to = i;
+
+			return i;
 		}
 	}
 
-	if (to)
-		*to = min;
-
-	this_pfn = PFN(ca->array[min]);
+	/*
+	 * When the loop terminates without finding @pfn, min has the index of
+	 * the element slot where the new @pfn should be inserted.
+	 */
 
-	if (this_pfn == pfn)
-		return min;
+	if (to) {
+		*to = min;
+		pr_info("%s: [%d:%d]\n", __func__, min, max);
+	}
 
 	return -ENOKEY;
 }
@@ -234,6 +245,8 @@ static void del_elem(struct ce_array *ca, int idx)
 			(ca->n - (idx + 1)) * sizeof(u64));
 
 	ca->n--;
+
+	pr_info("%s: idx: %d, ca->n: %d\n", __func__, idx, ca->n);
 }
 
 static u64 del_lru_elem_unlocked(struct ce_array *ca)
@@ -274,13 +287,43 @@ static u64 __maybe_unused del_lru_elem(void)
 	return pfn;
 }
 
+static bool sanity_check(struct ce_array *ca)
+{
+	bool ret = false;
+	u64 prev = 0;
+	int i;
+
+	for (i = 0; i < ca->n; i++) {
+		u64 this = PFN(ca->array[i]);
+
+		if (WARN(prev > this, "prev: 0x%016llx <-> this: 0x%016llx\n", prev, this))
+			ret = true;
+
+		prev = this;
+	}
+
+	if (!ret)
+		return ret;
+
+	pr_info("Sanity check:\n{ n: %d\n", ca->n);
+	for (i = 0; i < ca->n; i++) {
+		u64 this = PFN(ca->array[i]);
+		pr_info(" %03d: [%016llx|%03llx]\n", i, this, FULL_COUNT(ca->array[i]));
+	}
+	pr_info("}\n");
+
+	return ret;
+}
 
 int cec_add_elem(u64 pfn)
 {
 	struct ce_array *ca = &ce_arr;
-	unsigned int to;
+	unsigned int to = 0;
 	int count, ret = 0;
 
+	pr_info("===============================================\n");
+	pr_info("%s: entry, pfn: %lld, ca->n: %d\n", __func__, pfn, ca->n);
+
 	/*
 	 * We can be called very early on the identify_cpu() path where we are
 	 * not initialized yet. We ignore the error for simplicity.
@@ -296,6 +339,7 @@ int cec_add_elem(u64 pfn)
 		WARN_ON(!del_lru_elem_unlocked(ca));
 
 	ret = find_elem(ca, pfn, &to);
+	pr_info("%s: find_elem: ret: %d: to: %d\n", __func__, ret, to);
 	if (ret < 0) {
 		/*
 		 * Shift range [to-end] to make room for one more element.
@@ -350,6 +394,8 @@ int cec_add_elem(u64 pfn)
 	if (ca->decay_count >= CLEAN_ELEMS)
 		do_spring_cleaning(ca);
 
+	WARN_ON_ONCE(sanity_check(ca));
+
 unlock:
 	mutex_unlock(&ce_mutex);
 
@@ -407,7 +453,6 @@ DEFINE_DEBUGFS_ATTRIBUTE(count_threshold_ops, pfn_get, count_threshold_set, "%ll
 static int array_dump(struct seq_file *m, void *v)
 {
 	struct ce_array *ca = &ce_arr;
-	u64 prev = 0;
 	int i;
 
 	mutex_lock(&ce_mutex);
@@ -417,10 +462,6 @@ static int array_dump(struct seq_file *m, void *v)
 		u64 this = PFN(ca->array[i]);
 
 		seq_printf(m, " %03d: [%016llx|%03llx]\n", i, this, FULL_COUNT(ca->array[i]));
-
-		WARN_ON(prev > this);
-
-		prev = this;
 	}
 
 	seq_printf(m, "}\n");
-- 
2.21.0

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-20 18:25       ` " Cong Wang
  2019-04-20 18:25         ` [PATCH v2 1/2] " Cong Wang
  2019-04-20 19:04         ` [v2,1/2] " Borislav Petkov
  0 siblings, 2 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-20 18:25 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 4:34 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 16, 2019 at 04:16:08PM -0700, Cong Wang wrote:
> > It is actually fairly easy:
> >
> > 1) Fill the whole page with PFN's:
> > for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done
> >
> > 2) Set thresh to 1 in order to trigger the deletion:
> > echo 1 > /sys/kernel/debug/ras/cec/count_threshold
> >
> > 3) Repeatedly add and remove the last element:
> > echo 512 >> /sys/kernel/debug/ras/cec/pfn
> > (until you get a crash.)
>
> Thanks, I was able to reproduce with that. Below is a conglomerate patch
> which converts __find_elem() to using Donald Knuth's binary search
> version. I don't know what I was thinking then and why I didn't use
> it from the get-go. The textbook even says that one can easily get it
> wrong...

If you want to go that far, you can choose to use lib/bsearch.c too in
case you want to reinvent the wheel.

What's your point here? You know my fix is targeted for -stable, I doubt
your 83-line change could fit for -stable.

Feel free to drop my patch to favor yours. I am really tired.

Good luck with that!

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-20 18:25       ` [v2,1/2] " Cong Wang
@ 2019-04-20 18:25         ` " Cong Wang
  2019-04-20 19:04         ` [v2,1/2] " Borislav Petkov
  1 sibling, 0 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-20 18:25 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 4:34 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 16, 2019 at 04:16:08PM -0700, Cong Wang wrote:
> > It is actually fairly easy:
> >
> > 1) Fill the whole page with PFN's:
> > for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done
> >
> > 2) Set thresh to 1 in order to trigger the deletion:
> > echo 1 > /sys/kernel/debug/ras/cec/count_threshold
> >
> > 3) Repeatedly add and remove the last element:
> > echo 512 >> /sys/kernel/debug/ras/cec/pfn
> > (until you get a crash.)
>
> Thanks, I was able to reproduce with that. Below is a conglomerate patch
> which converts __find_elem() to using Donald Knuth's binary search
> version. I don't know what I was thinking then and why I didn't use
> it from the get-go. The textbook even says that one can easily get it
> wrong...

If you want to go that far, you can choose to use lib/bsearch.c too in
case you want to reinvent the wheel.

What's your point here? You know my fix is targeted for -stable, I doubt
your 83-line change could fit for -stable.

Feel free to drop my patch to favor yours. I am really tired.

Good luck with that!

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-20 19:04         ` " Borislav Petkov
  2019-04-20 19:04           ` [PATCH v2 1/2] " Borislav Petkov
  2019-04-20 19:15           ` [v2,1/2] " Cong Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-20 19:04 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 11:25:43AM -0700, Cong Wang wrote:
> If you want to go that far, you can choose to use lib/bsearch.c too in
> case you want to reinvent the wheel.

Well, that doesn't give me the @to functionality which points to the
slot where the new element should be inserted, when the search was
unsuccessful.

> What's your point here?

My point is to fix it properly. Obviously.

> You know my fix is targeted for -stable,

Well, first you sent me this:

https://lkml.kernel.org/r/20190416012001.5338-1-xiyou.wangcong@gmail.com

then this:

https://lkml.kernel.org/r/20190416213351.28999-1-xiyou.wangcong@gmail.com

Tony liked this second version more and if you look at the final result of mine:

	int min = 0, max = ca->n - 1;

	...

                if (this_pfn < pfn)
                        min = i + 1;
                else if (this_pfn > pfn)
                        max = i - 1;
                else if (this_pfn == pfn) {
                        if (to)
                                *to = i;
                        return i;
                }

it has basically *both*: the correct [min:max] range *and* the return of
ithe ndex when found. But the algorithm this time is the correct one.

> I doubt your 83-line change could fit for -stable.

My 83-line change has debug output only for experimentation. It will,
*of* *course* be removed before committing it upstream. That's why I
called it "a conglomerate patch" and I said "It has some debug output
for easier debugging, that will be removed in the final version, of
course." I guess you didn't read that either.

And the sanity_check() piece will be a separate patch, of course.

In the end the diffstat will be 30-40 lines max.

> Feel free to drop my patch to favor yours. I am really tired.

Suit yourself. Thanks for the reporting.

> Good luck with that!

Ditto.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-20 19:04         ` [v2,1/2] " Borislav Petkov
@ 2019-04-20 19:04           ` " Borislav Petkov
  2019-04-20 19:15           ` [v2,1/2] " Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-20 19:04 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 11:25:43AM -0700, Cong Wang wrote:
> If you want to go that far, you can choose to use lib/bsearch.c too in
> case you want to reinvent the wheel.

Well, that doesn't give me the @to functionality which points to the
slot where the new element should be inserted, when the search was
unsuccessful.

> What's your point here?

My point is to fix it properly. Obviously.

> You know my fix is targeted for -stable,

Well, first you sent me this:

https://lkml.kernel.org/r/20190416012001.5338-1-xiyou.wangcong@gmail.com

then this:

https://lkml.kernel.org/r/20190416213351.28999-1-xiyou.wangcong@gmail.com

Tony liked this second version more and if you look at the final result of mine:

	int min = 0, max = ca->n - 1;

	...

                if (this_pfn < pfn)
                        min = i + 1;
                else if (this_pfn > pfn)
                        max = i - 1;
                else if (this_pfn == pfn) {
                        if (to)
                                *to = i;
                        return i;
                }

it has basically *both*: the correct [min:max] range *and* the return of
ithe ndex when found. But the algorithm this time is the correct one.

> I doubt your 83-line change could fit for -stable.

My 83-line change has debug output only for experimentation. It will,
*of* *course* be removed before committing it upstream. That's why I
called it "a conglomerate patch" and I said "It has some debug output
for easier debugging, that will be removed in the final version, of
course." I guess you didn't read that either.

And the sanity_check() piece will be a separate patch, of course.

In the end the diffstat will be 30-40 lines max.

> Feel free to drop my patch to favor yours. I am really tired.

Suit yourself. Thanks for the reporting.

> Good luck with that!

Ditto.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-20 19:15           ` " Cong Wang
  2019-04-20 19:15             ` [PATCH v2 1/2] " Cong Wang
  2019-04-21  8:27             ` [v2,1/2] " Borislav Petkov
  0 siblings, 2 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-20 19:15 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 12:04 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Sat, Apr 20, 2019 at 11:25:43AM -0700, Cong Wang wrote:
> > If you want to go that far, you can choose to use lib/bsearch.c too in
> > case you want to reinvent the wheel.
>
> Well, that doesn't give me the @to functionality which points to the
> slot where the new element should be inserted, when the search was
> unsuccessful.

No one stops you from adding it, as you are going far you can always
go further. :)


>
> > What's your point here?
>
> My point is to fix it properly. Obviously.

Of course, no one can stop you from rewriting the whole ras code by
doing it properly.


>
> > You know my fix is targeted for -stable,
>
> Well, first you sent me this:
>
> https://lkml.kernel.org/r/20190416012001.5338-1-xiyou.wangcong@gmail.com
>
> then this:
>
> https://lkml.kernel.org/r/20190416213351.28999-1-xiyou.wangcong@gmail.com


Yes, one is V1 and the other is V2. Is it hard to understand V2 is to replace
V1?


>
> Tony liked this second version more and if you look at the final result of mine:


Sorry, I have no reason to look into a 83-line change.


> it has basically *both*: the correct [min:max] range *and* the return of
> ithe ndex when found. But the algorithm this time is the correct one.

I don't review it, so I don't know. Feel free to believe you are correct here,
until someone finds a bug later.


>
> > I doubt your 83-line change could fit for -stable.
>
> My 83-line change has debug output only for experimentation. It will,
> *of* *course* be removed before committing it upstream. That's why I
> called it "a conglomerate patch" and I said "It has some debug output
> for easier debugging, that will be removed in the final version, of
> course." I guess you didn't read that either.

Why should I read a debugging patch?


>
> And the sanity_check() piece will be a separate patch, of course.
>
> In the end the diffstat will be 30-40 lines max.
>
> > Feel free to drop my patch to favor yours. I am really tired.
>
> Suit yourself. Thanks for the reporting.

There is no bug here, your code is perfect. :) Sorry for wasting
your time to believe this it is bug, it is not. :-)

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-20 19:15           ` [v2,1/2] " Cong Wang
@ 2019-04-20 19:15             ` " Cong Wang
  2019-04-21  8:27             ` [v2,1/2] " Borislav Petkov
  1 sibling, 0 replies; 19+ messages in thread
From: Cong Wang @ 2019-04-20 19:15 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 12:04 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Sat, Apr 20, 2019 at 11:25:43AM -0700, Cong Wang wrote:
> > If you want to go that far, you can choose to use lib/bsearch.c too in
> > case you want to reinvent the wheel.
>
> Well, that doesn't give me the @to functionality which points to the
> slot where the new element should be inserted, when the search was
> unsuccessful.

No one stops you from adding it, as you are going far you can always
go further. :)


>
> > What's your point here?
>
> My point is to fix it properly. Obviously.

Of course, no one can stop you from rewriting the whole ras code by
doing it properly.


>
> > You know my fix is targeted for -stable,
>
> Well, first you sent me this:
>
> https://lkml.kernel.org/r/20190416012001.5338-1-xiyou.wangcong@gmail.com
>
> then this:
>
> https://lkml.kernel.org/r/20190416213351.28999-1-xiyou.wangcong@gmail.com


Yes, one is V1 and the other is V2. Is it hard to understand V2 is to replace
V1?


>
> Tony liked this second version more and if you look at the final result of mine:


Sorry, I have no reason to look into a 83-line change.


> it has basically *both*: the correct [min:max] range *and* the return of
> ithe ndex when found. But the algorithm this time is the correct one.

I don't review it, so I don't know. Feel free to believe you are correct here,
until someone finds a bug later.


>
> > I doubt your 83-line change could fit for -stable.
>
> My 83-line change has debug output only for experimentation. It will,
> *of* *course* be removed before committing it upstream. That's why I
> called it "a conglomerate patch" and I said "It has some debug output
> for easier debugging, that will be removed in the final version, of
> course." I guess you didn't read that either.

Why should I read a debugging patch?


>
> And the sanity_check() piece will be a separate patch, of course.
>
> In the end the diffstat will be 30-40 lines max.
>
> > Feel free to drop my patch to favor yours. I am really tired.
>
> Suit yourself. Thanks for the reporting.

There is no bug here, your code is perfect. :) Sorry for wasting
your time to believe this it is bug, it is not. :-)

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [v2,1/2] ras: fix an off-by-one error in __find_elem()
@ 2019-04-21  8:27             ` " Borislav Petkov
  2019-04-21  8:27               ` [PATCH v2 1/2] " Borislav Petkov
  0 siblings, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2019-04-21  8:27 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 12:15:26PM -0700, Cong Wang wrote:
> Yes, one is V1 and the other is V2. Is it hard to understand V2 is to
> replace V1?

Well, looking at these two very different fixes, it made me think that
you don't really know what you're doing. So I went and did the Knuth's
version just so that I can analyze and understand the issue myself.

The final result ended up needing *both* the index fix *and* removed the
trailing noodling code after the loop which looked fishy at best and I
wanted it gone anyway.

So in the end:

1. your first fix was correct but incomplete

2. your second was replaced by a better version of the whole thing

So the final result is a lot cleaner and straight-forward. And it is
only 29 lines and I don't see a problem with it going to stable.

And I as author and maintainer of this code have very much the
prerogative to decide which way to go, TYVM. No matter how much you
passive-aggressively bitch. Thanks to your last mail, I won't have to
make this choice anymore.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()
  2019-04-21  8:27             ` [v2,1/2] " Borislav Petkov
@ 2019-04-21  8:27               ` " Borislav Petkov
  0 siblings, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2019-04-21  8:27 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML, linux-edac, Tony Luck, Thomas Gleixner

On Sat, Apr 20, 2019 at 12:15:26PM -0700, Cong Wang wrote:
> Yes, one is V1 and the other is V2. Is it hard to understand V2 is to
> replace V1?

Well, looking at these two very different fixes, it made me think that
you don't really know what you're doing. So I went and did the Knuth's
version just so that I can analyze and understand the issue myself.

The final result ended up needing *both* the index fix *and* removed the
trailing noodling code after the loop which looked fishy at best and I
wanted it gone anyway.

So in the end:

1. your first fix was correct but incomplete

2. your second was replaced by a better version of the whole thing

So the final result is a lot cleaner and straight-forward. And it is
only 29 lines and I don't see a problem with it going to stable.

And I as author and maintainer of this code have very much the
prerogative to decide which way to go, TYVM. No matter how much you
passive-aggressively bitch. Thanks to your last mail, I won't have to
make this choice anymore.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [tip:ras/urgent] RAS/CEC: Convert the timer callback to a workqueue
  2019-04-16 21:33 ` [v2,2/2] ras: close the race condition with timer Cong Wang
  2019-04-16 21:33   ` [PATCH v2 2/2] " Cong Wang
@ 2019-06-08 15:28   ` tip-bot for Cong Wang
  1 sibling, 0 replies; 19+ messages in thread
From: tip-bot for Cong Wang @ 2019-06-08 15:28 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: xiyou.wangcong, linux-edac, bp, linux-kernel, mingo, tony.luck,
	hpa, stable, tglx

Commit-ID:  0ade0b6240c4853cf9725924c46c10f4251639d7
Gitweb:     https://git.kernel.org/tip/0ade0b6240c4853cf9725924c46c10f4251639d7
Author:     Cong Wang <xiyou.wangcong@gmail.com>
AuthorDate: Tue, 16 Apr 2019 14:33:51 -0700
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 7 Jun 2019 23:21:39 +0200

RAS/CEC: Convert the timer callback to a workqueue

cec_timer_fn() is a timer callback which reads ce_arr.array[] and
updates its decay values. However, it runs in interrupt context and the
mutex protection the CEC uses for that array, is inadequate. Convert the
used timer to a workqueue to keep the tasks the CEC performs preemptible
and thus low-prio.

 [ bp: Rewrite commit message.
   s/timer/decay/gi to make it agnostic as to what facility is used. ]

Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190416213351.28999-2-xiyou.wangcong@gmail.com
---
 drivers/ras/cec.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index dbfe3e61d2c2..673f8a128397 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -2,6 +2,7 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/kernel.h>
+#include <linux/workqueue.h>
 
 #include <asm/mce.h>
 
@@ -123,16 +124,12 @@ static u64 dfs_pfn;
 /* Amount of errors after which we offline */
 static unsigned int count_threshold = COUNT_MASK;
 
-/*
- * The timer "decays" element count each timer_interval which is 24hrs by
- * default.
- */
-
-#define CEC_TIMER_DEFAULT_INTERVAL	24 * 60 * 60	/* 24 hrs */
-#define CEC_TIMER_MIN_INTERVAL		 1 * 60 * 60	/* 1h */
-#define CEC_TIMER_MAX_INTERVAL	   30 *	24 * 60 * 60	/* one month */
-static struct timer_list cec_timer;
-static u64 timer_interval = CEC_TIMER_DEFAULT_INTERVAL;
+/* Each element "decays" each decay_interval which is 24hrs by default. */
+#define CEC_DECAY_DEFAULT_INTERVAL	24 * 60 * 60	/* 24 hrs */
+#define CEC_DECAY_MIN_INTERVAL		 1 * 60 * 60	/* 1h */
+#define CEC_DECAY_MAX_INTERVAL	   30 *	24 * 60 * 60	/* one month */
+static struct delayed_work cec_work;
+static u64 decay_interval = CEC_DECAY_DEFAULT_INTERVAL;
 
 /*
  * Decrement decay value. We're using DECAY_BITS bits to denote decay of an
@@ -160,20 +157,21 @@ static void do_spring_cleaning(struct ce_array *ca)
 /*
  * @interval in seconds
  */
-static void cec_mod_timer(struct timer_list *t, unsigned long interval)
+static void cec_mod_work(unsigned long interval)
 {
 	unsigned long iv;
 
-	iv = interval * HZ + jiffies;
-
-	mod_timer(t, round_jiffies(iv));
+	iv = interval * HZ;
+	mod_delayed_work(system_wq, &cec_work, round_jiffies(iv));
 }
 
-static void cec_timer_fn(struct timer_list *unused)
+static void cec_work_fn(struct work_struct *work)
 {
+	mutex_lock(&ce_mutex);
 	do_spring_cleaning(&ce_arr);
+	mutex_unlock(&ce_mutex);
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(decay_interval);
 }
 
 /*
@@ -380,15 +378,15 @@ static int decay_interval_set(void *data, u64 val)
 {
 	*(u64 *)data = val;
 
-	if (val < CEC_TIMER_MIN_INTERVAL)
+	if (val < CEC_DECAY_MIN_INTERVAL)
 		return -EINVAL;
 
-	if (val > CEC_TIMER_MAX_INTERVAL)
+	if (val > CEC_DECAY_MAX_INTERVAL)
 		return -EINVAL;
 
-	timer_interval = val;
+	decay_interval = val;
 
-	cec_mod_timer(&cec_timer, timer_interval);
+	cec_mod_work(decay_interval);
 	return 0;
 }
 DEFINE_DEBUGFS_ATTRIBUTE(decay_interval_ops, u64_get, decay_interval_set, "%lld\n");
@@ -432,7 +430,7 @@ static int array_dump(struct seq_file *m, void *v)
 
 	seq_printf(m, "Flags: 0x%x\n", ca->flags);
 
-	seq_printf(m, "Timer interval: %lld seconds\n", timer_interval);
+	seq_printf(m, "Decay interval: %lld seconds\n", decay_interval);
 	seq_printf(m, "Decays: %lld\n", ca->decays_done);
 
 	seq_printf(m, "Action threshold: %d\n", count_threshold);
@@ -478,7 +476,7 @@ static int __init create_debugfs_nodes(void)
 	}
 
 	decay = debugfs_create_file("decay_interval", S_IRUSR | S_IWUSR, d,
-				    &timer_interval, &decay_interval_ops);
+				    &decay_interval, &decay_interval_ops);
 	if (!decay) {
 		pr_warn("Error creating decay_interval debugfs node!\n");
 		goto err;
@@ -514,8 +512,8 @@ void __init cec_init(void)
 	if (create_debugfs_nodes())
 		return;
 
-	timer_setup(&cec_timer, cec_timer_fn, 0);
-	cec_mod_timer(&cec_timer, CEC_TIMER_DEFAULT_INTERVAL);
+	INIT_DELAYED_WORK(&cec_work, cec_work_fn);
+	schedule_delayed_work(&cec_work, CEC_DECAY_DEFAULT_INTERVAL);
 
 	pr_info("Correctable Errors collector initialized.\n");
 }

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, back to index

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-16 21:33 [v2,1/2] ras: fix an off-by-one error in __find_elem() Cong Wang
2019-04-16 21:33 ` [PATCH v2 1/2] " Cong Wang
2019-04-16 21:33 ` [v2,2/2] ras: close the race condition with timer Cong Wang
2019-04-16 21:33   ` [PATCH v2 2/2] " Cong Wang
2019-06-08 15:28   ` [tip:ras/urgent] RAS/CEC: Convert the timer callback to a workqueue tip-bot for Cong Wang
2019-04-16 21:46 ` [v2,1/2] ras: fix an off-by-one error in __find_elem() Borislav Petkov
2019-04-16 21:46   ` [PATCH v2 1/2] " Borislav Petkov
2019-04-16 23:16   ` [v2,1/2] " Cong Wang
2019-04-16 23:16     ` [PATCH v2 1/2] " Cong Wang
2019-04-20 11:34     ` [v2,1/2] " Borislav Petkov
2019-04-20 11:34       ` [PATCH v2 1/2] " Borislav Petkov
2019-04-20 18:25       ` [v2,1/2] " Cong Wang
2019-04-20 18:25         ` [PATCH v2 1/2] " Cong Wang
2019-04-20 19:04         ` [v2,1/2] " Borislav Petkov
2019-04-20 19:04           ` [PATCH v2 1/2] " Borislav Petkov
2019-04-20 19:15           ` [v2,1/2] " Cong Wang
2019-04-20 19:15             ` [PATCH v2 1/2] " Cong Wang
2019-04-21  8:27             ` [v2,1/2] " Borislav Petkov
2019-04-21  8:27               ` [PATCH v2 1/2] " Borislav Petkov

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org linux-edac@archiver.kernel.org
	public-inbox-index linux-edac


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/ public-inbox