All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-06-01  9:12 ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-06-01  9:12 UTC (permalink / raw)
  To: linux-mtd, linux-kernel; +Cc: richard, yi.zhang

A detach hung is possible when a race occurs between the detach process
and the ubi background thread. The following sequences outline the race:

  ubi thread: if (list_empty(&ubi->works)...

  ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
              => by kthread_stop()
              wake_up_process()
              => ubi thread is still running, so 0 is returned

  ubi thread: set_current_state(TASK_INTERRUPTIBLE)
              schedule()
              => ubi thread will never be scheduled again

  ubi detach: wait_for_completion()
              => hung task!

To fix that, we need to check kthread_should_stop() after we set the
task state, so the ubi thread will either see the stop bit and exit or
the task state is reset to runnable such that it isn't scheduled out
indefinitely.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <Stable@vger.kernel.org>
Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
---
 drivers/mtd/ubi/wl.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 5146cce5fe32..a4d4343053d7 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
 		    !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
 			set_current_state(TASK_INTERRUPTIBLE);
 			spin_unlock(&ubi->wl_lock);
+
+			/*
+			 * Check kthread_should_stop() after we set the task
+			 * state to guarantee that we either see the stop bit
+			 * and exit or the task state is reset to runnable such
+			 * that it's not scheduled out indefinitely and detects
+			 * the stop bit at kthread_should_stop().
+			 */
+			if (kthread_should_stop()) {
+				set_current_state(TASK_RUNNING);
+				break;
+			}
+
 			schedule();
 			continue;
 		}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-06-01  9:12 ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-06-01  9:12 UTC (permalink / raw)
  To: linux-mtd, linux-kernel; +Cc: richard, yi.zhang

A detach hung is possible when a race occurs between the detach process
and the ubi background thread. The following sequences outline the race:

  ubi thread: if (list_empty(&ubi->works)...

  ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
              => by kthread_stop()
              wake_up_process()
              => ubi thread is still running, so 0 is returned

  ubi thread: set_current_state(TASK_INTERRUPTIBLE)
              schedule()
              => ubi thread will never be scheduled again

  ubi detach: wait_for_completion()
              => hung task!

To fix that, we need to check kthread_should_stop() after we set the
task state, so the ubi thread will either see the stop bit and exit or
the task state is reset to runnable such that it isn't scheduled out
indefinitely.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <Stable@vger.kernel.org>
Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
---
 drivers/mtd/ubi/wl.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 5146cce5fe32..a4d4343053d7 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
 		    !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
 			set_current_state(TASK_INTERRUPTIBLE);
 			spin_unlock(&ubi->wl_lock);
+
+			/*
+			 * Check kthread_should_stop() after we set the task
+			 * state to guarantee that we either see the stop bit
+			 * and exit or the task state is reset to runnable such
+			 * that it's not scheduled out indefinitely and detects
+			 * the stop bit at kthread_should_stop().
+			 */
+			if (kthread_should_stop()) {
+				set_current_state(TASK_RUNNING);
+				break;
+			}
+
 			schedule();
 			continue;
 		}
-- 
2.25.4


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-06-01  9:12 ` Zhihao Cheng
@ 2020-08-02 21:25   ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-02 21:25 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> A detach hung is possible when a race occurs between the detach process
> and the ubi background thread. The following sequences outline the race:
>
>   ubi thread: if (list_empty(&ubi->works)...
>
>   ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>               => by kthread_stop()
>               wake_up_process()
>               => ubi thread is still running, so 0 is returned
>
>   ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>               schedule()
>               => ubi thread will never be scheduled again
>
>   ubi detach: wait_for_completion()
>               => hung task!
>
> To fix that, we need to check kthread_should_stop() after we set the
> task state, so the ubi thread will either see the stop bit and exit or
> the task state is reset to runnable such that it isn't scheduled out
> indefinitely.
>
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> Cc: <Stable@vger.kernel.org>
> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
> ---
>  drivers/mtd/ubi/wl.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 5146cce5fe32..a4d4343053d7 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>                         set_current_state(TASK_INTERRUPTIBLE);
>                         spin_unlock(&ubi->wl_lock);
> +
> +                       /*
> +                        * Check kthread_should_stop() after we set the task
> +                        * state to guarantee that we either see the stop bit
> +                        * and exit or the task state is reset to runnable such
> +                        * that it's not scheduled out indefinitely and detects
> +                        * the stop bit at kthread_should_stop().
> +                        */
> +                       if (kthread_should_stop()) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
> +

Hmm, I see the problem but I fear this patch does not cure the race completely.
It just lowers the chance to hit it.
What if KTHREAD_SHOULD_STOP is set right after you checked for it?

>                         schedule();
>                         continue;
>                 }


-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-02 21:25   ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-02 21:25 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> A detach hung is possible when a race occurs between the detach process
> and the ubi background thread. The following sequences outline the race:
>
>   ubi thread: if (list_empty(&ubi->works)...
>
>   ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>               => by kthread_stop()
>               wake_up_process()
>               => ubi thread is still running, so 0 is returned
>
>   ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>               schedule()
>               => ubi thread will never be scheduled again
>
>   ubi detach: wait_for_completion()
>               => hung task!
>
> To fix that, we need to check kthread_should_stop() after we set the
> task state, so the ubi thread will either see the stop bit and exit or
> the task state is reset to runnable such that it isn't scheduled out
> indefinitely.
>
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> Cc: <Stable@vger.kernel.org>
> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
> ---
>  drivers/mtd/ubi/wl.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 5146cce5fe32..a4d4343053d7 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>                         set_current_state(TASK_INTERRUPTIBLE);
>                         spin_unlock(&ubi->wl_lock);
> +
> +                       /*
> +                        * Check kthread_should_stop() after we set the task
> +                        * state to guarantee that we either see the stop bit
> +                        * and exit or the task state is reset to runnable such
> +                        * that it's not scheduled out indefinitely and detects
> +                        * the stop bit at kthread_should_stop().
> +                        */
> +                       if (kthread_should_stop()) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
> +

Hmm, I see the problem but I fear this patch does not cure the race completely.
It just lowers the chance to hit it.
What if KTHREAD_SHOULD_STOP is set right after you checked for it?

>                         schedule();
>                         continue;
>                 }


-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-02 21:25   ` Richard Weinberger
@ 2020-08-03  2:01     ` Zhihao Cheng
  -1 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-03  2:01 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

在 2020/8/3 5:25, Richard Weinberger 写道:
> On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> A detach hung is possible when a race occurs between the detach process
>> and the ubi background thread. The following sequences outline the race:
>>
>>    ubi thread: if (list_empty(&ubi->works)...
>>
>>    ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>>                => by kthread_stop()
>>                wake_up_process()
>>                => ubi thread is still running, so 0 is returned
>>
>>    ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>>                schedule()
>>                => ubi thread will never be scheduled again
>>
>>    ubi detach: wait_for_completion()
>>                => hung task!
>>
>> To fix that, we need to check kthread_should_stop() after we set the
>> task state, so the ubi thread will either see the stop bit and exit or
>> the task state is reset to runnable such that it isn't scheduled out
>> indefinitely.
>>
>> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
>> Cc: <Stable@vger.kernel.org>
>> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
>> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
>> ---
>>   drivers/mtd/ubi/wl.c | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
>> index 5146cce5fe32..a4d4343053d7 100644
>> --- a/drivers/mtd/ubi/wl.c
>> +++ b/drivers/mtd/ubi/wl.c
>> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>>                      !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>>                          set_current_state(TASK_INTERRUPTIBLE);
>>                          spin_unlock(&ubi->wl_lock);
>> +
>> +                       /*
>> +                        * Check kthread_should_stop() after we set the task
>> +                        * state to guarantee that we either see the stop bit
>> +                        * and exit or the task state is reset to runnable such
>> +                        * that it's not scheduled out indefinitely and detects
>> +                        * the stop bit at kthread_should_stop().
>> +                        */
>> +                       if (kthread_should_stop()) {
>> +                               set_current_state(TASK_RUNNING);
>> +                               break;
>> +                       }
>> +
> Hmm, I see the problem but I fear this patch does not cure the race completely.
> It just lowers the chance to hit it.
> What if KTHREAD_SHOULD_STOP is set right after you checked for it?

The patch can handle this case. ubi_thread will exit at 
kthread_should_stop() in next iteration.


You can apply following patch to verify it. (You may set 
'kernel.hung_task_timeout_secs = 10' by sysctl)


diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 27636063ed1b..44047861c2a1 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -91,6 +91,7 @@
  #include <linux/kthread.h>
  #include "ubi.h"
  #include "wl.h"
+#include <linux/delay.h>

  /* Number of physical eraseblocks reserved for wear-leveling purposes */
  #define WL_RESERVED_PEBS 1
@@ -1627,8 +1628,10 @@ int ubi_thread(void *u)
         for (;;) {
                 int err;

-               if (kthread_should_stop())
+               if (kthread_should_stop()) {
+                       pr_err("Exit at stop A\n");
                         break;
+               }

                 if (try_to_freeze())
                         continue;
@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
                         set_current_state(TASK_INTERRUPTIBLE);
                         spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                         schedule();
                         continue;
                 }
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 132f84a5fde3..18627acb9573 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -590,6 +590,10 @@ int kthread_stop(struct task_struct *k)
         get_task_struct(k);
         kthread = to_kthread(k);
         set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
+
+       if (k->comm && !strncmp(k->comm, "ubi_bgt0d", strlen("ubi_bgt0d")))
+               pr_err("Stop flag has been set for task %s\n", k->comm);
+
         kthread_unpark(k);
         wake_up_process(k);
         wait_for_completion(&kthread->exited);

kernel msg:
[  210.602005] Check should stop B             # Please execute 
ubi_detach manually when you see this
[  211.347638] Stop flag has been set for task ubi_bgt0d
[  215.603026] delay 5000ms
[  215.605728] Exit at stop A
>>                          schedule();
>>                          continue;
>>                  }
>



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-03  2:01     ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-03  2:01 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

在 2020/8/3 5:25, Richard Weinberger 写道:
> On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> A detach hung is possible when a race occurs between the detach process
>> and the ubi background thread. The following sequences outline the race:
>>
>>    ubi thread: if (list_empty(&ubi->works)...
>>
>>    ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>>                => by kthread_stop()
>>                wake_up_process()
>>                => ubi thread is still running, so 0 is returned
>>
>>    ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>>                schedule()
>>                => ubi thread will never be scheduled again
>>
>>    ubi detach: wait_for_completion()
>>                => hung task!
>>
>> To fix that, we need to check kthread_should_stop() after we set the
>> task state, so the ubi thread will either see the stop bit and exit or
>> the task state is reset to runnable such that it isn't scheduled out
>> indefinitely.
>>
>> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
>> Cc: <Stable@vger.kernel.org>
>> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
>> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
>> ---
>>   drivers/mtd/ubi/wl.c | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
>> index 5146cce5fe32..a4d4343053d7 100644
>> --- a/drivers/mtd/ubi/wl.c
>> +++ b/drivers/mtd/ubi/wl.c
>> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>>                      !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>>                          set_current_state(TASK_INTERRUPTIBLE);
>>                          spin_unlock(&ubi->wl_lock);
>> +
>> +                       /*
>> +                        * Check kthread_should_stop() after we set the task
>> +                        * state to guarantee that we either see the stop bit
>> +                        * and exit or the task state is reset to runnable such
>> +                        * that it's not scheduled out indefinitely and detects
>> +                        * the stop bit at kthread_should_stop().
>> +                        */
>> +                       if (kthread_should_stop()) {
>> +                               set_current_state(TASK_RUNNING);
>> +                               break;
>> +                       }
>> +
> Hmm, I see the problem but I fear this patch does not cure the race completely.
> It just lowers the chance to hit it.
> What if KTHREAD_SHOULD_STOP is set right after you checked for it?

The patch can handle this case. ubi_thread will exit at 
kthread_should_stop() in next iteration.


You can apply following patch to verify it. (You may set 
'kernel.hung_task_timeout_secs = 10' by sysctl)


diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 27636063ed1b..44047861c2a1 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -91,6 +91,7 @@
  #include <linux/kthread.h>
  #include "ubi.h"
  #include "wl.h"
+#include <linux/delay.h>

  /* Number of physical eraseblocks reserved for wear-leveling purposes */
  #define WL_RESERVED_PEBS 1
@@ -1627,8 +1628,10 @@ int ubi_thread(void *u)
         for (;;) {
                 int err;

-               if (kthread_should_stop())
+               if (kthread_should_stop()) {
+                       pr_err("Exit at stop A\n");
                         break;
+               }

                 if (try_to_freeze())
                         continue;
@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
                         set_current_state(TASK_INTERRUPTIBLE);
                         spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                         schedule();
                         continue;
                 }
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 132f84a5fde3..18627acb9573 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -590,6 +590,10 @@ int kthread_stop(struct task_struct *k)
         get_task_struct(k);
         kthread = to_kthread(k);
         set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
+
+       if (k->comm && !strncmp(k->comm, "ubi_bgt0d", strlen("ubi_bgt0d")))
+               pr_err("Stop flag has been set for task %s\n", k->comm);
+
         kthread_unpark(k);
         wake_up_process(k);
         wait_for_completion(&kthread->exited);

kernel msg:
[  210.602005] Check should stop B             # Please execute 
ubi_detach manually when you see this
[  211.347638] Stop flag has been set for task ubi_bgt0d
[  215.603026] delay 5000ms
[  215.605728] Exit at stop A
>>                          schedule();
>>                          continue;
>>                  }
>



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-03  2:01     ` Zhihao Cheng
@ 2020-08-03 22:11       ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-03 22:11 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> > Hmm, I see the problem but I fear this patch does not cure the race completely.
> > It just lowers the chance to hit it.
> > What if KTHREAD_SHOULD_STOP is set right after you checked for it?
>
> The patch can handle this case. ubi_thread will exit at
> kthread_should_stop() in next iteration.

How can it reach the next iteration?
Maybe I didn't fully get your explanation.

As far as I understand the problem correctly, the following happens:
1. ubi_thread is running and the program counter is somewhere between
"if (kthread_should_stop())"
and schedule()
2. While detaching kthread_stop() is called
3. Since the program counter in the thread is right before schedule(),
it does not check KTHREAD_SHOULD_STOP
and blindly calls into schedule()
4. The thread goes to sleep and nothing wakes it anymore -> endless wait.

Is this correct so far?

Your solution is putting another check for KTHREAD_SHOULD_STOP before
schedule().
I argue that this will just reduce the chance to hit the race window
because it can still happen
that kthread_stop() is being called right after the second check and
again before schedule().
Then we end up with the same situation.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-03 22:11       ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-03 22:11 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> > Hmm, I see the problem but I fear this patch does not cure the race completely.
> > It just lowers the chance to hit it.
> > What if KTHREAD_SHOULD_STOP is set right after you checked for it?
>
> The patch can handle this case. ubi_thread will exit at
> kthread_should_stop() in next iteration.

How can it reach the next iteration?
Maybe I didn't fully get your explanation.

As far as I understand the problem correctly, the following happens:
1. ubi_thread is running and the program counter is somewhere between
"if (kthread_should_stop())"
and schedule()
2. While detaching kthread_stop() is called
3. Since the program counter in the thread is right before schedule(),
it does not check KTHREAD_SHOULD_STOP
and blindly calls into schedule()
4. The thread goes to sleep and nothing wakes it anymore -> endless wait.

Is this correct so far?

Your solution is putting another check for KTHREAD_SHOULD_STOP before
schedule().
I argue that this will just reduce the chance to hit the race window
because it can still happen
that kthread_stop() is being called right after the second check and
again before schedule().
Then we end up with the same situation.

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-03 22:11       ` Richard Weinberger
@ 2020-08-04  2:58         ` Zhihao Cheng
  -1 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-04  2:58 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

在 2020/8/4 6:11, Richard Weinberger 写道:
> On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>>> Hmm, I see the problem but I fear this patch does not cure the race completely.
>>> It just lowers the chance to hit it.
>>> What if KTHREAD_SHOULD_STOP is set right after you checked for it?
>> The patch can handle this case. ubi_thread will exit at
>> kthread_should_stop() in next iteration.
> How can it reach the next iteration?
> Maybe I didn't fully get your explanation.
>
> As far as I understand the problem correctly, the following happens:
> 1. ubi_thread is running and the program counter is somewhere between
> "if (kthread_should_stop())"
> and schedule()
> 2. While detaching kthread_stop() is called
> 3. Since the program counter in the thread is right before schedule(),
> it does not check KTHREAD_SHOULD_STOP
> and blindly calls into schedule()
> 4. The thread goes to sleep and nothing wakes it anymore -> endless wait.
>
> Is this correct so far?
Oh, you're thinking about influence by schedule(), I get it. But I think 
it still works. Because the ubi_thread is still on runqueue, it will be 
scheduled to execute later anyway.

op                                                    state of 
ubi_thread           on runqueue
set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
if (kthread_should_stop()) // not satisfy 
TASK_INTERRUPTIBLE              Yes
kthread_stop:
   wake_up_process
     ttwu_queue
       ttwu_do_activate
         ttwu_do_wakeup TASK_RUNNING                       Yes
schedule
   __schedule(false)

  // prev->state is TASK_RUNNING, so we cannot move it from runqueue by 
deactivate_task(). So just pick next task to execute, ubi_thread is 
still on runqueue and will be scheduled to execute later.


The test patch added mdelay(5000) before schedule(), which can make sure 
kthread_stop()->wake_up_process() executed before schedule(). Previous 
analysis can be proved through test.

@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                     !ubi->thread_enabled || 
ubi_dbg_is_bgt_disabled(ubi)) {
                         set_current_state(TASK_INTERRUPTIBLE);
                         spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                         schedule();
                         continue;
                 }

>
> Your solution is putting another check for KTHREAD_SHOULD_STOP before
> schedule().
> I argue that this will just reduce the chance to hit the race window
> because it can still happen
> that kthread_stop() is being called right after the second check and
> again before schedule().
> Then we end up with the same situation.
>



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-04  2:58         ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-04  2:58 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

在 2020/8/4 6:11, Richard Weinberger 写道:
> On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>>> Hmm, I see the problem but I fear this patch does not cure the race completely.
>>> It just lowers the chance to hit it.
>>> What if KTHREAD_SHOULD_STOP is set right after you checked for it?
>> The patch can handle this case. ubi_thread will exit at
>> kthread_should_stop() in next iteration.
> How can it reach the next iteration?
> Maybe I didn't fully get your explanation.
>
> As far as I understand the problem correctly, the following happens:
> 1. ubi_thread is running and the program counter is somewhere between
> "if (kthread_should_stop())"
> and schedule()
> 2. While detaching kthread_stop() is called
> 3. Since the program counter in the thread is right before schedule(),
> it does not check KTHREAD_SHOULD_STOP
> and blindly calls into schedule()
> 4. The thread goes to sleep and nothing wakes it anymore -> endless wait.
>
> Is this correct so far?
Oh, you're thinking about influence by schedule(), I get it. But I think 
it still works. Because the ubi_thread is still on runqueue, it will be 
scheduled to execute later anyway.

op                                                    state of 
ubi_thread           on runqueue
set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
if (kthread_should_stop()) // not satisfy 
TASK_INTERRUPTIBLE              Yes
kthread_stop:
   wake_up_process
     ttwu_queue
       ttwu_do_activate
         ttwu_do_wakeup TASK_RUNNING                       Yes
schedule
   __schedule(false)

  // prev->state is TASK_RUNNING, so we cannot move it from runqueue by 
deactivate_task(). So just pick next task to execute, ubi_thread is 
still on runqueue and will be scheduled to execute later.


The test patch added mdelay(5000) before schedule(), which can make sure 
kthread_stop()->wake_up_process() executed before schedule(). Previous 
analysis can be proved through test.

@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                     !ubi->thread_enabled || 
ubi_dbg_is_bgt_disabled(ubi)) {
                         set_current_state(TASK_INTERRUPTIBLE);
                         spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                         schedule();
                         continue;
                 }

>
> Your solution is putting another check for KTHREAD_SHOULD_STOP before
> schedule().
> I argue that this will just reduce the chance to hit the race window
> because it can still happen
> that kthread_stop() is being called right after the second check and
> again before schedule().
> Then we end up with the same situation.
>



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-04  2:58         ` Zhihao Cheng
@ 2020-08-04 21:56           ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-04 21:56 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Tue, Aug 4, 2020 at 4:58 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> Oh, you're thinking about influence by schedule(), I get it. But I think
> it still works. Because the ubi_thread is still on runqueue, it will be
> scheduled to execute later anyway.

It will not get woken. This is the problem.

>
> op                                                    state of
> ubi_thread           on runqueue
> set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
> if (kthread_should_stop()) // not satisfy
> TASK_INTERRUPTIBLE              Yes
> kthread_stop:
>    wake_up_process
>      ttwu_queue
>        ttwu_do_activate
>          ttwu_do_wakeup TASK_RUNNING                       Yes
> schedule
>    __schedule(false)
>
>   // prev->state is TASK_RUNNING, so we cannot move it from runqueue by
> deactivate_task(). So just pick next task to execute, ubi_thread is
> still on runqueue and will be scheduled to execute later.

It will be in state TASK_RUNNING only if your check is reached.

If kthread_stop() is called *before* your code:
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }

...everything is fine.
But there is still a race window between your if
(kthread_should_stop()) and schedule() in the next line.
So if kthread_stop() is called right *after* the if and *before*
schedule(), the task state is still TASK_INTERRUPTIBLE
--> schedule() will not return unless the task is explicitly woken,
which does not happen.

Before your patch, the race window was much larger, I fully agree, but
your patch does not cure the problem
it just makes it harder to hit.

And using mdelay() to verify such a thing is also tricky because
mdelay() will influence the task state.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-04 21:56           ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-04 21:56 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Tue, Aug 4, 2020 at 4:58 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> Oh, you're thinking about influence by schedule(), I get it. But I think
> it still works. Because the ubi_thread is still on runqueue, it will be
> scheduled to execute later anyway.

It will not get woken. This is the problem.

>
> op                                                    state of
> ubi_thread           on runqueue
> set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
> if (kthread_should_stop()) // not satisfy
> TASK_INTERRUPTIBLE              Yes
> kthread_stop:
>    wake_up_process
>      ttwu_queue
>        ttwu_do_activate
>          ttwu_do_wakeup TASK_RUNNING                       Yes
> schedule
>    __schedule(false)
>
>   // prev->state is TASK_RUNNING, so we cannot move it from runqueue by
> deactivate_task(). So just pick next task to execute, ubi_thread is
> still on runqueue and will be scheduled to execute later.

It will be in state TASK_RUNNING only if your check is reached.

If kthread_stop() is called *before* your code:
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }

...everything is fine.
But there is still a race window between your if
(kthread_should_stop()) and schedule() in the next line.
So if kthread_stop() is called right *after* the if and *before*
schedule(), the task state is still TASK_INTERRUPTIBLE
--> schedule() will not return unless the task is explicitly woken,
which does not happen.

Before your patch, the race window was much larger, I fully agree, but
your patch does not cure the problem
it just makes it harder to hit.

And using mdelay() to verify such a thing is also tricky because
mdelay() will influence the task state.

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-04 21:56           ` Richard Weinberger
@ 2020-08-05  2:23             ` Zhihao Cheng
  -1 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-05  2:23 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

在 2020/8/5 5:56, Richard Weinberger 写道:
> On Tue, Aug 4, 2020 at 4:58 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> Oh, you're thinking about influence by schedule(), I get it. But I think
>> it still works. Because the ubi_thread is still on runqueue, it will be
>> scheduled to execute later anyway.
> It will not get woken. This is the problem.
>
>> op                                                    state of
>> ubi_thread           on runqueue
>> set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
>> if (kthread_should_stop()) // not satisfy
>> TASK_INTERRUPTIBLE              Yes
>> kthread_stop:
>>     wake_up_process
>>       ttwu_queue
>>         ttwu_do_activate
>>           ttwu_do_wakeup TASK_RUNNING                       Yes
>> schedule
>>     __schedule(false)
>>
>>    // prev->state is TASK_RUNNING, so we cannot move it from runqueue by
>> deactivate_task(). So just pick next task to execute, ubi_thread is
>> still on runqueue and will be scheduled to execute later.
> It will be in state TASK_RUNNING only if your check is reached.
>
> If kthread_stop() is called *before* your code:
> +                       if (kthread_should_stop()) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
>
> ...everything is fine.
> But there is still a race window between your if
> (kthread_should_stop()) and schedule() in the next line.
> So if kthread_stop() is called right *after* the if and *before*
> schedule(), the task state is still TASK_INTERRUPTIBLE
> --> schedule() will not return unless the task is explicitly woken,
> which does not happen.
Er, I can't get the point. I can list two possible situations, did I 
miss other situations?

P1:ubi_thread
   set_current_state(TASK_INTERRUPTIBLE)
   if (kthread_should_stop()) {
     set_current_state(TASK_RUNNING)
     break
   }
   schedule()                            -> don't *remove* task from 
runqueue if *TASK_RUNNING*, removing operation is protected by rq_lock

P2:kthread_stop
   set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
   wake_up_process(k)             -> enqueue task & set *TASK_RUNNING*, 
these two operations are protected by rq_lock
   wait_for_completion(&kthread->exited)


Situation 1:
P1_set_current_state               on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE
P1_kthread_should_stop        on-rq, TASK_INTERRUPTIBLE
P2_set_bit                               on-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P2_wake_up_process             on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING 
, KTHREAD_SHOULD_STOP
P1_schedule                           on-rq, TASK_RUNNING , 
KTHREAD_SHOULD_STOP
P2_wait_for_completion        // wait for P1 exit

Situation 2:
P1_set_current_state             on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE
P1_kthread_should_stop       on-rq, TASK_INTERRUPTIBLE
P2_set_bit                             on-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P1_schedule                          off-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P2_wake_up_process             on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING 
, KTHREAD_SHOULD_STOP
P2_wait_for_completion       // wait for P1 exit
> Before your patch, the race window was much larger, I fully agree, but
> your patch does not cure the problem
> it just makes it harder to hit.
>
> And using mdelay() to verify such a thing is also tricky because
> mdelay() will influence the task state.
>



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-05  2:23             ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-05  2:23 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

在 2020/8/5 5:56, Richard Weinberger 写道:
> On Tue, Aug 4, 2020 at 4:58 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> Oh, you're thinking about influence by schedule(), I get it. But I think
>> it still works. Because the ubi_thread is still on runqueue, it will be
>> scheduled to execute later anyway.
> It will not get woken. This is the problem.
>
>> op                                                    state of
>> ubi_thread           on runqueue
>> set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
>> if (kthread_should_stop()) // not satisfy
>> TASK_INTERRUPTIBLE              Yes
>> kthread_stop:
>>     wake_up_process
>>       ttwu_queue
>>         ttwu_do_activate
>>           ttwu_do_wakeup TASK_RUNNING                       Yes
>> schedule
>>     __schedule(false)
>>
>>    // prev->state is TASK_RUNNING, so we cannot move it from runqueue by
>> deactivate_task(). So just pick next task to execute, ubi_thread is
>> still on runqueue and will be scheduled to execute later.
> It will be in state TASK_RUNNING only if your check is reached.
>
> If kthread_stop() is called *before* your code:
> +                       if (kthread_should_stop()) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
>
> ...everything is fine.
> But there is still a race window between your if
> (kthread_should_stop()) and schedule() in the next line.
> So if kthread_stop() is called right *after* the if and *before*
> schedule(), the task state is still TASK_INTERRUPTIBLE
> --> schedule() will not return unless the task is explicitly woken,
> which does not happen.
Er, I can't get the point. I can list two possible situations, did I 
miss other situations?

P1:ubi_thread
   set_current_state(TASK_INTERRUPTIBLE)
   if (kthread_should_stop()) {
     set_current_state(TASK_RUNNING)
     break
   }
   schedule()                            -> don't *remove* task from 
runqueue if *TASK_RUNNING*, removing operation is protected by rq_lock

P2:kthread_stop
   set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
   wake_up_process(k)             -> enqueue task & set *TASK_RUNNING*, 
these two operations are protected by rq_lock
   wait_for_completion(&kthread->exited)


Situation 1:
P1_set_current_state               on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE
P1_kthread_should_stop        on-rq, TASK_INTERRUPTIBLE
P2_set_bit                               on-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P2_wake_up_process             on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING 
, KTHREAD_SHOULD_STOP
P1_schedule                           on-rq, TASK_RUNNING , 
KTHREAD_SHOULD_STOP
P2_wait_for_completion        // wait for P1 exit

Situation 2:
P1_set_current_state             on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE
P1_kthread_should_stop       on-rq, TASK_INTERRUPTIBLE
P2_set_bit                             on-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P1_schedule                          off-rq, TASK_INTERRUPTIBLE , 
KTHREAD_SHOULD_STOP
P2_wake_up_process             on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING 
, KTHREAD_SHOULD_STOP
P2_wait_for_completion       // wait for P1 exit
> Before your patch, the race window was much larger, I fully agree, but
> your patch does not cure the problem
> it just makes it harder to hit.
>
> And using mdelay() to verify such a thing is also tricky because
> mdelay() will influence the task state.
>



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-05  2:23             ` Zhihao Cheng
@ 2020-08-06 20:15               ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-06 20:15 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> Er, I can't get the point. I can list two possible situations, did I
> miss other situations?

Yes. You keep ignoring the case I brought up.

Let's start from scratch, maybe I miss something.
So I'm sorry for being persistent.

The ubi thread can be reduced to a loop like this one:
1. for (;;) {
2.      if (kthread_should_stop())
3.              break;
4.
5.      if ( /* no work pending*/ ){
6.              set_current_state(TASK_INTERRUPTIBLE);
7.              schedule();
8.              continue;
9.      }
10.
11.     do_work();
12. }

syzcaller found a case where stopping the thread did not work.
If another task tries to stop the thread while no work is pending and
the program counter in the thread
is between lines 5 and 6, the kthread_stop() instruction has no effect.
It has no effect because the thread sets the thread state to
interruptible sleep and then schedules away.

This is a common anti-pattern in the Linux kernel, sadly.

Do you agree with me so far or do you think syzcaller found a different issue?

Your patch changes the loop as follows:
1. for (;;) {
2.      if (kthread_should_stop())
3.              break;
4.
5.      if ( /* no work pending*/ ){
6.              set_current_state(TASK_INTERRUPTIBLE);
7.
8.              if (kthread_should_stop()) {
9.                      set_current_state(TASK_RUNNING);
10.                     break;
11.             }
12.
13.             schedule();
14.             continue;
15.     }
16.
17.     do_work();
18. }

That way there is a higher chance that the thread sees the stop flag
and gracefully terminates, I fully agree on that.
But it does not completely solve the problem.
If kthread_stop() happens while the program counter of the ubi thread
is at line 12, the stop flag is still missed
and we end up in interruptible sleep just like before.

So, to solve the problem entirely I suggest changing schedule() to
schedule_timeout() and let the thread wake up
periodically.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-06 20:15               ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-06 20:15 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> Er, I can't get the point. I can list two possible situations, did I
> miss other situations?

Yes. You keep ignoring the case I brought up.

Let's start from scratch, maybe I miss something.
So I'm sorry for being persistent.

The ubi thread can be reduced to a loop like this one:
1. for (;;) {
2.      if (kthread_should_stop())
3.              break;
4.
5.      if ( /* no work pending*/ ){
6.              set_current_state(TASK_INTERRUPTIBLE);
7.              schedule();
8.              continue;
9.      }
10.
11.     do_work();
12. }

syzcaller found a case where stopping the thread did not work.
If another task tries to stop the thread while no work is pending and
the program counter in the thread
is between lines 5 and 6, the kthread_stop() instruction has no effect.
It has no effect because the thread sets the thread state to
interruptible sleep and then schedules away.

This is a common anti-pattern in the Linux kernel, sadly.

Do you agree with me so far or do you think syzcaller found a different issue?

Your patch changes the loop as follows:
1. for (;;) {
2.      if (kthread_should_stop())
3.              break;
4.
5.      if ( /* no work pending*/ ){
6.              set_current_state(TASK_INTERRUPTIBLE);
7.
8.              if (kthread_should_stop()) {
9.                      set_current_state(TASK_RUNNING);
10.                     break;
11.             }
12.
13.             schedule();
14.             continue;
15.     }
16.
17.     do_work();
18. }

That way there is a higher chance that the thread sees the stop flag
and gracefully terminates, I fully agree on that.
But it does not completely solve the problem.
If kthread_stop() happens while the program counter of the ubi thread
is at line 12, the stop flag is still missed
and we end up in interruptible sleep just like before.

So, to solve the problem entirely I suggest changing schedule() to
schedule_timeout() and let the thread wake up
periodically.

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-06 20:15               ` Richard Weinberger
@ 2020-08-07  2:18                 ` Zhihao Cheng
  -1 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-07  2:18 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

在 2020/8/7 4:15, Richard Weinberger 写道:
> On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> Er, I can't get the point. I can list two possible situations, did I
>> miss other situations?
> Yes. You keep ignoring the case I brought up.
>
> Let's start from scratch, maybe I miss something.
> So I'm sorry for being persistent.
Never mind, we're all trying to figure it out.  :-) . Besides, I'm not 
good at expressing question in English. (In Practicing...)
> The ubi thread can be reduced to a loop like this one:
> 1. for (;;) {
> 2.      if (kthread_should_stop())
> 3.              break;
> 4.
> 5.      if ( /* no work pending*/ ){
> 6.              set_current_state(TASK_INTERRUPTIBLE);
> 7.              schedule();
> 8.              continue;
> 9.      }
> 10.
> 11.     do_work();
> 12. }
>
> syzcaller found a case where stopping the thread did not work.
> If another task tries to stop the thread while no work is pending and
> the program counter in the thread
> is between lines 5 and 6, the kthread_stop() instruction has no effect.
> It has no effect because the thread sets the thread state to
> interruptible sleep and then schedules away.
>
> This is a common anti-pattern in the Linux kernel, sadly.

Yes, but UBIFS is the exception, my solution looks like UBIFS.

int ubifs_bg_thread(void *info)
{
     while(1) {
         if (kthread_should_stop())
             break;

         set_current_state(TASK_INTERRUPTIBLE);
         if (!c->need_bgt) {
             /*
              * Nothing prevents us from going sleep now and
              * be never woken up and block the task which
              * could wait in 'kthread_stop()' forever.
              */
             if (kthread_should_stop())
                 break;
             schedule();
             continue;
         }
     }
}


>
> Do you agree with me so far or do you think syzcaller found a different issue?
Yes, I agree.
>
> Your patch changes the loop as follows:
> 1. for (;;) {
> 2.      if (kthread_should_stop())
> 3.              break;
> 4.
> 5.      if ( /* no work pending*/ ){
> 6.              set_current_state(TASK_INTERRUPTIBLE);
> 7.
> 8.              if (kthread_should_stop()) {
> 9.                      set_current_state(TASK_RUNNING);
> 10.                     break;
> 11.             }
> 12.
> 13.             schedule();
> 14.             continue;
> 15.     }
> 16.
> 17.     do_work();
> 18. }
>
> That way there is a higher chance that the thread sees the stop flag
> and gracefully terminates, I fully agree on that.
There's no disagreement so far.
> But it does not completely solve the problem.
> If kthread_stop() happens while the program counter of the ubi thread
> is at line 12, the stop flag is still missed
> and we end up in interruptible sleep just like before.

That's where we hold different views. I have 3 viewpoints(You can point 
out which one you disagree.):

1. If kthread_stop() happens at line 12, ubi thread is *marked* with 
stop flag, it will stop at kthread_should_stop() as long as it can reach 
the next iteration.

2. If task A is on runqueue and its state is TASK_RUNNING, task A will 
be scheduled to execute.

3. If kthread_stop() happens at line 12, after program counter going to 
line 14, ubi thead is on runqueue and its state is TASK_RUNNING. I have 
explained this in situation 1 in last session.


I mean ubi thread is on runqueue with TASK_RUNNING state & stop flag 
after the process you described.

Line 12   kthread_stop()

                  set_bit(mark stop flag) && wake_up_process(enqueue && 
set TASK_RUNNING )    => TASK_RUNNING & stop flag & on runqueue

Line 13  schedule()

                  Do nothing but pick next task to execute

>
> So, to solve the problem entirely I suggest changing schedule() to
> schedule_timeout() and let the thread wake up
> periodically.
>



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-07  2:18                 ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-07  2:18 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

在 2020/8/7 4:15, Richard Weinberger 写道:
> On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> Er, I can't get the point. I can list two possible situations, did I
>> miss other situations?
> Yes. You keep ignoring the case I brought up.
>
> Let's start from scratch, maybe I miss something.
> So I'm sorry for being persistent.
Never mind, we're all trying to figure it out.  :-) . Besides, I'm not 
good at expressing question in English. (In Practicing...)
> The ubi thread can be reduced to a loop like this one:
> 1. for (;;) {
> 2.      if (kthread_should_stop())
> 3.              break;
> 4.
> 5.      if ( /* no work pending*/ ){
> 6.              set_current_state(TASK_INTERRUPTIBLE);
> 7.              schedule();
> 8.              continue;
> 9.      }
> 10.
> 11.     do_work();
> 12. }
>
> syzcaller found a case where stopping the thread did not work.
> If another task tries to stop the thread while no work is pending and
> the program counter in the thread
> is between lines 5 and 6, the kthread_stop() instruction has no effect.
> It has no effect because the thread sets the thread state to
> interruptible sleep and then schedules away.
>
> This is a common anti-pattern in the Linux kernel, sadly.

Yes, but UBIFS is the exception, my solution looks like UBIFS.

int ubifs_bg_thread(void *info)
{
     while(1) {
         if (kthread_should_stop())
             break;

         set_current_state(TASK_INTERRUPTIBLE);
         if (!c->need_bgt) {
             /*
              * Nothing prevents us from going sleep now and
              * be never woken up and block the task which
              * could wait in 'kthread_stop()' forever.
              */
             if (kthread_should_stop())
                 break;
             schedule();
             continue;
         }
     }
}


>
> Do you agree with me so far or do you think syzcaller found a different issue?
Yes, I agree.
>
> Your patch changes the loop as follows:
> 1. for (;;) {
> 2.      if (kthread_should_stop())
> 3.              break;
> 4.
> 5.      if ( /* no work pending*/ ){
> 6.              set_current_state(TASK_INTERRUPTIBLE);
> 7.
> 8.              if (kthread_should_stop()) {
> 9.                      set_current_state(TASK_RUNNING);
> 10.                     break;
> 11.             }
> 12.
> 13.             schedule();
> 14.             continue;
> 15.     }
> 16.
> 17.     do_work();
> 18. }
>
> That way there is a higher chance that the thread sees the stop flag
> and gracefully terminates, I fully agree on that.
There's no disagreement so far.
> But it does not completely solve the problem.
> If kthread_stop() happens while the program counter of the ubi thread
> is at line 12, the stop flag is still missed
> and we end up in interruptible sleep just like before.

That's where we hold different views. I have 3 viewpoints(You can point 
out which one you disagree.):

1. If kthread_stop() happens at line 12, ubi thread is *marked* with 
stop flag, it will stop at kthread_should_stop() as long as it can reach 
the next iteration.

2. If task A is on runqueue and its state is TASK_RUNNING, task A will 
be scheduled to execute.

3. If kthread_stop() happens at line 12, after program counter going to 
line 14, ubi thead is on runqueue and its state is TASK_RUNNING. I have 
explained this in situation 1 in last session.


I mean ubi thread is on runqueue with TASK_RUNNING state & stop flag 
after the process you described.

Line 12   kthread_stop()

                  set_bit(mark stop flag) && wake_up_process(enqueue && 
set TASK_RUNNING )    => TASK_RUNNING & stop flag & on runqueue

Line 13  schedule()

                  Do nothing but pick next task to execute

>
> So, to solve the problem entirely I suggest changing schedule() to
> schedule_timeout() and let the thread wake up
> periodically.
>



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-07  2:18                 ` Zhihao Cheng
@ 2020-08-07 19:29                   ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-07 19:29 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> That's where we hold different views. I have 3 viewpoints(You can point
> out which one you disagree.):
>
> 1. If kthread_stop() happens at line 12, ubi thread is *marked* with
> stop flag, it will stop at kthread_should_stop() as long as it can reach
> the next iteration.
>
> 2. If task A is on runqueue and its state is TASK_RUNNING, task A will
> be scheduled to execute.
>
> 3. If kthread_stop() happens at line 12, after program counter going to
> line 14, ubi thead is on runqueue and its state is TASK_RUNNING. I have
> explained this in situation 1 in last session.
>
>
> I mean ubi thread is on runqueue with TASK_RUNNING state & stop flag
> after the process you described.
>
> Line 12   kthread_stop()
>
>                   set_bit(mark stop flag) && wake_up_process(enqueue &&
> set TASK_RUNNING )    => TASK_RUNNING & stop flag & on runqueue
>
> Line 13  schedule()
>
>                   Do nothing but pick next task to execute

You are perfectly right! I failed to concentrate on the state changes.
Now all makes sense, also your comment before the if statement.
So I don't know how to make this more clear in the code.
Maybe it's just me being dense and in need for a vacation. ;-)

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-07 19:29                   ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-08-07 19:29 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> That's where we hold different views. I have 3 viewpoints(You can point
> out which one you disagree.):
>
> 1. If kthread_stop() happens at line 12, ubi thread is *marked* with
> stop flag, it will stop at kthread_should_stop() as long as it can reach
> the next iteration.
>
> 2. If task A is on runqueue and its state is TASK_RUNNING, task A will
> be scheduled to execute.
>
> 3. If kthread_stop() happens at line 12, after program counter going to
> line 14, ubi thead is on runqueue and its state is TASK_RUNNING. I have
> explained this in situation 1 in last session.
>
>
> I mean ubi thread is on runqueue with TASK_RUNNING state & stop flag
> after the process you described.
>
> Line 12   kthread_stop()
>
>                   set_bit(mark stop flag) && wake_up_process(enqueue &&
> set TASK_RUNNING )    => TASK_RUNNING & stop flag & on runqueue
>
> Line 13  schedule()
>
>                   Do nothing but pick next task to execute

You are perfectly right! I failed to concentrate on the state changes.
Now all makes sense, also your comment before the if statement.
So I don't know how to make this more clear in the code.
Maybe it's just me being dense and in need for a vacation. ;-)

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-07 19:29                   ` Richard Weinberger
@ 2020-08-08  3:26                     ` Zhihao Cheng
  -1 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-08  3:26 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

在 2020/8/8 3:29, Richard Weinberger 写道:
> On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:

> Maybe it's just me being dense and in need for a vacation. ;-)
> 
I have quite a few ubi/ubifs patches in pending list, may you 
comment/check them before 5.9 ending please? thanks. \( ̄▽ ̄)

For example:

https://patchwork.ozlabs.org/project/linux-mtd/patch/20200601091037.3794172-2-chengzhihao1@huawei.com/

https://patchwork.ozlabs.org/project/linux-mtd/patch/20200602112410.660785-1-chengzhihao1@huawei.com/

https://patchwork.ozlabs.org/project/linux-mtd/cover/20200616071146.2607061-1-chengzhihao1@huawei.com/


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-08-08  3:26                     ` Zhihao Cheng
  0 siblings, 0 replies; 24+ messages in thread
From: Zhihao Cheng @ 2020-08-08  3:26 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

在 2020/8/8 3:29, Richard Weinberger 写道:
> On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:

> Maybe it's just me being dense and in need for a vacation. ;-)
> 
I have quite a few ubi/ubifs patches in pending list, may you 
comment/check them before 5.9 ending please? thanks. \( ̄▽ ̄)

For example:

https://patchwork.ozlabs.org/project/linux-mtd/patch/20200601091037.3794172-2-chengzhihao1@huawei.com/

https://patchwork.ozlabs.org/project/linux-mtd/patch/20200602112410.660785-1-chengzhihao1@huawei.com/

https://patchwork.ozlabs.org/project/linux-mtd/cover/20200616071146.2607061-1-chengzhihao1@huawei.com/


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
  2020-08-08  3:26                     ` Zhihao Cheng
@ 2020-09-13 18:48                       ` Richard Weinberger
  -1 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-09-13 18:48 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: linux-mtd, LKML, Richard Weinberger, zhangyi (F)

On Sat, Aug 8, 2020 at 5:26 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> 在 2020/8/8 3:29, Richard Weinberger 写道:
> > On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> > Maybe it's just me being dense and in need for a vacation. ;-)

Applied to fixes. :-)

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
@ 2020-09-13 18:48                       ` Richard Weinberger
  0 siblings, 0 replies; 24+ messages in thread
From: Richard Weinberger @ 2020-09-13 18:48 UTC (permalink / raw)
  To: Zhihao Cheng; +Cc: Richard Weinberger, linux-mtd, LKML, zhangyi (F)

On Sat, Aug 8, 2020 at 5:26 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
> 在 2020/8/8 3:29, Richard Weinberger 写道:
> > On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> > Maybe it's just me being dense and in need for a vacation. ;-)

Applied to fixes. :-)

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-09-13 18:49 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01  9:12 [PATCH] ubi: check kthread_should_stop() after the setting of task state Zhihao Cheng
2020-06-01  9:12 ` Zhihao Cheng
2020-08-02 21:25 ` Richard Weinberger
2020-08-02 21:25   ` Richard Weinberger
2020-08-03  2:01   ` Zhihao Cheng
2020-08-03  2:01     ` Zhihao Cheng
2020-08-03 22:11     ` Richard Weinberger
2020-08-03 22:11       ` Richard Weinberger
2020-08-04  2:58       ` Zhihao Cheng
2020-08-04  2:58         ` Zhihao Cheng
2020-08-04 21:56         ` Richard Weinberger
2020-08-04 21:56           ` Richard Weinberger
2020-08-05  2:23           ` Zhihao Cheng
2020-08-05  2:23             ` Zhihao Cheng
2020-08-06 20:15             ` Richard Weinberger
2020-08-06 20:15               ` Richard Weinberger
2020-08-07  2:18               ` Zhihao Cheng
2020-08-07  2:18                 ` Zhihao Cheng
2020-08-07 19:29                 ` Richard Weinberger
2020-08-07 19:29                   ` Richard Weinberger
2020-08-08  3:26                   ` Zhihao Cheng
2020-08-08  3:26                     ` Zhihao Cheng
2020-09-13 18:48                     ` Richard Weinberger
2020-09-13 18:48                       ` Richard Weinberger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.