* [PATCH] bpf: fix rq lock recursion issue
@ 2022-06-13  2:52 Satya Durga Srinivasu Prabhala
  2022-06-13  9:22 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-13  2:52 UTC (permalink / raw)
  To: bpf
  Cc: Satya Durga Srinivasu Prabhala, ast, andrii, daniel,
	joannelkoong, toke, brouer

The recursion below is observed in a rare scenario where __schedule()
takes the rq lock and, at around the same time, the task's affinity
is being changed. The BPF program tracing sched_switch calls
migrate_enable(), which checks for an affinity change
(cpus_ptr != cpus_mask) and lands in __set_cpus_allowed_ptr(), which
tries to acquire the rq lock again, causing the recursion bug.

Fix the issue by replacing migrate_enable/disable() with
preempt_enable/disable().

-010 |spin_bug(lock = ???, msg = ???)
-011 |debug_spin_lock_before(inline)
-011 |do_raw_spin_lock(lock = 0xFFFFFF89323BB600)
-012 |_raw_spin_lock(inline)
-012 |raw_spin_rq_lock_nested(inline)
-012 |raw_spin_rq_lock(inline)
-012 |task_rq_lock(p = 0xFFFFFF88CFF1DA00, rf = 0xFFFFFFC03707BBE8)
-013 |__set_cpus_allowed_ptr(inline)
-013 |migrate_enable()
-014 |trace_call_bpf(call = ?, ctx = 0xFFFFFFFDEF954600)
-015 |perf_trace_run_bpf_submit(inline)
-015 |perf_trace_sched_switch(__data = 0xFFFFFFE82CF0BCB8, preempt = FALSE, prev = ?, next = ?)
-016 |__traceiter_sched_switch(inline)
-016 |trace_sched_switch(inline)
-016 |__schedule(sched_mode = ?)
-017 |schedule()
-018 |arch_local_save_flags(inline)
-018 |arch_irqs_disabled(inline)
-018 |__raw_spin_lock_irq(inline)
-018 |_raw_spin_lock_irq(inline)
-018 |worker_thread(__worker = 0xFFFFFF88CE251300)
-019 |kthread(_create = 0xFFFFFF88730A5A80)
-020 |ret_from_fork(asm)

Signed-off-by: Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com>
---
 include/linux/bpf.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 2b914a56a2c5..4ecb065140e9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1414,7 +1414,7 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
 	if (unlikely(!array))
 		return ret;
 
-	migrate_disable();
+	preempt_disable();
 	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 	item = &array->items[0];
 	while ((prog = READ_ONCE(item->prog))) {
@@ -1423,7 +1423,7 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
 		item++;
 	}
 	bpf_reset_run_ctx(old_run_ctx);
-	migrate_enable();
+	preempt_enable();
 	return ret;
 }
 
-- 
2.36.1



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13  2:52 [PATCH] bpf: fix rq lock recursion issue Satya Durga Srinivasu Prabhala
@ 2022-06-13  9:22 ` Toke Høiland-Jørgensen
  2022-06-13 16:17   ` Satya Durga Srinivasu Prabhala
  2022-06-13 16:35   ` Yonghong Song
  0 siblings, 2 replies; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-06-13  9:22 UTC (permalink / raw)
  To: Satya Durga Srinivasu Prabhala, bpf
  Cc: Satya Durga Srinivasu Prabhala, ast, andrii, daniel,
	joannelkoong, brouer

Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> writes:

> Below recursion is observed in a rare scenario where __schedule()
> takes rq lock, at around same time task's affinity is being changed,
> bpf function for tracing sched_switch calls migrate_enabled(),
> checks for affinity change (cpus_ptr != cpus_mask) lands into
> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
> recursion bug.

So this only affects tracing programs that attach to tasks that can have
their affinity changed? Or do we need to review migrate_enable() vs
preempt_enable() for networking hooks as well?

-Toke



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13  9:22 ` Toke Høiland-Jørgensen
@ 2022-06-13 16:17   ` Satya Durga Srinivasu Prabhala
  2022-06-13 16:35   ` Yonghong Song
  1 sibling, 0 replies; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-13 16:17 UTC (permalink / raw)
  To: 'Toke Høiland-Jørgensen', bpf
  Cc: ast, andrii, daniel, joannelkoong, brouer


On 6/13/22 2:22 AM, Toke Høiland-Jørgensen wrote:
> Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> writes:
>
>> Below recursion is observed in a rare scenario where __schedule()
>> takes rq lock, at around same time task's affinity is being changed,
>> bpf function for tracing sched_switch calls migrate_enabled(),
>> checks for affinity change (cpus_ptr != cpus_mask) lands into
>> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
>> recursion bug.
> So this only affects tracing programs that attach to tasks that can have
> their affinity changed?

That's right.

>   Or do we need to review migrate_enable() vs
> preempt_enable() for networking hooks as well?

I believe networking hooks won't take rq locks, so we won't see the issue
with networking hooks as far as I can tell.

>
> -Toke
>



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13  9:22 ` Toke Høiland-Jørgensen
  2022-06-13 16:17   ` Satya Durga Srinivasu Prabhala
@ 2022-06-13 16:35   ` Yonghong Song
  2022-06-13 17:47     ` Satya Durga Srinivasu Prabhala
  1 sibling, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2022-06-13 16:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Satya Durga Srinivasu Prabhala, bpf
  Cc: ast, andrii, daniel, joannelkoong, brouer



On 6/13/22 2:22 AM, Toke Høiland-Jørgensen wrote:
> 
> Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> writes:
> 
>> Below recursion is observed in a rare scenario where __schedule()
>> takes rq lock, at around same time task's affinity is being changed,
>> bpf function for tracing sched_switch calls migrate_enabled(),
>> checks for affinity change (cpus_ptr != cpus_mask) lands into
>> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
>> recursion bug.
> 
> So this only affects tracing programs that attach to tasks that can have
> their affinity changed? Or do we need to review migrate_enable() vs
> preempt_enable() for networking hooks as well?

I think that changing from migrate_disable() to preempt_disable()
won't work on an RT kernel. In fact, the original change from
preempt_disable() to migrate_disable() was triggered by the RT
kernel discussion.

As you mentioned, this is a very special case. Not sure whether we have
a good way to fix it or not. Is it possible to check whether the rq lock
is held when cpus_ptr != cpus_mask? If it is, migrate_disable() (or a
variant of it) could return an error code to indicate it won't work?

> 
> -Toke
> 


* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13 16:35   ` Yonghong Song
@ 2022-06-13 17:47     ` Satya Durga Srinivasu Prabhala
  2022-06-13 21:01       ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-13 17:47 UTC (permalink / raw)
  To: 'Yonghong Song', 'Toke Høiland-Jørgensen', bpf
  Cc: ast, andrii, daniel, joannelkoong, brouer


On 6/13/22 9:35 AM, Yonghong Song wrote:
>
>
> On 6/13/22 2:22 AM, Toke Høiland-Jørgensen wrote:
>>
>> Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> writes:
>>
>>> Below recursion is observed in a rare scenario where __schedule()
>>> takes rq lock, at around same time task's affinity is being changed,
>>> bpf function for tracing sched_switch calls migrate_enabled(),
>>> checks for affinity change (cpus_ptr != cpus_mask) lands into
>>> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
>>> recursion bug.
>>
>> So this only affects tracing programs that attach to tasks that can have
>> their affinity changed? Or do we need to review migrate_enable() vs
>> preempt_enable() for networking hooks as well?
>
> I think that changing from migrate_disable() to preempt_disable()
> won't work from RT kernel. In fact, the original preempt_disable() to
> migrate_disable() is triggered by RT kernel discussion.
>
> As you mentioned, this is a very special case. Not sure whether we have
> a good way to fix it or not. Is it possible we can check whether rq lock
> is held or not under condition cpus_ptr != cpus_mask? If it is,
> migrate_disable() (or a variant of it) should return an error code
> to indicate it won't work?

That essentially becomes using preempt_enable/disable().
If we need migrate_enable/disable() for RT kernels, we can
add a specific check for RT kernels like the one below, which
should fix the issue for non-RT kernels:

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 2b914a56a2c5..ec1a287dbf5e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1414,7 +1414,11 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
         if (unlikely(!array))
                 return ret;

+#ifdef CONFIG_PREEMPT_RT
         migrate_disable();
+#else
+       preempt_disable();
+#endif
         old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
         item = &array->items[0];
         while ((prog = READ_ONCE(item->prog))) {
@@ -1423,7 +1427,11 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
                 item++;
         }
         bpf_reset_run_ctx(old_run_ctx);
+#ifdef CONFIG_PREEMPT_RT
         migrate_enable();
+#else
+       preempt_enable();
+#endif
         return ret;
  }

>
>>
>> -Toke
>>



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13 17:47     ` Satya Durga Srinivasu Prabhala
@ 2022-06-13 21:01       ` Alexei Starovoitov
  2022-06-13 21:35         ` Satya Durga Srinivasu Prabhala
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2022-06-13 21:01 UTC (permalink / raw)
  To: Satya Durga Srinivasu Prabhala
  Cc: Yonghong Song, Toke Høiland-Jørgensen, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Joanne Koong, Jesper Dangaard Brouer

On Mon, Jun 13, 2022 at 10:47 AM Satya Durga Srinivasu Prabhala
<quic_satyap@quicinc.com> wrote:
>
>
> On 6/13/22 9:35 AM, Yonghong Song wrote:
> >
> >
> > On 6/13/22 2:22 AM, Toke Høiland-Jørgensen wrote:
> >>
> >> Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> writes:
> >>
> >>> Below recursion is observed in a rare scenario where __schedule()
> >>> takes rq lock, at around same time task's affinity is being changed,
> >>> bpf function for tracing sched_switch calls migrate_enabled(),
> >>> checks for affinity change (cpus_ptr != cpus_mask) lands into
> >>> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
> >>> recursion bug.
> >>
> >> So this only affects tracing programs that attach to tasks that can have
> >> their affinity changed? Or do we need to review migrate_enable() vs
> >> preempt_enable() for networking hooks as well?
> >
> > I think that changing from migrate_disable() to preempt_disable()
> > won't work from RT kernel. In fact, the original preempt_disable() to
> > migrate_disable() is triggered by RT kernel discussion.
> >
> > As you mentioned, this is a very special case. Not sure whether we have
> > a good way to fix it or not. Is it possible we can check whether rq lock
> > is held or not under condition cpus_ptr != cpus_mask? If it is,
> > migrate_disable() (or a variant of it) should return an error code
> > to indicate it won't work?
>
> That essentially becomes using preempt_enable/disable().
> If we need migrate_enable/disable() for RT kernels, we can
> add specific check for RT Kernels like below which should fix
> issue for non-RT Kernels?
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 2b914a56a2c5..ec1a287dbf5e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1414,7 +1414,11 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
>          if (unlikely(!array))
>                  return ret;
>
> +#ifdef CONFIG_PREEMPT_RT
>          migrate_disable();
> +#else
> +       preempt_disable();
> +#endif
>          old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>          item = &array->items[0];
>          while ((prog = READ_ONCE(item->prog))) {
> @@ -1423,7 +1427,11 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
>                  item++;
>          }
>          bpf_reset_run_ctx(old_run_ctx);
> +#ifdef CONFIG_PREEMPT_RT
>          migrate_enable();
> +#else
> +       preempt_enable();
> +#endif

This doesn't solve anything.
Please provide a reproducer.
iirc the task's affinity change can race even with preemption disabled
on this cpu. Why would s/migrate/preemption/ address the deadlock ?
Once there is a reproducer we need to figure out a different way
of addressing it. Maybe we will special case trace_sched_switch.


* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13 21:01       ` Alexei Starovoitov
@ 2022-06-13 21:35         ` Satya Durga Srinivasu Prabhala
  2022-06-13 21:49           ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-13 21:35 UTC (permalink / raw)
  To: 'Alexei Starovoitov'
  Cc: 'Yonghong Song',
	'Toke Høiland-Jørgensen', 'bpf',
	'Alexei Starovoitov', 'Andrii Nakryiko',
	'Daniel Borkmann', 'Joanne Koong',
	'Jesper Dangaard Brouer'


On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
> This doesn't solve anything.
> Please provide a reproducer.

I'm trying to find an easy way to repro the issue; so far, unsuccessful.

> iirc the task's affinity change can race even with preemption disabled
> on this cpu. Why would s/migrate/preemption/ address the deadlock ?

I don't think the task's affinity change races with preemption
disabled/enabled.

Switching to preempt disable/enable calls helps because it's just a
simple counter increment and decrement with a barrier, whereas with
migrate disable/enable, when the task's affinity changes, we run
into the recursive bug due to the rq lock.

By the way, migrate disable/enable does end up calling preempt
disable/enable as well, please see [1].

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c#n2224




* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13 21:35         ` Satya Durga Srinivasu Prabhala
@ 2022-06-13 21:49           ` Alexei Starovoitov
  2022-06-14  1:10             ` Satya Durga Srinivasu Prabhala
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2022-06-13 21:49 UTC (permalink / raw)
  To: Satya Durga Srinivasu Prabhala
  Cc: Yonghong Song, Toke Høiland-Jørgensen, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Joanne Koong, Jesper Dangaard Brouer

On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
<quic_satyap@quicinc.com> wrote:
>
>
> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
> > This doesn't solve anything.
> > Please provide a reproducer.
>
> I'm trying to find an easy way to repro the issue, so far, unsuccessful.
>
> > iirc the task's affinity change can race even with preemption disabled
> > on this cpu. Why would s/migrate/preemption/ address the deadlock ?
>
> I don't think task's affinity change races with preemption disabled/enabled.
>
> Switching to preemption disable/enable calls helps as it's just simple
> counter increment and decrement with a barrier, but with migrate
> disable/enable when task's affinity changes, we run into recursive bug
> due to rq lock.

As Yonghong already explained, replacing migrate_disable
with preempt_disable around bpf prog invocation is not an option.


* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-13 21:49           ` Alexei Starovoitov
@ 2022-06-14  1:10             ` Satya Durga Srinivasu Prabhala
  2022-06-14  6:09               ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-14  1:10 UTC (permalink / raw)
  To: 'Alexei Starovoitov'
  Cc: 'Yonghong Song',
	'Toke Høiland-Jørgensen', 'bpf',
	'Alexei Starovoitov', 'Andrii Nakryiko',
	'Daniel Borkmann', 'Joanne Koong',
	'Jesper Dangaard Brouer'


On 6/13/22 2:49 PM, Alexei Starovoitov wrote:
> On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
> <quic_satyap@quicinc.com> wrote:
>>
>> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
>>> This doesn't solve anything.
>>> Please provide a reproducer.
>> I'm trying to find an easy way to repro the issue, so far, unsuccessful.
>>
>>> iirc the task's affinity change can race even with preemption disabled
>>> on this cpu. Why would s/migrate/preemption/ address the deadlock ?
>> I don't think task's affinity change races with preemption disabled/enabled.
>>
>> Switching to preemption disable/enable calls helps as it's just simple
>> counter increment and decrement with a barrier, but with migrate
>> disable/enable when task's affinity changes, we run into recursive bug
>> due to rq lock.
> As Yonghong already explained, replacing migrate_disable
> with preempt_disable around bpf prog invocation is not an option.

If I understand correctly, Yonghong's point was that replacing migrate_
with preempt_ won't work for RT kernels, and that the migrate_ APIs were
introduced for RT kernels. I went back and cross-checked on the 5.10 LTS
kernel, and I see that the migrate_ calls end up just calling into the
preempt_ calls [1]. So for non-RT kernels, sticking to preempt_ calls
should still work. Please let me know if I missed anything.

[1]
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/include/linux/preempt.h#335



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-14  1:10             ` Satya Durga Srinivasu Prabhala
@ 2022-06-14  6:09               ` Yonghong Song
  2022-06-24  6:56                 ` Satya Durga Srinivasu Prabhala
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2022-06-14  6:09 UTC (permalink / raw)
  To: Satya Durga Srinivasu Prabhala, 'Alexei Starovoitov'
  Cc: 'Toke Høiland-Jørgensen', 'bpf',
	'Alexei Starovoitov', 'Andrii Nakryiko',
	'Daniel Borkmann', 'Joanne Koong',
	'Jesper Dangaard Brouer'



On 6/13/22 6:10 PM, Satya Durga Srinivasu Prabhala wrote:
> 
> On 6/13/22 2:49 PM, Alexei Starovoitov wrote:
>> On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
>> <quic_satyap@quicinc.com> wrote:
>>>
>>> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
>>>> This doesn't solve anything.
>>>> Please provide a reproducer.
>>> I'm trying to find an easy way to repro the issue, so far, unsuccessful.
>>>
>>>> iirc the task's affinity change can race even with preemption disabled
>>>> on this cpu. Why would s/migrate/preemption/ address the deadlock ?
>>> I don't think task's affinity change races with preemption 
>>> disabled/enabled.
>>>
>>> Switching to preemption disable/enable calls helps as it's just simple
>>> counter increment and decrement with a barrier, but with migrate
>>> disable/enable when task's affinity changes, we run into recursive bug
>>> due to rq lock.
>> As Yonghong already explained, replacing migrate_disable
>> with preempt_disable around bpf prog invocation is not an option.
> 
> If I understand correctly, Yonghong mentioned that replacing migrate_
> with preempt_ won't work for RT Kernels and migrate_ APIs were introduced
> for RT Kernels is what he was pointing to. I went back and cross checked
> on 5.10 LTS Kernel, I see that the migrate_ calls end up just calling into
> preemt_ calls [1]. So far non-RT kernels, sticking to preemt_ calls should
> still work. Please let me know if I missed anything.

Yes, the old kernel's migrate_disable/enable() implementation, which
was simply preempt_disable/enable(), was transitional. You can check
the 5.12 kernel's migrate_disable/enable() implementation. Note that
your patch, if accepted, will apply to the latest kernel. So we
cannot simply replace migrate_disable() with preempt_disable(),
which won't work for an RT kernel.

> 
> [1]
> https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/include/linux/preempt.h#335 
> 


* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-14  6:09               ` Yonghong Song
@ 2022-06-24  6:56                 ` Satya Durga Srinivasu Prabhala
  2022-06-24 16:46                   ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Satya Durga Srinivasu Prabhala @ 2022-06-24  6:56 UTC (permalink / raw)
  To: 'Yonghong Song', 'Alexei Starovoitov'
  Cc: 'Toke Høiland-Jørgensen', 'bpf',
	'Alexei Starovoitov', 'Andrii Nakryiko',
	'Daniel Borkmann', 'Joanne Koong',
	'Jesper Dangaard Brouer'


On 6/13/22 11:09 PM, Yonghong Song wrote:
>
>
> On 6/13/22 6:10 PM, Satya Durga Srinivasu Prabhala wrote:
>>
>> On 6/13/22 2:49 PM, Alexei Starovoitov wrote:
>>> On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
>>> <quic_satyap@quicinc.com> wrote:
>>>>
>>>> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
> >>>>> This doesn't solve anything.
>>>>> Please provide a reproducer.
>>>> I'm trying to find an easy way to repro the issue, so far, 
>>>> unsuccessful.
>>>>
>>>>> iirc the task's affinity change can race even with preemption 
>>>>> disabled
>>>>> on this cpu. Why would s/migrate/preemption/ address the deadlock ?
>>>> I don't think task's affinity change races with preemption 
>>>> disabled/enabled.
>>>>
>>>> Switching to preemption disable/enable calls helps as it's just simple
>>>> counter increment and decrement with a barrier, but with migrate
>>>> disable/enable when task's affinity changes, we run into recursive bug
>>>> due to rq lock.
>>> As Yonghong already explained, replacing migrate_disable
>>> with preempt_disable around bpf prog invocation is not an option.
>>
>> If I understand correctly, Yonghong mentioned that replacing migrate_
>> with preempt_ won't work for RT Kernels and migrate_ APIs were 
>> introduced
>> for RT Kernels is what he was pointing to. I went back and cross checked
>> on 5.10 LTS Kernel, I see that the migrate_ calls end up just calling 
>> into
>> preemt_ calls [1]. So far non-RT kernels, sticking to preemt_ calls 
>> should
>> still work. Please let me know if I missed anything.
>
> Yes, old kernel migrate_disable/enable() implementation with
> simply preempt_disable/enable() are transitional. You can check
> 5.12 kernel migrate_disable/enable() implementation. Note that
> your patch, if accepted, will apply to the latest kernel. So we
> cannot simply replace migrate_disable() with prempt_disable(),
> which won't work for RT kernel.

Thanks for getting back, and apologies for the delay. I understand that
we may break RT kernels with this patch. So I was proposing to stick to
migrate_disable/enable() calls on RT kernels and use preempt_disable/enable()
on non-RT kernels. That warrants a change in the scheduler; I will push a
patch and try to get feedback from the scheduler experts.

While I'm here, I would like to check with you experts on ideas to
reproduce the issue easily and consistently. Your inputs will immensely
help to debug the issue further.
>
>>
>> [1]
>> https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/include/linux/preempt.h#335 
>>



* Re: [PATCH] bpf: fix rq lock recursion issue
  2022-06-24  6:56                 ` Satya Durga Srinivasu Prabhala
@ 2022-06-24 16:46                   ` Alexei Starovoitov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexei Starovoitov @ 2022-06-24 16:46 UTC (permalink / raw)
  To: Satya Durga Srinivasu Prabhala
  Cc: Yonghong Song, Toke Høiland-Jørgensen, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Joanne Koong, Jesper Dangaard Brouer

On Thu, Jun 23, 2022 at 11:56 PM Satya Durga Srinivasu Prabhala
<quic_satyap@quicinc.com> wrote:
>
>
> On 6/13/22 11:09 PM, Yonghong Song wrote:
> >
> >
> > On 6/13/22 6:10 PM, Satya Durga Srinivasu Prabhala wrote:
> >>
> >> On 6/13/22 2:49 PM, Alexei Starovoitov wrote:
> >>> On Mon, Jun 13, 2022 at 2:35 PM Satya Durga Srinivasu Prabhala
> >>> <quic_satyap@quicinc.com> wrote:
> >>>>
> >>>> On 6/13/22 2:01 PM, Alexei Starovoitov wrote:
> >>>>> This doesn't solve anything.
> >>>>> Please provide a reproducer.
> >>>> I'm trying to find an easy way to repro the issue, so far,
> >>>> unsuccessful.
> >>>>
> >>>>> iirc the task's affinity change can race even with preemption
> >>>>> disabled
> >>>>> on this cpu. Why would s/migrate/preemption/ address the deadlock ?
> >>>> I don't think task's affinity change races with preemption
> >>>> disabled/enabled.
> >>>>
> >>>> Switching to preemption disable/enable calls helps as it's just simple
> >>>> counter increment and decrement with a barrier, but with migrate
> >>>> disable/enable when task's affinity changes, we run into recursive bug
> >>>> due to rq lock.
> >>> As Yonghong already explained, replacing migrate_disable
> >>> with preempt_disable around bpf prog invocation is not an option.
> >>
> >> If I understand correctly, Yonghong mentioned that replacing migrate_
> >> with preempt_ won't work for RT Kernels and migrate_ APIs were
> >> introduced
> >> for RT Kernels is what he was pointing to. I went back and cross checked
> >> on 5.10 LTS Kernel, I see that the migrate_ calls end up just calling
> >> into
> >> preemt_ calls [1]. So far non-RT kernels, sticking to preemt_ calls
> >> should
> >> still work. Please let me know if I missed anything.
> >
> > Yes, old kernel migrate_disable/enable() implementation with
> > simply preempt_disable/enable() are transitional. You can check
> > 5.12 kernel migrate_disable/enable() implementation. Note that
> > your patch, if accepted, will apply to the latest kernel. So we
> > cannot simply replace migrate_disable() with prempt_disable(),
> > which won't work for RT kernel.
>
> Thanks for getting back and apologies for the delay. I understand that
> we may break RT kernels with this patch. So, I was proposing to stick to
> migrate_disable/enable() calls on RT Kernels and use preempt_disable/enable()
> in case of non-RT Kernel. Which warrants change in scheduler, will push
> patch and try get feedback from Scheduler experts.

We will stay with migrate_disable/enable on all types of kernel.
This is not negotiable.

> While I'm here I would like to cross check with you xperts on ideas to
> reproduce the issue easily and consistently. Your inputs will immensely help to
> debug issue further.

Please do. No patches will be applied until there is a clear reproducer.
