linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH kernel] srcu: Fix static initialization
@ 2020-09-08 14:43 Alexey Kardashevskiy
  2020-09-09  9:24 ` Alexey Kardashevskiy
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2020-09-08 14:43 UTC (permalink / raw)
  To: rcu
  Cc: Alexey Kardashevskiy, Paul E. McKenney, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Steven Rostedt, linux-kernel

init_srcu_struct_nodes() is called with is_static==true only internally
and when this happens, the srcu->sda is not initialized in
init_srcu_struct_fields() and we crash on dereferencing @sdp.

This fixes the crash by moving "if (is_static)" out of the loop which
only does useful work for is_static=false case anyway.

Found by syzkaller.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 kernel/rcu/srcutree.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c100acf332ed..49b54a50bde8 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
 				   levelspread[level - 1];
 	}
 
+	if (is_static)
+		return;
+
 	/*
 	 * Initialize the per-CPU srcu_data array, which feeds into the
 	 * leaves of the srcu_node tree.
@@ -161,8 +164,6 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
 		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
 		sdp->ssp = ssp;
 		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
-		if (is_static)
-			continue;
 
 		/* Dynamically allocated, better be no srcu_read_locks()! */
 		for (i = 0; i < ARRAY_SIZE(sdp->srcu_lock_count); i++) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-08 14:43 [PATCH kernel] srcu: Fix static initialization Alexey Kardashevskiy
@ 2020-09-09  9:24 ` Alexey Kardashevskiy
  2020-09-09 11:50   ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2020-09-09  9:24 UTC (permalink / raw)
  To: rcu
  Cc: Paul E. McKenney, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, Steven Rostedt, linux-kernel



On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
> init_srcu_struct_nodes() is called with is_static==true only internally
> and when this happens, the srcu->sda is not initialized in
> init_srcu_struct_fields() and we crash on dereferencing @sdp.
> 
> This fixes the crash by moving "if (is_static)" out of the loop which
> only does useful work for is_static=false case anyway.
> 
> Found by syzkaller.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  kernel/rcu/srcutree.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index c100acf332ed..49b54a50bde8 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
>  				   levelspread[level - 1];
>  	}
>  
> +	if (is_static)
> +		return;

Actually, this is needed here too:

 if (!ssp->sda)
         return;

as
ssp->sda = alloc_percpu(struct srcu_data)

can fail if the process is killed too soon - it is quite easy to get
this situation with syzkaller (syscalls fuzzer)

Makes sense?


> +
>  	/*
>  	 * Initialize the per-CPU srcu_data array, which feeds into the
>  	 * leaves of the srcu_node tree.
> @@ -161,8 +164,6 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
>  		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
>  		sdp->ssp = ssp;
>  		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
> -		if (is_static)
> -			continue;
>  
>  		/* Dynamically allocated, better be no srcu_read_locks()! */
>  		for (i = 0; i < ARRAY_SIZE(sdp->srcu_lock_count); i++) {
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-09  9:24 ` Alexey Kardashevskiy
@ 2020-09-09 11:50   ` Paul E. McKenney
  2020-09-09 12:31     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-09 11:50 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel

On Wed, Sep 09, 2020 at 07:24:11PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
> > init_srcu_struct_nodes() is called with is_static==true only internally
> > and when this happens, the srcu->sda is not initialized in
> > init_srcu_struct_fields() and we crash on dereferencing @sdp.
> > 
> > This fixes the crash by moving "if (is_static)" out of the loop which
> > only does useful work for is_static=false case anyway.
> > 
> > Found by syzkaller.
> > 
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> >  kernel/rcu/srcutree.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index c100acf332ed..49b54a50bde8 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
> >  				   levelspread[level - 1];
> >  	}
> >  
> > +	if (is_static)
> > +		return;
> 
> Actually, this is needed here too:
> 
>  if (!ssp->sda)
>          return;
> 
> as
> ssp->sda = alloc_percpu(struct srcu_data)
> 
> can fail if the process is killed too soon - it is quite easy to get
> this situation with syzkaller (syscalls fuzzer)
> 
> Makes sense?

Just to make sure that I understand, these failures occur when the task
running init_srcu_struct_nodes() is killed, correct?

Or has someone managed to invoke (say) synchronize_srcu() on a
dynamically allocated srcu_struct before invoking init_srcu_struct() on
that srcu_struct?  This would be an SRCU usage bug.  If you dynamically
allocate your srcu_struct, you are absolutely required to invoke
init_srcu_struct() on it before doing anything else with it.

Or am I missing something here?

(The rcutorture test suite does test both static and dynamic allocation
of the srcu_struct, so I am expecting something a bit subtle here.)

							Thanx, Paul

> > +
> >  	/*
> >  	 * Initialize the per-CPU srcu_data array, which feeds into the
> >  	 * leaves of the srcu_node tree.
> > @@ -161,8 +164,6 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
> >  		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
> >  		sdp->ssp = ssp;
> >  		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
> > -		if (is_static)
> > -			continue;
> >  
> >  		/* Dynamically allocated, better be no srcu_read_locks()! */
> >  		for (i = 0; i < ARRAY_SIZE(sdp->srcu_lock_count); i++) {
> > 
> 
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-09 11:50   ` Paul E. McKenney
@ 2020-09-09 12:31     ` Alexey Kardashevskiy
  2020-09-10 18:53       ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2020-09-09 12:31 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel



On 09/09/2020 21:50, Paul E. McKenney wrote:
> On Wed, Sep 09, 2020 at 07:24:11PM +1000, Alexey Kardashevskiy wrote:
>>
>>
>> On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
>>> init_srcu_struct_nodes() is called with is_static==true only internally
>>> and when this happens, the srcu->sda is not initialized in
>>> init_srcu_struct_fields() and we crash on dereferencing @sdp.
>>>
>>> This fixes the crash by moving "if (is_static)" out of the loop which
>>> only does useful work for is_static=false case anyway.
>>>
>>> Found by syzkaller.
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>>  kernel/rcu/srcutree.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>>> index c100acf332ed..49b54a50bde8 100644
>>> --- a/kernel/rcu/srcutree.c
>>> +++ b/kernel/rcu/srcutree.c
>>> @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
>>>  				   levelspread[level - 1];
>>>  	}
>>>  
>>> +	if (is_static)
>>> +		return;
>>
>> Actually, this is needed here too:
>>
>>  if (!ssp->sda)
>>          return;
>>
>> as
>> ssp->sda = alloc_percpu(struct srcu_data)
>>
>> can fail if the process is killed too soon - it is quite easy to get
>> this situation with syzkaller (syscalls fuzzer)
>>
>> Makes sense?
> 
> Just to make sure that I understand, these failures occur when the task
> running init_srcu_struct_nodes() is killed, correct?

There are multiple user tasks (8) which open /dev/kvm, do 1 ioctl -
KVM_CREATE_VM - and terminate, running on 8 vcpus, 8 VMs, crashes every
20min or so, less tasks or vcpus - and the problem does not appear.


> 
> Or has someone managed to invoke (say) synchronize_srcu() on a
> dynamically allocated srcu_struct before invoking init_srcu_struct() on
> that srcu_struct?  

Nah, none of that :)

init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
===
	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
			return NULL;
===

I am still up to reading that osr-rcuusage.pdf to provide better
analysis :) Thanks,


> This would be an SRCU usage bug.  If you dynamically
> allocate your srcu_struct, you are absolutely required to invoke
> init_srcu_struct() on it before doing anything else with it.
> 
> Or am I missing something here?
> 
> (The rcutorture test suite does test both static and dynamic allocation
> of the srcu_struct, so I am expecting something a bit subtle here.)
> 
> 							Thanx, Paul
> 
>>> +
>>>  	/*
>>>  	 * Initialize the per-CPU srcu_data array, which feeds into the
>>>  	 * leaves of the srcu_node tree.
>>> @@ -161,8 +164,6 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
>>>  		timer_setup(&sdp->delay_work, srcu_delay_timer, 0);
>>>  		sdp->ssp = ssp;
>>>  		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
>>> -		if (is_static)
>>> -			continue;
>>>  
>>>  		/* Dynamically allocated, better be no srcu_read_locks()! */
>>>  		for (i = 0; i < ARRAY_SIZE(sdp->srcu_lock_count); i++) {
>>>
>>
>> -- 
>> Alexey

-- 
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-09 12:31     ` Alexey Kardashevskiy
@ 2020-09-10 18:53       ` Paul E. McKenney
  2020-09-11  5:09         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-10 18:53 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel

On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 09/09/2020 21:50, Paul E. McKenney wrote:
> > On Wed, Sep 09, 2020 at 07:24:11PM +1000, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
> >>> init_srcu_struct_nodes() is called with is_static==true only internally
> >>> and when this happens, the srcu->sda is not initialized in
> >>> init_srcu_struct_fields() and we crash on dereferencing @sdp.
> >>>
> >>> This fixes the crash by moving "if (is_static)" out of the loop which
> >>> only does useful work for is_static=false case anyway.
> >>>
> >>> Found by syzkaller.
> >>>
> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> ---
> >>>  kernel/rcu/srcutree.c | 5 +++--
> >>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >>> index c100acf332ed..49b54a50bde8 100644
> >>> --- a/kernel/rcu/srcutree.c
> >>> +++ b/kernel/rcu/srcutree.c
> >>> @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
> >>>  				   levelspread[level - 1];
> >>>  	}
> >>>  
> >>> +	if (is_static)
> >>> +		return;
> >>
> >> Actually, this is needed here too:
> >>
> >>  if (!ssp->sda)
> >>          return;
> >>
> >> as
> >> ssp->sda = alloc_percpu(struct srcu_data)
> >>
> >> can fail if the process is killed too soon - it is quite easy to get
> >> this situation with syzkaller (syscalls fuzzer)
> >>
> >> Makes sense?
> > 
> > Just to make sure that I understand, these failures occur when the task
> > running init_srcu_struct_nodes() is killed, correct?
> 
> There are multiple user tasks (8) which open /dev/kvm, do 1 ioctl -
> KVM_CREATE_VM - and terminate, running on 8 vcpus, 8 VMs, crashes every
> 20min or so, less tasks or vcpus - and the problem does not appear.
> 
> 
> > 
> > Or has someone managed to invoke (say) synchronize_srcu() on a
> > dynamically allocated srcu_struct before invoking init_srcu_struct() on
> > that srcu_struct?  
> 
> Nah, none of that :)
> 
> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
> here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
> ===
> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
> 			return NULL;
> ===
> 
> I am still up to reading that osr-rcuusage.pdf to provide better
> analysis :) Thanks,

Ah, got it!  Does the following patch help?

There will likely also need to be changes to cleanup_srcu_struct(),
but first let's see if I understand the problem.  ;-)

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c13348e..6f7880a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
+	if (!ssp->sda)
+		return -ENOMEM;
 	init_srcu_struct_nodes(ssp, is_static);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
-	return ssp->sda ? 0 : -ENOMEM;
+	return 0;
 }
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-10 18:53       ` Paul E. McKenney
@ 2020-09-11  5:09         ` Alexey Kardashevskiy
  2020-09-11 13:52           ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2020-09-11  5:09 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel



On 11/09/2020 04:53, Paul E. McKenney wrote:
> On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
>>
>>
>> On 09/09/2020 21:50, Paul E. McKenney wrote:
>>> On Wed, Sep 09, 2020 at 07:24:11PM +1000, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
>>>>> init_srcu_struct_nodes() is called with is_static==true only internally
>>>>> and when this happens, the srcu->sda is not initialized in
>>>>> init_srcu_struct_fields() and we crash on dereferencing @sdp.
>>>>>
>>>>> This fixes the crash by moving "if (is_static)" out of the loop which
>>>>> only does useful work for is_static=false case anyway.
>>>>>
>>>>> Found by syzkaller.
>>>>>
>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>> ---
>>>>>  kernel/rcu/srcutree.c | 5 +++--
>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>>>>> index c100acf332ed..49b54a50bde8 100644
>>>>> --- a/kernel/rcu/srcutree.c
>>>>> +++ b/kernel/rcu/srcutree.c
>>>>> @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
>>>>>  				   levelspread[level - 1];
>>>>>  	}
>>>>>  
>>>>> +	if (is_static)
>>>>> +		return;
>>>>
>>>> Actually, this is needed here too:
>>>>
>>>>  if (!ssp->sda)
>>>>          return;
>>>>
>>>> as
>>>> ssp->sda = alloc_percpu(struct srcu_data)
>>>>
>>>> can fail if the process is killed too soon - it is quite easy to get
>>>> this situation with syzkaller (syscalls fuzzer)
>>>>
>>>> Makes sense?
>>>
>>> Just to make sure that I understand, these failures occur when the task
>>> running init_srcu_struct_nodes() is killed, correct?
>>
>> There are multiple user tasks (8) which open /dev/kvm, do 1 ioctl -
>> KVM_CREATE_VM - and terminate, running on 8 vcpus, 8 VMs, crashes every
>> 20min or so, less tasks or vcpus - and the problem does not appear.
>>
>>
>>>
>>> Or has someone managed to invoke (say) synchronize_srcu() on a
>>> dynamically allocated srcu_struct before invoking init_srcu_struct() on
>>> that srcu_struct?  
>>
>> Nah, none of that :)
>>
>> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
>> here:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
>> ===
>> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
>> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
>> 			return NULL;
>> ===
>>
>> I am still up to reading that osr-rcuusage.pdf to provide better
>> analysis :) Thanks,
> 
> Ah, got it!  Does the following patch help?
> 
> There will likely also need to be changes to cleanup_srcu_struct(),
> but first let's see if I understand the problem.  ;-)
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index c13348e..6f7880a 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>  	INIT_DELAYED_WORK(&ssp->work, process_srcu);
>  	if (!is_static)
>  		ssp->sda = alloc_percpu(struct srcu_data);
> +	if (!ssp->sda)
> +		return -ENOMEM;
>  	init_srcu_struct_nodes(ssp, is_static);
>  	ssp->srcu_gp_seq_needed_exp = 0;
>  	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
>  	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */


The line above confuses me a bit. What you propose returns without
smp_store_release() called which should not matter I suppose.

Otherwise it should work, although I cannot verify right now as my box
went down and since it is across Pacific - it may take time to power
cycle it :) Thanks,


> -	return ssp->sda ? 0 : -ENOMEM;
> +	return 0;
>  }
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-11  5:09         ` Alexey Kardashevskiy
@ 2020-09-11 13:52           ` Paul E. McKenney
  2020-09-16 16:12             ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-11 13:52 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel

On Fri, Sep 11, 2020 at 03:09:41PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 11/09/2020 04:53, Paul E. McKenney wrote:
> > On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 09/09/2020 21:50, Paul E. McKenney wrote:
> >>> On Wed, Sep 09, 2020 at 07:24:11PM +1000, Alexey Kardashevskiy wrote:
> >>>>
> >>>>
> >>>> On 09/09/2020 00:43, Alexey Kardashevskiy wrote:
> >>>>> init_srcu_struct_nodes() is called with is_static==true only internally
> >>>>> and when this happens, the srcu->sda is not initialized in
> >>>>> init_srcu_struct_fields() and we crash on dereferencing @sdp.
> >>>>>
> >>>>> This fixes the crash by moving "if (is_static)" out of the loop which
> >>>>> only does useful work for is_static=false case anyway.
> >>>>>
> >>>>> Found by syzkaller.
> >>>>>
> >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>> ---
> >>>>>  kernel/rcu/srcutree.c | 5 +++--
> >>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >>>>> index c100acf332ed..49b54a50bde8 100644
> >>>>> --- a/kernel/rcu/srcutree.c
> >>>>> +++ b/kernel/rcu/srcutree.c
> >>>>> @@ -135,6 +135,9 @@ static void init_srcu_struct_nodes(struct srcu_struct *ssp, bool is_static)
> >>>>>  				   levelspread[level - 1];
> >>>>>  	}
> >>>>>  
> >>>>> +	if (is_static)
> >>>>> +		return;
> >>>>
> >>>> Actually, this is needed here too:
> >>>>
> >>>>  if (!ssp->sda)
> >>>>          return;
> >>>>
> >>>> as
> >>>> ssp->sda = alloc_percpu(struct srcu_data)
> >>>>
> >>>> can fail if the process is killed too soon - it is quite easy to get
> >>>> this situation with syzkaller (syscalls fuzzer)
> >>>>
> >>>> Makes sense?
> >>>
> >>> Just to make sure that I understand, these failures occur when the task
> >>> running init_srcu_struct_nodes() is killed, correct?
> >>
> >> There are multiple user tasks (8) which open /dev/kvm, do 1 ioctl -
> >> KVM_CREATE_VM - and terminate, running on 8 vcpus, 8 VMs, crashes every
> >> 20min or so, less tasks or vcpus - and the problem does not appear.
> >>
> >>
> >>>
> >>> Or has someone managed to invoke (say) synchronize_srcu() on a
> >>> dynamically allocated srcu_struct before invoking init_srcu_struct() on
> >>> that srcu_struct?  
> >>
> >> Nah, none of that :)
> >>
> >> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
> >> here:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
> >> ===
> >> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
> >> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
> >> 			return NULL;
> >> ===
> >>
> >> I am still up to reading that osr-rcuusage.pdf to provide better
> >> analysis :) Thanks,
> > 
> > Ah, got it!  Does the following patch help?
> > 
> > There will likely also need to be changes to cleanup_srcu_struct(),
> > but first let's see if I understand the problem.  ;-)
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index c13348e..6f7880a 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >  	INIT_DELAYED_WORK(&ssp->work, process_srcu);
> >  	if (!is_static)
> >  		ssp->sda = alloc_percpu(struct srcu_data);
> > +	if (!ssp->sda)
> > +		return -ENOMEM;
> >  	init_srcu_struct_nodes(ssp, is_static);
> >  	ssp->srcu_gp_seq_needed_exp = 0;
> >  	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> >  	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
> 
> The line above confuses me a bit. What you propose returns without
> smp_store_release() called which should not matter I suppose.

The idea is that if init_srcu_struct() returns -ENOMEM, the structure
has not been initialized and had better not be used.  If the calling code
cannot handle that outcome, then the calling code needs to do something
to insulate init_srcu_struct() from signals.  One thing that it could
do would be to invoke init_srcu_struct() from a workqueue handler and
wait for this handler to complete.

Please keep in mind that there is nothing init_srcu_struct() can do
about this:  The srcu_struct is useless unless alloc_percpu() succeeds.

And yes, I do need to update the header comments to make this clear.

> Otherwise it should work, although I cannot verify right now as my box
> went down and since it is across Pacific - it may take time to power
> cycle it :) Thanks,

I know that feeling!  And here is hoping that the box is out of reach
of the local hot spots.  ;-)

							Thanx, Paul

> > -	return ssp->sda ? 0 : -ENOMEM;
> > +	return 0;
> >  }
> >  
> >  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > 
> 
> -- 
> Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-11 13:52           ` Paul E. McKenney
@ 2020-09-16 16:12             ` Paul E. McKenney
  2020-09-22  0:41               ` Alexey Kardashevskiy
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-16 16:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel

On Fri, Sep 11, 2020 at 06:52:08AM -0700, Paul E. McKenney wrote:
> On Fri, Sep 11, 2020 at 03:09:41PM +1000, Alexey Kardashevskiy wrote:
> > On 11/09/2020 04:53, Paul E. McKenney wrote:
> > > On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:

[ . . . ]

> > >> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
> > >> here:
> > >>
> > >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
> > >> ===
> > >> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
> > >> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
> > >> 			return NULL;
> > >> ===
> > >>
> > >> I am still up to reading that osr-rcuusage.pdf to provide better
> > >> analysis :) Thanks,
> > > 
> > > Ah, got it!  Does the following patch help?
> > > 
> > > There will likely also need to be changes to cleanup_srcu_struct(),
> > > but first let's see if I understand the problem.  ;-)
> > > 
> > > 							Thanx, Paul
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > > index c13348e..6f7880a 100644
> > > --- a/kernel/rcu/srcutree.c
> > > +++ b/kernel/rcu/srcutree.c
> > > @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> > >  	INIT_DELAYED_WORK(&ssp->work, process_srcu);
> > >  	if (!is_static)
> > >  		ssp->sda = alloc_percpu(struct srcu_data);
> > > +	if (!ssp->sda)
> > > +		return -ENOMEM;
> > >  	init_srcu_struct_nodes(ssp, is_static);
> > >  	ssp->srcu_gp_seq_needed_exp = 0;
> > >  	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> > >  	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
> > 
> > The line above confuses me a bit. What you propose returns without
> > smp_store_release() called which should not matter I suppose.
> 
> The idea is that if init_srcu_struct() returns -ENOMEM, the structure
> has not been initialized and had better not be used.  If the calling code
> cannot handle that outcome, then the calling code needs to do something
> to insulate init_srcu_struct() from signals.  One thing that it could
> do would be to invoke init_srcu_struct() from a workqueue handler and
> wait for this handler to complete.
> 
> Please keep in mind that there is nothing init_srcu_struct() can do
> about this:  The srcu_struct is useless unless alloc_percpu() succeeds.
> 
> And yes, I do need to update the header comments to make this clear.
> 
> > Otherwise it should work, although I cannot verify right now as my box
> > went down and since it is across Pacific - it may take time to power
> > cycle it :) Thanks,
> 
> I know that feeling!  And here is hoping that the box is out of reach
> of the local hot spots.  ;-)

Just following up...  Did that patch help?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-16 16:12             ` Paul E. McKenney
@ 2020-09-22  0:41               ` Alexey Kardashevskiy
  2020-09-22 22:00                 ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2020-09-22  0:41 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel



On 17/09/2020 02:12, Paul E. McKenney wrote:
> On Fri, Sep 11, 2020 at 06:52:08AM -0700, Paul E. McKenney wrote:
>> On Fri, Sep 11, 2020 at 03:09:41PM +1000, Alexey Kardashevskiy wrote:
>>> On 11/09/2020 04:53, Paul E. McKenney wrote:
>>>> On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
> 
> [ . . . ]
> 
>>>>> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
>>>>> here:
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
>>>>> ===
>>>>> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
>>>>> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
>>>>> 			return NULL;
>>>>> ===
>>>>>
>>>>> I am still up to reading that osr-rcuusage.pdf to provide better
>>>>> analysis :) Thanks,
>>>>
>>>> Ah, got it!  Does the following patch help?
>>>>
>>>> There will likely also need to be changes to cleanup_srcu_struct(),
>>>> but first let's see if I understand the problem.  ;-)
>>>>
>>>> 							Thanx, Paul
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>>>> index c13348e..6f7880a 100644
>>>> --- a/kernel/rcu/srcutree.c
>>>> +++ b/kernel/rcu/srcutree.c
>>>> @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>>>>  	INIT_DELAYED_WORK(&ssp->work, process_srcu);
>>>>  	if (!is_static)
>>>>  		ssp->sda = alloc_percpu(struct srcu_data);
>>>> +	if (!ssp->sda)
>>>> +		return -ENOMEM;
>>>>  	init_srcu_struct_nodes(ssp, is_static);
>>>>  	ssp->srcu_gp_seq_needed_exp = 0;
>>>>  	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
>>>>  	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
>>>
>>> The line above confuses me a bit. What you propose returns without
>>> smp_store_release() called which should not matter I suppose.
>>
>> The idea is that if init_srcu_struct() returns -ENOMEM, the structure
>> has not been initialized and had better not be used.  If the calling code
>> cannot handle that outcome, then the calling code needs to do something
>> to insulate init_srcu_struct() from signals.  One thing that it could
>> do would be to invoke init_srcu_struct() from a workqueue handler and
>> wait for this handler to complete.
>>
>> Please keep in mind that there is nothing init_srcu_struct() can do
>> about this:  The srcu_struct is useless unless alloc_percpu() succeeds.
>>
>> And yes, I do need to update the header comments to make this clear.
>>
>>> Otherwise it should work, although I cannot verify right now as my box
>>> went down and since it is across Pacific - it may take time to power
>>> cycle it :) Thanks,
>>
>> I know that feeling!  And here is hoping that the box is out of reach
>> of the local hot spots.  ;-)
> 
> Just following up...  Did that patch help?


Yes it did.

Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> 
> 							Thanx, Paul
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH kernel] srcu: Fix static initialization
  2020-09-22  0:41               ` Alexey Kardashevskiy
@ 2020-09-22 22:00                 ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-22 22:00 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: rcu, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Steven Rostedt, linux-kernel

On Tue, Sep 22, 2020 at 10:41:37AM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 17/09/2020 02:12, Paul E. McKenney wrote:
> > On Fri, Sep 11, 2020 at 06:52:08AM -0700, Paul E. McKenney wrote:
> >> On Fri, Sep 11, 2020 at 03:09:41PM +1000, Alexey Kardashevskiy wrote:
> >>> On 11/09/2020 04:53, Paul E. McKenney wrote:
> >>>> On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
> > 
> > [ . . . ]
> > 
> >>>>> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
> >>>>> here:
> >>>>>
> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
> >>>>> ===
> >>>>> 	} else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
> >>>>> 			pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
> >>>>> 			return NULL;
> >>>>> ===
> >>>>>
> >>>>> I am still up to reading that osr-rcuusage.pdf to provide better
> >>>>> analysis :) Thanks,
> >>>>
> >>>> Ah, got it!  Does the following patch help?
> >>>>
> >>>> There will likely also need to be changes to cleanup_srcu_struct(),
> >>>> but first let's see if I understand the problem.  ;-)
> >>>>
> >>>> 							Thanx, Paul
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>>
> >>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >>>> index c13348e..6f7880a 100644
> >>>> --- a/kernel/rcu/srcutree.c
> >>>> +++ b/kernel/rcu/srcutree.c
> >>>> @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >>>>  	INIT_DELAYED_WORK(&ssp->work, process_srcu);
> >>>>  	if (!is_static)
> >>>>  		ssp->sda = alloc_percpu(struct srcu_data);
> >>>> +	if (!ssp->sda)
> >>>> +		return -ENOMEM;
> >>>>  	init_srcu_struct_nodes(ssp, is_static);
> >>>>  	ssp->srcu_gp_seq_needed_exp = 0;
> >>>>  	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> >>>>  	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
> >>>
> >>> The line above confuses me a bit. What you propose returns without
> >>> smp_store_release() called which should not matter I suppose.
> >>
> >> The idea is that if init_srcu_struct() returns -ENOMEM, the structure
> >> has not been initialized and had better not be used.  If the calling code
> >> cannot handle that outcome, then the calling code needs to do something
> >> to insulate init_srcu_struct() from signals.  One thing that it could
> >> do would be to invoke init_srcu_struct() from a workqueue handler and
> >> wait for this handler to complete.
> >>
> >> Please keep in mind that there is nothing init_srcu_struct() can do
> >> about this:  The srcu_struct is useless unless alloc_percpu() succeeds.
> >>
> >> And yes, I do need to update the header comments to make this clear.
> >>
> >>> Otherwise it should work, although I cannot verify right now as my box
> >>> went down and since it is across Pacific - it may take time to power
> >>> cycle it :) Thanks,
> >>
> >> I know that feeling!  And here is hoping that the box is out of reach
> >> of the local hot spots.  ;-)
> > 
> > Just following up...  Did that patch help?
> 
> Yes it did.
> 
> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Thank you, and I will apply on the next rebase.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-09-22 22:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-08 14:43 [PATCH kernel] srcu: Fix static initialization Alexey Kardashevskiy
2020-09-09  9:24 ` Alexey Kardashevskiy
2020-09-09 11:50   ` Paul E. McKenney
2020-09-09 12:31     ` Alexey Kardashevskiy
2020-09-10 18:53       ` Paul E. McKenney
2020-09-11  5:09         ` Alexey Kardashevskiy
2020-09-11 13:52           ` Paul E. McKenney
2020-09-16 16:12             ` Paul E. McKenney
2020-09-22  0:41               ` Alexey Kardashevskiy
2020-09-22 22:00                 ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).