* [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
@ 2022-08-19 16:32 Takashi Iwai
  2022-08-20 18:40 ` Greg Kroah-Hartman
  2022-08-24  9:50 ` Thorsten Leemhuis
  0 siblings, 2 replies; 10+ messages in thread
From: Takashi Iwai @ 2022-08-19 16:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Linyu Yuan, linux-usb, linux-kernel

Hi,

we've got multiple reports of the 5.19 kernel starting to crash after
some time, and this turned out to be triggered by the ucsi_acpi driver.
The details can be found in:
  https://bugzilla.suse.com/show_bug.cgi?id=1202386

The culprit seems to be commit 87d0e2f41b8c
    usb: typec: ucsi: add a common function ucsi_unregister_connectors()
    
This commit looks like a harmless cleanup, but it fails in a subtle
way.  Namely, in the error scenario, the driver gets an error at
ucsi_register_altmodes() and goes to the error handling to release
the resources.  With this refactoring, the release part was unified
into a function, ucsi_unregister_connectors().  That function has a
NULL check of con->wq, and it bails out of the loop if it's NULL.
Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
and clears con->wq in its error path.  This ends up leaving behind a
power supply device with an uninitialized / cleared device.
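
To make the interaction easier to follow, here is a minimal userspace
model of it.  This is only a sketch based on the description above, not
the driver code: struct model_con, dummy_wq and all model_* functions
are hypothetical stand-ins, and the "power supply" is reduced to a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PORTS 3

/* Hypothetical stand-in for the connector: only the state needed here. */
struct model_con {
	void *wq;		/* non-NULL while the "workqueue" exists */
	bool psy_registered;	/* "power supply" registered for this port */
};

static int dummy_wq;	/* its address plays the role of a workqueue */

/*
 * Models the registration of one port.  When fail_late is set, the
 * error path clears wq but the power supply stays registered, as the
 * report describes for a failure at the altmode registration step.
 */
static int model_register_port(struct model_con *con, bool fail_late)
{
	con->wq = &dummy_wq;
	con->psy_registered = true;
	if (fail_late) {
		con->wq = NULL;	/* destroy the workqueue and clear the pointer */
		return -1;
	}
	return 0;
}

/* Models the unified teardown: it stops at the first connector without wq. */
static void model_unregister_connectors(struct model_con *cons, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!cons[i].wq)
			break;
		cons[i].psy_registered = false;
		cons[i].wq = NULL;
	}
}

/* Counts the ports whose power supply is still registered. */
static size_t model_count_leftover_psy(const struct model_con *cons, size_t n)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++)
		if (cons[i].psy_registered)
			left++;
	return left;
}

/*
 * Registers the ports in order, runs the unified teardown on the first
 * failure, and returns how many power supplies are left behind.
 */
size_t model_leftover_after_failed_init(size_t failing_index)
{
	struct model_con cons[NUM_PORTS] = {0};

	assert(failing_index < NUM_PORTS);
	for (size_t i = 0; i < NUM_PORTS; i++) {
		if (model_register_port(&cons[i], i == failing_index)) {
			model_unregister_connectors(cons, NUM_PORTS);
			break;
		}
	}
	return model_count_leftover_psy(cons, NUM_PORTS);
}
```

In this model, whichever port fails, the teardown loop stops exactly at
that port because its wq was already cleared, so its power supply flag
is never reset.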

It was confirmed that the problem could be avoided by a simple
revert.

I guess another fix could be removing the part clearing con->wq, i.e.

--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
 out_unlock:
 	mutex_unlock(&con->lock);
 
-	if (ret && con->wq) {
-		destroy_workqueue(con->wq);
-		con->wq = NULL;
-	}
-
 	return ret;
 }
 

... but it's totally untested and I'm not entirely sure whether it's
better.


thanks,

Takashi


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-19 16:32 [REGRESSION 5.19] NULL dereference by ucsi_acpi driver Takashi Iwai
@ 2022-08-20 18:40 ` Greg Kroah-Hartman
  2022-08-22  2:44   ` Linyu Yuan
  2022-08-22 13:24   ` Heikki Krogerus
  2022-08-24  9:50 ` Thorsten Leemhuis
  1 sibling, 2 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-20 18:40 UTC (permalink / raw)
  To: Heikki Krogerus, Takashi Iwai; +Cc: Linyu Yuan, linux-usb, linux-kernel

On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> Hi,
> 
> we've got multiple reports about 5.19 kernel starting crashing after
> some time, and this turned out to be triggered by ucsi_acpi driver.
> The details are found in:
>   https://bugzilla.suse.com/show_bug.cgi?id=1202386
> 
> The culprit seems to be the commit 87d0e2f41b8c
>     usb: typec: ucsi: add a common function ucsi_unregister_connectors()

Adding Heikki to the thread...

>     
> This commit looks as if it were a harmless cleanup, but this failed in
> a subtle way.  Namely, in the error scenario, the driver gets an error
> at ucsi_register_altmodes(), and goes to the error handling to release
> the resources.  Through this refactoring, the release part was unified
> to a function ucsi_unregister_connectors().  And there, it has a NULL
> check of con->wq, and it bails out the loop if it's NULL. 
> Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> and clear con->wq at its error path.  This ended up in the leftover
> power supply device with the uninitialized / cleared device.
> 
> It was confirmed that the problem could be avoided by a simple
> revert.

I'll be glad to revert this now, unless Heikki thinks:

> 
> I guess another fix could be removing the part clearing con->wq, i.e.
> 
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
>  out_unlock:
>  	mutex_unlock(&con->lock);
>  
> -	if (ret && con->wq) {
> -		destroy_workqueue(con->wq);
> -		con->wq = NULL;
> -	}
> -
>  	return ret;
>  }
>  
> 
> ... but it's totally untested and I'm not entirely sure whether it's
> better.

that is any better?

thanks,

greg k-h


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-20 18:40 ` Greg Kroah-Hartman
@ 2022-08-22  2:44   ` Linyu Yuan
  2022-08-30 12:51     ` Greg Kroah-Hartman
  2022-08-22 13:24   ` Heikki Krogerus
  1 sibling, 1 reply; 10+ messages in thread
From: Linyu Yuan @ 2022-08-22  2:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Heikki Krogerus, Takashi Iwai; +Cc: linux-usb, linux-kernel


On 8/21/2022 2:40 AM, Greg Kroah-Hartman wrote:
> On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
>> Hi,
>>
>> we've got multiple reports about 5.19 kernel starting crashing after
>> some time, and this turned out to be triggered by ucsi_acpi driver.
>> The details are found in:
>>    https://bugzilla.suse.com/show_bug.cgi?id=1202386
>>
>> The culprit seems to be the commit 87d0e2f41b8c
>>      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> Adding Heikki to the thread...
>
>>      
>> This commit looks as if it were a harmless cleanup, but this failed in
>> a subtle way.  Namely, in the error scenario, the driver gets an error
>> at ucsi_register_altmodes(), and goes to the error handling to release
>> the resources.  Through this refactoring, the release part was unified
>> to a function ucsi_unregister_connectors().  And there, it has a NULL
>> check of con->wq, and it bails out the loop if it's NULL.
>> Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
>> and clear con->wq at its error path.  This ended up in the leftover
>> power supply device with the uninitialized / cleared device.
>>
>> It was confirmed that the problem could be avoided by a simple
>> revert.
> I'll be glad to revert this now, unless Heikki thinks:
>
>> I guess another fix could be removing the part clearing con->wq, i.e.
>>
>> --- a/drivers/usb/typec/ucsi/ucsi.c
>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>> @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
>>   out_unlock:
>>   	mutex_unlock(&con->lock);
>>   
>> -	if (ret && con->wq) {
>> -		destroy_workqueue(con->wq);
>> -		con->wq = NULL;
>> -	}
>> -
>>   	return ret;
>>   }
>>   
>>
>> ... but it's totally untested and I'm not entirely sure whether it's
>> better.

That part is original code, yes. But with the change you mentioned,
ucsi_unregister_connectors() just uses con->wq to tell which connectors
were initialized previously, so if we clear it in ucsi_register_port(),
something will indeed be left uncleaned.

Please send a patch to fix it.

I think your change is good.

> that is any better?
>
> thanks,
>
> greg k-h


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-20 18:40 ` Greg Kroah-Hartman
  2022-08-22  2:44   ` Linyu Yuan
@ 2022-08-22 13:24   ` Heikki Krogerus
  2022-08-23  2:26     ` Linyu Yuan
  1 sibling, 1 reply; 10+ messages in thread
From: Heikki Krogerus @ 2022-08-22 13:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Takashi Iwai, Linyu Yuan, linux-usb, linux-kernel

Hi,

On Sat, Aug 20, 2022 at 08:40:52PM +0200, Greg Kroah-Hartman wrote:
> On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> > Hi,
> > 
> > we've got multiple reports about 5.19 kernel starting crashing after
> > some time, and this turned out to be triggered by ucsi_acpi driver.
> > The details are found in:
> >   https://bugzilla.suse.com/show_bug.cgi?id=1202386
> > 
> > The culprit seems to be the commit 87d0e2f41b8c
> >     usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> 
> Adding Heikki to the thread...
> 
> >     
> > This commit looks as if it were a harmless cleanup, but this failed in
> > a subtle way.  Namely, in the error scenario, the driver gets an error
> > at ucsi_register_altmodes(), and goes to the error handling to release
> > the resources.  Through this refactoring, the release part was unified
> > to a function ucsi_unregister_connectors().  And there, it has a NULL
> > check of con->wq, and it bails out the loop if it's NULL. 
> > Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> > and clear con->wq at its error path.  This ended up in the leftover
> > power supply device with the uninitialized / cleared device.
> > 
> > It was confirmed that the problem could be avoided by a simple
> > revert.
> 
> I'll be glad to revert this now, unless Heikki thinks:
> 
> > 
> > I guess another fix could be removing the part clearing con->wq, i.e.
> > 
> > --- a/drivers/usb/typec/ucsi/ucsi.c
> > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> >  out_unlock:
> >  	mutex_unlock(&con->lock);
> >  
> > -	if (ret && con->wq) {
> > -		destroy_workqueue(con->wq);
> > -		con->wq = NULL;
> > -	}
> > -
> >  	return ret;
> >  }
> >  
> > 
> > ... but it's totally untested and I'm not entirely sure whether it's
> > better.
> 
> that is any better?

No, I don't think that's better. Right now I would prefer that we play
it safe and revert.

The conditions are different in the two places where the ports are
unregistered in this driver. Therefore I don't think it makes sense
to use a function like ucsi_unregister_connectors() that tries to
cover both cases. It will always be a little bit fragile.

Instead we could introduce a function that can be used to remove a
single port. That would leave the handling of the conditions to the
callers of the function, but it would still remove the boilerplate.
That would be much safer IMO.
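
As a rough illustration of that idea (a userspace sketch only, not code
from the driver or from any patch in this thread): struct sketch_con
and every sketch_* function below are hypothetical names.  One helper
releases a single connector, and each caller decides which connectors
to pass to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connector state. */
struct sketch_con {
	void *wq;		/* "workqueue", may already be NULL */
	void *port;		/* "typec port", NULL if never registered */
	bool psy_registered;	/* "power supply" registered */
};

/*
 * Releases everything one connector may still hold.  It makes no
 * assumption about which fields are set, so it can also be used on a
 * partially initialized connector.
 */
static void sketch_unregister_port(struct sketch_con *con)
{
	assert(con != NULL);
	con->psy_registered = false;
	con->wq = NULL;
	con->port = NULL;
}

/*
 * Caller 1: error path during init.  The caller knows how many ports
 * it attempted to register (including the one that failed) and cleans
 * up exactly those.
 */
void sketch_cleanup_after_failed_init(struct sketch_con *cons, size_t attempted)
{
	for (size_t i = 0; i < attempted; i++)
		sketch_unregister_port(&cons[i]);
}

/* Caller 2: normal removal.  Only connectors with a port are touched. */
void sketch_remove_all(struct sketch_con *cons, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!cons[i].port)
			continue;
		sketch_unregister_port(&cons[i]);
	}
}

/* Counts the connectors whose power supply is still registered. */
size_t sketch_count_leftover_psy(const struct sketch_con *cons, size_t n)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++)
		if (cons[i].psy_registered)
			left++;
	return left;
}
```

The point of the split is that the loop conditions live in the two
callers, so neither path has to guess the other's state from a single
field such as wq.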

But to fix this problem, I think we should revert.

thanks,

-- 
heikki


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-22 13:24   ` Heikki Krogerus
@ 2022-08-23  2:26     ` Linyu Yuan
  2022-08-23  6:41       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 10+ messages in thread
From: Linyu Yuan @ 2022-08-23  2:26 UTC (permalink / raw)
  To: Heikki Krogerus, Greg Kroah-Hartman; +Cc: Takashi Iwai, linux-usb, linux-kernel


On 8/22/2022 9:24 PM, Heikki Krogerus wrote:
> Hi,
>
> On Sat, Aug 20, 2022 at 08:40:52PM +0200, Greg Kroah-Hartman wrote:
>> On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
>>> Hi,
>>>
>>> we've got multiple reports about 5.19 kernel starting crashing after
>>> some time, and this turned out to be triggered by ucsi_acpi driver.
>>> The details are found in:
>>>    https://bugzilla.suse.com/show_bug.cgi?id=1202386
>>>
>>> The culprit seems to be the commit 87d0e2f41b8c
>>>      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
>> Adding Heikki to the thread...
>>
>>>      
>>> This commit looks as if it were a harmless cleanup, but this failed in
>>> a subtle way.  Namely, in the error scenario, the driver gets an error
>>> at ucsi_register_altmodes(), and goes to the error handling to release
>>> the resources.  Through this refactoring, the release part was unified
>>> to a function ucsi_unregister_connectors().  And there, it has a NULL
>>> check of con->wq, and it bails out the loop if it's NULL.
>>> Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
>>> and clear con->wq at its error path.  This ended up in the leftover
>>> power supply device with the uninitialized / cleared device.
>>>
>>> It was confirmed that the problem could be avoided by a simple
>>> revert.
>> I'll be glad to revert this now, unless Heikki thinks:
>>
>>> I guess another fix could be removing the part clearing con->wq, i.e.
>>>
>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>> @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
>>>   out_unlock:
>>>   	mutex_unlock(&con->lock);
>>>   
>>> -	if (ret && con->wq) {
>>> -		destroy_workqueue(con->wq);
>>> -		con->wq = NULL;
>>> -	}
>>> -
>>>   	return ret;
>>>   }
>>>   
>>>
>>> ... but it's totally untested and I'm not entirely sure whether it's
>>> better.
>> that is any better?
> No, I don't think that's better. Right now I would prefer that we play
> it safe and revert.
>
> The conditions are different in the two places where the ports are
> unregistered in this driver. Therefore I don't think it makes sense
> to use a function like ucsi_unregister_connectors() that tries to
> cover both cases. It will always be a little bit fragile.
>
> Instead we could introduce a function that can be used to remove a
> single port. That would leave the handling of the conditions to the
> callers of the function, but it would still remove the boilerplate.
> That would be much safer IMO.
>
> But to fix this problem, I think we should revert.

But the revert will happen on several stable branches, right?

I think the simple fix is good; from my point of view, creating a
function for a single port makes no big difference.


>
> thanks,
>


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-23  2:26     ` Linyu Yuan
@ 2022-08-23  6:41       ` Greg Kroah-Hartman
  2022-08-23  6:52         ` Takashi Iwai
  0 siblings, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-23  6:41 UTC (permalink / raw)
  To: Linyu Yuan; +Cc: Heikki Krogerus, Takashi Iwai, linux-usb, linux-kernel

On Tue, Aug 23, 2022 at 10:26:59AM +0800, Linyu Yuan wrote:
> 
> On 8/22/2022 9:24 PM, Heikki Krogerus wrote:
> > Hi,
> > 
> > On Sat, Aug 20, 2022 at 08:40:52PM +0200, Greg Kroah-Hartman wrote:
> > > On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> > > > Hi,
> > > > 
> > > > we've got multiple reports about 5.19 kernel starting crashing after
> > > > some time, and this turned out to be triggered by ucsi_acpi driver.
> > > > The details are found in:
> > > >    https://bugzilla.suse.com/show_bug.cgi?id=1202386
> > > > 
> > > > The culprit seems to be the commit 87d0e2f41b8c
> > > >      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> > > Adding Heikki to the thread...
> > > 
> > > > This commit looks as if it were a harmless cleanup, but this failed in
> > > > a subtle way.  Namely, in the error scenario, the driver gets an error
> > > > at ucsi_register_altmodes(), and goes to the error handling to release
> > > > the resources.  Through this refactoring, the release part was unified
> > > > to a function ucsi_unregister_connectors().  And there, it has a NULL
> > > > check of con->wq, and it bails out the loop if it's NULL.
> > > > Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> > > > and clear con->wq at its error path.  This ended up in the leftover
> > > > power supply device with the uninitialized / cleared device.
> > > > 
> > > > It was confirmed that the problem could be avoided by a simple
> > > > revert.
> > > I'll be glad to revert this now, unless Heikki thinks:
> > > 
> > > > I guess another fix could be removing the part clearing con->wq, i.e.
> > > > 
> > > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > > @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> > > >   out_unlock:
> > > >   	mutex_unlock(&con->lock);
> > > > -	if (ret && con->wq) {
> > > > -		destroy_workqueue(con->wq);
> > > > -		con->wq = NULL;
> > > > -	}
> > > > -
> > > >   	return ret;
> > > >   }
> > > > 
> > > > ... but it's totally untested and I'm not entirely sure whether it's
> > > > better.
> > > that is any better?
> > No, I don't think that's better. Right now I would prefer that we play
> > it safe and revert.
> > 
> > The conditions are different in the two places where the ports are
> > unregistered in this driver. Therefore I don't think it makes sense
> > to use a function like ucsi_unregister_connectors() that tries to
> > cover both cases. It will always be a little bit fragile.
> > 
> > Instead we could introduce a function that can be used to remove a
> > single port. That would leave the handling of the conditions to the
> > callers of the function, but it would still remove the boilerplate.
> > That would be much safer IMO.
> > 
> > But to fix this problem, I think we should revert.
> 
> but revert will happen on several stable branch, right ?

If someone sends it to me, yes :)

{hint}



* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-23  6:41       ` Greg Kroah-Hartman
@ 2022-08-23  6:52         ` Takashi Iwai
  0 siblings, 0 replies; 10+ messages in thread
From: Takashi Iwai @ 2022-08-23  6:52 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linyu Yuan, Heikki Krogerus, Takashi Iwai, linux-usb, linux-kernel

On Tue, 23 Aug 2022 08:41:00 +0200,
Greg Kroah-Hartman wrote:
> 
> On Tue, Aug 23, 2022 at 10:26:59AM +0800, Linyu Yuan wrote:
> > 
> > On 8/22/2022 9:24 PM, Heikki Krogerus wrote:
> > > Hi,
> > > 
> > > On Sat, Aug 20, 2022 at 08:40:52PM +0200, Greg Kroah-Hartman wrote:
> > > > On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> > > > > Hi,
> > > > > 
> > > > > we've got multiple reports about 5.19 kernel starting crashing after
> > > > > some time, and this turned out to be triggered by ucsi_acpi driver.
> > > > > The details are found in:
> > > > >    https://bugzilla.suse.com/show_bug.cgi?id=1202386
> > > > > 
> > > > > The culprit seems to be the commit 87d0e2f41b8c
> > > > >      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> > > > Adding Heikki to the thread...
> > > > 
> > > > > This commit looks as if it were a harmless cleanup, but this failed in
> > > > > a subtle way.  Namely, in the error scenario, the driver gets an error
> > > > > at ucsi_register_altmodes(), and goes to the error handling to release
> > > > > the resources.  Through this refactoring, the release part was unified
> > > > > to a function ucsi_unregister_connectors().  And there, it has a NULL
> > > > > check of con->wq, and it bails out the loop if it's NULL.
> > > > > Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> > > > > and clear con->wq at its error path.  This ended up in the leftover
> > > > > power supply device with the uninitialized / cleared device.
> > > > > 
> > > > > It was confirmed that the problem could be avoided by a simple
> > > > > revert.
> > > > I'll be glad to revert this now, unless Heikki thinks:
> > > > 
> > > > > I guess another fix could be removing the part clearing con->wq, i.e.
> > > > > 
> > > > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > > > @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> > > > >   out_unlock:
> > > > >   	mutex_unlock(&con->lock);
> > > > > -	if (ret && con->wq) {
> > > > > -		destroy_workqueue(con->wq);
> > > > > -		con->wq = NULL;
> > > > > -	}
> > > > > -
> > > > >   	return ret;
> > > > >   }
> > > > > 
> > > > > ... but it's totally untested and I'm not entirely sure whether it's
> > > > > better.
> > > > that is any better?
> > > No, I don't think that's better. Right now I would prefer that we play
> > > it safe and revert.
> > > 
> > > The conditions are different in the two places where the ports are
> > > unregistered in this driver. Therefore I don't think it makes sense
> > > to use a function like ucsi_unregister_connectors() that tries to
> > > cover both cases. It will always be a little bit fragile.
> > > 
> > > Instead we could introduce a function that can be used to remove a
> > > single port. That would leave the handling of the conditions to the
> > > callers of the function, but it would still remove the boilerplate.
> > > That would be much safer IMO.
> > > 
> > > But to fix this problem, I think we should revert.
> > 
> > but revert will happen on several stable branch, right ?
> 
> If someone sends it to me, yes :)
> 
> {hint}

OK, will submit :)


Takashi


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-19 16:32 [REGRESSION 5.19] NULL dereference by ucsi_acpi driver Takashi Iwai
  2022-08-20 18:40 ` Greg Kroah-Hartman
@ 2022-08-24  9:50 ` Thorsten Leemhuis
  1 sibling, 0 replies; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-08-24  9:50 UTC (permalink / raw)
  To: regressions; +Cc: linux-usb, linux-kernel

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 19.08.22 18:32, Takashi Iwai wrote:
> Hi,
> 
> we've got multiple reports about 5.19 kernel starting crashing after
> some time, and this turned out to be triggered by ucsi_acpi driver.
> The details are found in:
>   https://bugzilla.suse.com/show_bug.cgi?id=1202386
> 
> The culprit seems to be the commit 87d0e2f41b8c
>     usb: typec: ucsi: add a common function ucsi_unregister_connectors()
>     
> This commit looks as if it were a harmless cleanup, but this failed in
> a subtle way.  Namely, in the error scenario, the driver gets an error
> at ucsi_register_altmodes(), and goes to the error handling to release
> the resources.  Through this refactoring, the release part was unified
> to a function ucsi_unregister_connectors().  And there, it has a NULL
> check of con->wq, and it bails out the loop if it's NULL. 
> Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> and clear con->wq at its error path.  This ended up in the leftover
> power supply device with the uninitialized / cleared device.
> 
> It was confirmed that the problem could be avoided by a simple
> revert.
> 
> I guess another fix could be removing the part clearing con->wq, i.e.
> 
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
>  out_unlock:
>  	mutex_unlock(&con->lock);
>  
> -	if (ret && con->wq) {
> -		destroy_workqueue(con->wq);
> -		con->wq = NULL;
> -	}
> -
>  	return ret;
>  }
>  
> 
> ... but it's totally untested and I'm not entirely sure whether it's
> better.

Thanks for the report. To be sure the issue below doesn't fall through
the cracks unnoticed, I'm adding it to regzbot, my Linux kernel
regression tracking bot:

#regzbot introduced 87d0e2f41b8c ^
https://bugzilla.suse.com/show_bug.cgi?id=1202386
#regzbot title NULL dereference by ucsi_acpi driver
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the webpage above explains why
this is important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-22  2:44   ` Linyu Yuan
@ 2022-08-30 12:51     ` Greg Kroah-Hartman
  2022-08-30 12:53       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-30 12:51 UTC (permalink / raw)
  To: Linyu Yuan; +Cc: Heikki Krogerus, Takashi Iwai, linux-usb, linux-kernel

On Mon, Aug 22, 2022 at 10:44:26AM +0800, Linyu Yuan wrote:
> 
> On 8/21/2022 2:40 AM, Greg Kroah-Hartman wrote:
> > On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> > > Hi,
> > > 
> > > we've got multiple reports about 5.19 kernel starting crashing after
> > > some time, and this turned out to be triggered by ucsi_acpi driver.
> > > The details are found in:
> > >    https://bugzilla.suse.com/show_bug.cgi?id=1202386
> > > 
> > > The culprit seems to be the commit 87d0e2f41b8c
> > >      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> > Adding Heikki to the thread...
> > 
> > > This commit looks as if it were a harmless cleanup, but this failed in
> > > a subtle way.  Namely, in the error scenario, the driver gets an error
> > > at ucsi_register_altmodes(), and goes to the error handling to release
> > > the resources.  Through this refactoring, the release part was unified
> > > to a function ucsi_unregister_connectors().  And there, it has a NULL
> > > check of con->wq, and it bails out the loop if it's NULL.
> > > Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> > > and clear con->wq at its error path.  This ended up in the leftover
> > > power supply device with the uninitialized / cleared device.
> > > 
> > > It was confirmed that the problem could be avoided by a simple
> > > revert.
> > I'll be glad to revert this now, unless Heikki thinks:
> > 
> > > I guess another fix could be removing the part clearing con->wq, i.e.
> > > 
> > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> > >   out_unlock:
> > >   	mutex_unlock(&con->lock);
> > > -	if (ret && con->wq) {
> > > -		destroy_workqueue(con->wq);
> > > -		con->wq = NULL;
> > > -	}
> > > -
> > >   	return ret;
> > >   }
> > > 
> > > ... but it's totally untested and I'm not entirely sure whether it's
> > > better.
> 
> this part is original code, yes, but when I make the change you mentioned,
> 
> as in the function ucsi_unregister_connectors(),  just use con->wq to
> represent which connector initialized previous,
> 
> indeed if we clear it in ucsi_register_port(), something will left unclear.
> 
> please send a patch to fix it.
> 
> I think your change is good.

Ok, can someone send me a patch to apply to the tree for this?

thanks,

greg k-h


* Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver
  2022-08-30 12:51     ` Greg Kroah-Hartman
@ 2022-08-30 12:53       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-30 12:53 UTC (permalink / raw)
  To: Linyu Yuan; +Cc: Heikki Krogerus, Takashi Iwai, linux-usb, linux-kernel

On Tue, Aug 30, 2022 at 02:51:54PM +0200, Greg Kroah-Hartman wrote:
> On Mon, Aug 22, 2022 at 10:44:26AM +0800, Linyu Yuan wrote:
> > 
> > On 8/21/2022 2:40 AM, Greg Kroah-Hartman wrote:
> > > On Fri, Aug 19, 2022 at 06:32:43PM +0200, Takashi Iwai wrote:
> > > > Hi,
> > > > 
> > > > we've got multiple reports about 5.19 kernel starting crashing after
> > > > some time, and this turned out to be triggered by ucsi_acpi driver.
> > > > The details are found in:
> > > >    https://bugzilla.suse.com/show_bug.cgi?id=1202386
> > > > 
> > > > The culprit seems to be the commit 87d0e2f41b8c
> > > >      usb: typec: ucsi: add a common function ucsi_unregister_connectors()
> > > Adding Heikki to the thread...
> > > 
> > > > This commit looks as if it were a harmless cleanup, but this failed in
> > > > a subtle way.  Namely, in the error scenario, the driver gets an error
> > > > at ucsi_register_altmodes(), and goes to the error handling to release
> > > > the resources.  Through this refactoring, the release part was unified
> > > > to a function ucsi_unregister_connectors().  And there, it has a NULL
> > > > check of con->wq, and it bails out the loop if it's NULL.
> > > > Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> > > > and clear con->wq at its error path.  This ended up in the leftover
> > > > power supply device with the uninitialized / cleared device.
> > > > 
> > > > It was confirmed that the problem could be avoided by a simple
> > > > revert.
> > > I'll be glad to revert this now, unless Heikki thinks:
> > > 
> > > > I guess another fix could be removing the part clearing con->wq, i.e.
> > > > 
> > > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > > @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> > > >   out_unlock:
> > > >   	mutex_unlock(&con->lock);
> > > > -	if (ret && con->wq) {
> > > > -		destroy_workqueue(con->wq);
> > > > -		con->wq = NULL;
> > > > -	}
> > > > -
> > > >   	return ret;
> > > >   }
> > > > 
> > > > ... but it's totally untested and I'm not entirely sure whether it's
> > > > better.
> > 
> > this part is original code, yes, but when I make the change you mentioned,
> > 
> > as in the function ucsi_unregister_connectors(),  just use con->wq to
> > represent which connector initialized previous,
> > 
> > indeed if we clear it in ucsi_register_port(), something will left unclear.
> > 
> > please send a patch to fix it.
> > 
> > I think your change is good.
> 
> Ok, can someone send me a patch to apply to the tree for this?

Oh, never mind, I already have the revert in my tree. Sorry for the noise.

greg k-h

