All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] goku_udc: Add check for NULL in goku_irq
@ 2023-02-03 10:18 Anastasia Belova
  2023-02-03 10:45 ` Greg Kroah-Hartman
  2023-02-03 14:43 ` Alan Stern
  0 siblings, 2 replies; 11+ messages in thread
From: Anastasia Belova @ 2023-02-03 10:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Anastasia Belova, Jakob Koschel, linux-usb, linux-kernel, lvc-project

Before dereferencing dev->driver check it for NULL.

If an interrupt handler is called after assigning
NULL to dev->driver, but before resetting dev->int_enable,
NULL-pointer will be dereferenced.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
---
 drivers/usb/gadget/udc/goku_udc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
index bdc56b24b5c9..896bba8b47f1 100644
--- a/drivers/usb/gadget/udc/goku_udc.c
+++ b/drivers/usb/gadget/udc/goku_udc.c
@@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
 pm_next:
 		if (stat & INT_USBRESET) {		/* hub reset done */
 			ACK(INT_USBRESET);
-			INFO(dev, "USB reset done, gadget %s\n",
-				dev->driver->driver.name);
+			if (dev->driver)
+				INFO(dev, "USB reset done, gadget %s\n",
+					dev->driver->driver.name);
 		}
 		// and INT_ERR on some endpoint's crc/bitstuff/... problem
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-02-03 10:18 [PATCH] goku_udc: Add check for NULL in goku_irq Anastasia Belova
@ 2023-02-03 10:45 ` Greg Kroah-Hartman
  2023-02-15 13:39   ` Анастасия Белова
  2023-02-03 14:43 ` Alan Stern
  1 sibling, 1 reply; 11+ messages in thread
From: Greg Kroah-Hartman @ 2023-02-03 10:45 UTC (permalink / raw)
  To: Anastasia Belova; +Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project

On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
> Before dereferencing dev->driver check it for NULL.
> 
> If an interrupt handler is called after assigning
> NULL to dev->driver, but before resetting dev->int_enable,
> NULL-pointer will be dereferenced.
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
> ---
>  drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
> index bdc56b24b5c9..896bba8b47f1 100644
> --- a/drivers/usb/gadget/udc/goku_udc.c
> +++ b/drivers/usb/gadget/udc/goku_udc.c
> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>  pm_next:
>  		if (stat & INT_USBRESET) {		/* hub reset done */
>  			ACK(INT_USBRESET);
> -			INFO(dev, "USB reset done, gadget %s\n",
> -				dev->driver->driver.name);
> +			if (dev->driver)
> +				INFO(dev, "USB reset done, gadget %s\n",
> +					dev->driver->driver.name);

How can this ever happen?  Can you trigger this somehow?  If not, I
don't think this is going to be possible (also what's up with printk
from an irq handler???)

Odds are, no one actually has this hardware anymore, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-02-03 10:18 [PATCH] goku_udc: Add check for NULL in goku_irq Anastasia Belova
  2023-02-03 10:45 ` Greg Kroah-Hartman
@ 2023-02-03 14:43 ` Alan Stern
  1 sibling, 0 replies; 11+ messages in thread
From: Alan Stern @ 2023-02-03 14:43 UTC (permalink / raw)
  To: Anastasia Belova
  Cc: Greg Kroah-Hartman, Jakob Koschel, linux-usb, linux-kernel, lvc-project

On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
> Before dereferencing dev->driver check it for NULL.
> 
> If an interrupt handler is called after assigning
> NULL to dev->driver, but before resetting dev->int_enable,
> NULL-pointer will be dereferenced.

Wouldn't it be better just to set dev->driver to NULL _after_ resetting 
dev->int_enable?

Alan Stern

> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
> ---
>  drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
> index bdc56b24b5c9..896bba8b47f1 100644
> --- a/drivers/usb/gadget/udc/goku_udc.c
> +++ b/drivers/usb/gadget/udc/goku_udc.c
> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>  pm_next:
>  		if (stat & INT_USBRESET) {		/* hub reset done */
>  			ACK(INT_USBRESET);
> -			INFO(dev, "USB reset done, gadget %s\n",
> -				dev->driver->driver.name);
> +			if (dev->driver)
> +				INFO(dev, "USB reset done, gadget %s\n",
> +					dev->driver->driver.name);
>  		}
>  		// and INT_ERR on some endpoint's crc/bitstuff/... problem
>  	}
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-02-03 10:45 ` Greg Kroah-Hartman
@ 2023-02-15 13:39   ` Анастасия Белова
  2023-02-15 13:48     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 11+ messages in thread
From: Анастасия Белова @ 2023-02-15 13:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project


03.02.2023 13:45, Greg Kroah-Hartman пишет:
> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>> Before dereferencing dev->driver check it for NULL.
>>
>> If an interrupt handler is called after assigning
>> NULL to dev->driver, but before resetting dev->int_enable,
>> NULL-pointer will be dereferenced.
>>
>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>> ---
>>   drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
>> index bdc56b24b5c9..896bba8b47f1 100644
>> --- a/drivers/usb/gadget/udc/goku_udc.c
>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>>   pm_next:
>>   		if (stat & INT_USBRESET) {		/* hub reset done */
>>   			ACK(INT_USBRESET);
>> -			INFO(dev, "USB reset done, gadget %s\n",
>> -				dev->driver->driver.name);
>> +			if (dev->driver)
>> +				INFO(dev, "USB reset done, gadget %s\n",
>> +					dev->driver->driver.name);
> How can this ever happen?  Can you trigger this somehow?  If not, I
> don't think this is going to be possible (also what's up with printk
> from an irq handler???)

Unfortunately, I can't find the way to trigger this at the moment.

What about printk, should trace_printk be used instead?

>
> Odds are, no one actually has this hardware anymore, right?

Despite of this, such vulnerability should be fixed because

there is a possibility to exploit it.

> thanks,
>
> greg k-h

Best regards,

Anastasia Belova


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-02-15 13:39   ` Анастасия Белова
@ 2023-02-15 13:48     ` Greg Kroah-Hartman
  2023-03-11  3:29       ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 11+ messages in thread
From: Greg Kroah-Hartman @ 2023-02-15 13:48 UTC (permalink / raw)
  To: Анастасия
	Белова
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project

On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
> 
> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
> > On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
> > > Before dereferencing dev->driver check it for NULL.
> > > 
> > > If an interrupt handler is called after assigning
> > > NULL to dev->driver, but before resetting dev->int_enable,
> > > NULL-pointer will be dereferenced.
> > > 
> > > Found by Linux Verification Center (linuxtesting.org) with SVACE.
> > > 
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
> > > ---
> > >   drivers/usb/gadget/udc/goku_udc.c | 5 +++--
> > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
> > > index bdc56b24b5c9..896bba8b47f1 100644
> > > --- a/drivers/usb/gadget/udc/goku_udc.c
> > > +++ b/drivers/usb/gadget/udc/goku_udc.c
> > > @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
> > >   pm_next:
> > >   		if (stat & INT_USBRESET) {		/* hub reset done */
> > >   			ACK(INT_USBRESET);
> > > -			INFO(dev, "USB reset done, gadget %s\n",
> > > -				dev->driver->driver.name);
> > > +			if (dev->driver)
> > > +				INFO(dev, "USB reset done, gadget %s\n",
> > > +					dev->driver->driver.name);
> > How can this ever happen?  Can you trigger this somehow?  If not, I
> > don't think this is going to be possible (also what's up with printk
> > from an irq handler???)
> 
> Unfortunately, I can't find the way to trigger this at the moment.

Then the change should not be made.

> What about printk, should trace_printk be used instead?

Why?

> > Odds are, no one actually has this hardware anymore, right?
> 
> Despite of this, such vulnerability should be fixed because
> there is a possibility to exploit it.

How can this be "exploited" if it can not ever be triggered?

Also, this would cause a NULL dereference in an irq handler, how can you
"exploit" that?

Please only submit patches that actually do something.  It is getting
very hard to want to even review patches from this "project" based on
the recent submissions.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-02-15 13:48     ` Greg Kroah-Hartman
@ 2023-03-11  3:29       ` Mirsad Goran Todorovac
  2023-03-13 12:19         ` Anastasia Belova
  0 siblings, 1 reply; 11+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-11  3:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman,
	Анастасия
	Белова
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project

On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>
>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>> Before dereferencing dev->driver check it for NULL.
>>>>
>>>> If an interrupt handler is called after assigning
>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>> NULL-pointer will be dereferenced.
>>>>
>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>
>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>> ---
>>>>   drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>>>>   pm_next:
>>>>   		if (stat & INT_USBRESET) {		/* hub reset done */
>>>>   			ACK(INT_USBRESET);
>>>> -			INFO(dev, "USB reset done, gadget %s\n",
>>>> -				dev->driver->driver.name);
>>>> +			if (dev->driver)
>>>> +				INFO(dev, "USB reset done, gadget %s\n",
>>>> +					dev->driver->driver.name);
>>> How can this ever happen?  Can you trigger this somehow?  If not, I
>>> don't think this is going to be possible (also what's up with printk
>>> from an irq handler???)
>>
>> Unfortunately, I can't find the way to trigger this at the moment.
> 
> Then the change should not be made.
> 
>> What about printk, should trace_printk be used instead?
> 
> Why?
> 
>>> Odds are, no one actually has this hardware anymore, right?
>>
>> Despite of this, such vulnerability should be fixed because
>> there is a possibility to exploit it.
> 
> How can this be "exploited" if it can not ever be triggered?
> 
> Also, this would cause a NULL dereference in an irq handler, how can you
> "exploit" that?
> 
> Please only submit patches that actually do something.  It is getting
> very hard to want to even review patches from this "project" based on
> the recent submissions.
> 
> thanks,
> 
> greg k-h

Hi Greg, Anastasia,

Without any pros or cons, or taking sides, there appears to be a similar check
when using dev->driver->driver.name in

https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158

	seq_printf(m,
		   "%s - %s\n"
		   "%s version: %s %s\n"
		   "Gadget driver: %s\n"
		   "Host %s, %s\n"
		   "\n",
		   pci_name(dev->pdev), driver_desc,
		   driver_name, DRIVER_VERSION, dmastr(),
		   dev->driver ? dev->driver->driver.name : "(none)",
		   is_usb_connected
			   ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
			   : "disconnected",
		   udc_ep_state(dev->ep0state));

On the other hand, where could dev->drivre be reset without resetting dev->int_enable?

dev->driver = NULL appears here:

static int goku_udc_stop(struct usb_gadget *g)
{
	struct goku_udc	*dev = to_goku_udc(g);
	unsigned long	flags;

	spin_lock_irqsave(&dev->lock, flags);
	dev->driver = NULL;
	stop_activity(dev);
	spin_unlock_irqrestore(&dev->lock, flags);

	return 0;
}

it is followed by stop_activity() calling udc_reset():

static void udc_reset(struct goku_udc *dev)
{
	struct goku_udc_regs __iomem	*regs = dev->regs;

	writel(0, &regs->power_detect);
	writel(0, &regs->int_enable);
	readl(&regs->int_enable);
	dev->int_enable = 0;
.
.
.

... but this happens in between spin_lock_irqsave() and spin_unlock_irqsave(),
which appears like a correct way to do it.

But second appearance is here:

https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559

	spin_lock(&dev->lock);

rescan:
	stat = readl(&regs->int_status) & dev->int_enable;
        if (!stat)
		goto done;
	dev->irqs++;

	/* device-wide irqs */
	if (unlikely(stat & INT_DEVWIDE)) {
		if (stat & INT_SYSERROR) {
			ERROR(dev, "system error\n");
			stop_activity(dev);
			stat = 0;
			handled = 1;
			// FIXME have a neater way to prevent re-enumeration
			dev->driver = NULL;
			goto done;
		}

goto done leads to:

done:
	(void)readl(&regs->int_enable);
	spin_unlock(&dev->lock);

This unlocks dev->lock before setting dev->int_enable to zero, or calling writel(0, &regs->int_enable);
which could be problematic. Unless it called stop_activity(dev) four lines earlier. Which does
bot of:

	writel(0, &regs->int_enable);
	dev->int_enable = 0;

So, FWIW, we seem to be safe. Yet, there might be no harm in printing "(null)" rather
than having an NULL pointer dereference, it seems.

Yet, there is another unprotected dereference of dev->driver:

https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513

	spin_unlock (&dev->lock);
	tmp = dev->driver->setup(&dev->gadget, &ctrl);
	spin_lock (&dev->lock);

All others (in goku_udc.c, at least) have triple safeguards like:

				if (dev->gadget.speed != USB_SPEED_UNKNOWN
						&& dev->driver
						&& dev->driver->suspend) {
					spin_unlock(&dev->lock);
					dev->driver->suspend(&dev->gadget);
					spin_lock(&dev->lock);
				}

So the above should maybe put to:

	if (dev->driver && dev->driver->setup) {
		spin_unlock (&dev->lock);
		tmp = dev->driver->setup(&dev->gadget, &ctrl);
		spin_lock (&dev->lock);
	}

instead to be completely certain.

Forgive me for this uninspired rant. Thank you if you've read this far.
I hope this helps.

My $0.02.

Regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-03-11  3:29       ` Mirsad Goran Todorovac
@ 2023-03-13 12:19         ` Anastasia Belova
  2023-03-13 13:49           ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 11+ messages in thread
From: Anastasia Belova @ 2023-03-13 12:19 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project, Greg Kroah-Hartman


11.03.2023 06:29, Mirsad Goran Todorovac пишет:
> On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
>> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>>> Before dereferencing dev->driver check it for NULL.
>>>>>
>>>>> If an interrupt handler is called after assigning
>>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>>> NULL-pointer will be dereferenced.
>>>>>
>>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>>
>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>>> ---
>>>>>    drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
>>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>>>>>    pm_next:
>>>>>    		if (stat & INT_USBRESET) {		/* hub reset done */
>>>>>    			ACK(INT_USBRESET);
>>>>> -			INFO(dev, "USB reset done, gadget %s\n",
>>>>> -				dev->driver->driver.name);
>>>>> +			if (dev->driver)
>>>>> +				INFO(dev, "USB reset done, gadget %s\n",
>>>>> +					dev->driver->driver.name);
>>>> How can this ever happen?  Can you trigger this somehow?  If not, I
>>>> don't think this is going to be possible (also what's up with printk
>>>> from an irq handler???)
>>> Unfortunately, I can't find the way to trigger this at the moment.
>> Then the change should not be made.
>>
>>> What about printk, should trace_printk be used instead?
>> Why?
>>
>>>> Odds are, no one actually has this hardware anymore, right?
>>> Despite of this, such vulnerability should be fixed because
>>> there is a possibility to exploit it.
>> How can this be "exploited" if it can not ever be triggered?
>>
>> Also, this would cause a NULL dereference in an irq handler, how can you
>> "exploit" that?
>>
>> Please only submit patches that actually do something.  It is getting
>> very hard to want to even review patches from this "project" based on
>> the recent submissions.
>>
>> thanks,
>>
>> greg k-h
> Hi Greg, Anastasia,

Hi Misrad,

> Without any pros or cons, or taking sides, there appears to be a similar check
> when using dev->driver->driver.name in
>
> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158
>
> 	seq_printf(m,
> 		   "%s - %s\n"
> 		   "%s version: %s %s\n"
> 		   "Gadget driver: %s\n"
> 		   "Host %s, %s\n"
> 		   "\n",
> 		   pci_name(dev->pdev), driver_desc,
> 		   driver_name, DRIVER_VERSION, dmastr(),
> 		   dev->driver ? dev->driver->driver.name : "(none)",
> 		   is_usb_connected
> 			   ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
> 			   : "disconnected",
> 		   udc_ep_state(dev->ep0state));
>
> On the other hand, where could dev->drivre be reset without resetting dev->int_enable?
>
> dev->driver = NULL appears here:
>
> static int goku_udc_stop(struct usb_gadget *g)
> {
> 	struct goku_udc	*dev = to_goku_udc(g);
> 	unsigned long	flags;
>
> 	spin_lock_irqsave(&dev->lock, flags);
> 	dev->driver = NULL;
> 	stop_activity(dev);
> 	spin_unlock_irqrestore(&dev->lock, flags);
>
> 	return 0;
> }
>
> it is followed by stop_activity() calling udc_reset():
>
> static void udc_reset(struct goku_udc *dev)
> {
> 	struct goku_udc_regs __iomem	*regs = dev->regs;
>
> 	writel(0, &regs->power_detect);
> 	writel(0, &regs->int_enable);
> 	readl(&regs->int_enable);
> 	dev->int_enable = 0;
> .
> .
> .
>
> ... but this happens in between spin_lock_irqsave() and spin_unlock_irqsave(),
> which appears like a correct way to do it.

Are you sure that spin_lock_irqsave makes the code safe? This function 
disables interrupts on

local processor only (Linux Device Drivers, Third Edition). So it 
doesn't seem to be

absolutely safe on multiprocessor systems.

> But second appearance is here:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559
>
> 	spin_lock(&dev->lock);
>
> rescan:
> 	stat = readl(&regs->int_status) & dev->int_enable;
>          if (!stat)
> 		goto done;
> 	dev->irqs++;
>
> 	/* device-wide irqs */
> 	if (unlikely(stat & INT_DEVWIDE)) {
> 		if (stat & INT_SYSERROR) {
> 			ERROR(dev, "system error\n");
> 			stop_activity(dev);
> 			stat = 0;
> 			handled = 1;
> 			// FIXME have a neater way to prevent re-enumeration
> 			dev->driver = NULL;
> 			goto done;
> 		}
>
> goto done leads to:
>
> done:
> 	(void)readl(&regs->int_enable);
> 	spin_unlock(&dev->lock);
>
> This unlocks dev->lock before setting dev->int_enable to zero, or calling writel(0, &regs->int_enable);
> which could be problematic. Unless it called stop_activity(dev) four lines earlier. Which does
> bot of:
>
> 	writel(0, &regs->int_enable);
> 	dev->int_enable = 0;
>
> So, FWIW, we seem to be safe. Yet, there might be no harm in printing "(null)" rather
> than having an NULL pointer dereference, it seems.
>
> Yet, there is another unprotected dereference of dev->driver:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513
>
> 	spin_unlock (&dev->lock);
> 	tmp = dev->driver->setup(&dev->gadget, &ctrl);
> 	spin_lock (&dev->lock);
>
> All others (in goku_udc.c, at least) have triple safeguards like:
>
> 				if (dev->gadget.speed != USB_SPEED_UNKNOWN
> 						&& dev->driver
> 						&& dev->driver->suspend) {
> 					spin_unlock(&dev->lock);
> 					dev->driver->suspend(&dev->gadget);
> 					spin_lock(&dev->lock);
> 				}
>
> So the above should maybe put to:
>
> 	if (dev->driver && dev->driver->setup) {
> 		spin_unlock (&dev->lock);
> 		tmp = dev->driver->setup(&dev->gadget, &ctrl);
> 		spin_lock (&dev->lock);
> 	}
>
> instead to be completely certain.
>
> Forgive me for this uninspired rant. Thank you if you've read this far.
> I hope this helps.
>
> My $0.02.
>
> Regards,
> Mirsad
>
Thanks,

Anastasia


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-03-13 12:19         ` Anastasia Belova
@ 2023-03-13 13:49           ` Mirsad Goran Todorovac
  2023-03-15 14:26             ` Anastasia Belova
  0 siblings, 1 reply; 11+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-13 13:49 UTC (permalink / raw)
  To: Anastasia Belova
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project, Greg Kroah-Hartman

On 13.3.2023. 13:19, Anastasia Belova wrote:
> 
> 11.03.2023 06:29, Mirsad Goran Todorovac пишет:
>> On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
>>> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>>>> Before dereferencing dev->driver check it for NULL.
>>>>>>
>>>>>> If an interrupt handler is called after assigning
>>>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>>>> NULL-pointer will be dereferenced.
>>>>>>
>>>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>>>
>>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>>>> ---
>>>>>>    drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
>>>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>>>>>>    pm_next:
>>>>>>            if (stat & INT_USBRESET) {        /* hub reset done */
>>>>>>                ACK(INT_USBRESET);
>>>>>> -            INFO(dev, "USB reset done, gadget %s\n",
>>>>>> -                dev->driver->driver.name);
>>>>>> +            if (dev->driver)
>>>>>> +                INFO(dev, "USB reset done, gadget %s\n",
>>>>>> +                    dev->driver->driver.name);
>>>>> How can this ever happen?  Can you trigger this somehow?  If not, I
>>>>> don't think this is going to be possible (also what's up with printk
>>>>> from an irq handler???)
>>>> Unfortunately, I can't find the way to trigger this at the moment.
>>> Then the change should not be made.
>>>
>>>> What about printk, should trace_printk be used instead?
>>> Why?
>>>
>>>>> Odds are, no one actually has this hardware anymore, right?
>>>> Despite of this, such vulnerability should be fixed because
>>>> there is a possibility to exploit it.
>>> How can this be "exploited" if it can not ever be triggered?
>>>
>>> Also, this would cause a NULL dereference in an irq handler, how can you
>>> "exploit" that?
>>>
>>> Please only submit patches that actually do something.  It is getting
>>> very hard to want to even review patches from this "project" based on
>>> the recent submissions.
>>>
>>> thanks,
>>>
>>> greg k-h
>> Hi Greg, Anastasia,
> 
> Hi Misrad,
> 
>> Without any pros or cons, or taking sides, there appears to be a similar check
>> when using dev->driver->driver.name in
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158
>>
>>     seq_printf(m,
>>            "%s - %s\n"
>>            "%s version: %s %s\n"
>>            "Gadget driver: %s\n"
>>            "Host %s, %s\n"
>>            "\n",
>>            pci_name(dev->pdev), driver_desc,
>>            driver_name, DRIVER_VERSION, dmastr(),
>>            dev->driver ? dev->driver->driver.name : "(none)",
>>            is_usb_connected
>>                ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
>>                : "disconnected",
>>            udc_ep_state(dev->ep0state));
>>
>> On the other hand, where could dev->drivre be reset without resetting dev->int_enable?
>>
>> dev->driver = NULL appears here:
>>
>> static int goku_udc_stop(struct usb_gadget *g)
>> {
>>     struct goku_udc    *dev = to_goku_udc(g);
>>     unsigned long    flags;
>>
>>     spin_lock_irqsave(&dev->lock, flags);
>>     dev->driver = NULL;
>>     stop_activity(dev);
>>     spin_unlock_irqrestore(&dev->lock, flags);
>>
>>     return 0;
>> }
>>
>> it is followed by stop_activity() calling udc_reset():
>>
>> static void udc_reset(struct goku_udc *dev)
>> {
>>     struct goku_udc_regs __iomem    *regs = dev->regs;
>>
>>     writel(0, &regs->power_detect);
>>     writel(0, &regs->int_enable);
>>     readl(&regs->int_enable);
>>     dev->int_enable = 0;
>> .
>> .
>> .
>>
>> ... but this happens in between spin_lock_irqsave() and spin_unlock_irqsave(),
>> which appears like a correct way to do it.
> 
> Are you sure that spin_lock_irqsave makes the code safe? This function disables interrupts on
> local processor only (Linux Device Drivers, Third Edition). So it doesn't seem to be
> absolutely safe on multiprocessor systems.

Hi, Anastasia,

Looking from the Second Edition or the book and the source, I see that
spin_lock_irqsave() expands to:

static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
{
	unsigned long flags;

	local_irq_save(flags);
	preempt_disable();
	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
	return flags;
}

if the multiple threads on multiple cores/SMTs contend for the same lock,
that with preempt_disable() should assure mutual exclusion.

Can you please quote from the Third Edition of Linux Device Drivers where
it says otherwise?

BTW, please also consider reading this article:

https://docs.kernel.org/driver-api/io_ordering.html

I saw they were using this readl() after writel() for synchronisation, so
please see if this clears your doubts. It says "readl() should flush any pending
writes".

But I certainly see no harm in your proposal of guarding against NULL pointer
dereference of dev->driver, both in

 > -            INFO(dev, "USB reset done, gadget %s\n",
 > -                dev->driver->driver.name);
 > +            if (dev->driver)
 > +                INFO(dev, "USB reset done, gadget %s\n",
 > +                    dev->driver->driver.name);

or use the construct as the one in another line of the driver:

 > -            INFO(dev, "USB reset done, gadget %s\n",
 > -                dev->driver->driver.name);
 > +            INFO(dev, "USB reset done, gadget %s\n",
 > +                dev->driver ? dev->driver->driver.name : "(none)");

(This would IMHO enable detecting and logging when dev->driver unexpectedly becomes NULL
in a race condition, rather that just silently skipping and ignoring the situation.)

but additionally also in:

 > -     spin_unlock (&dev->lock);
 > -     tmp = dev->driver->setup(&dev->gadget, &ctrl);
 > -     spin_lock (&dev->lock);
 > +     if (dev->driver && dev->driver->setup) {
 > +         spin_unlock (&dev->lock);
 > +         tmp = dev->driver->setup(&dev->gadget, &ctrl);
 > +         spin_lock (&dev->lock);
 > +     }

for completeness and robustness sake.

The author agrees that the race conditions in the device drivers are very hard
to reproduce:

https://www.oreilly.com/library/view/linux-device-drivers/0596000081/ch09s08.html

But I am really not able to analyse all possible scenarios ATM.
Maybe some ideas come after getting some oxygen.

(This is not an authoritative answer on the matter, just an attempt on analysis.)

Regards,
Mirsad

>> But second appearance is here:
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559
>>
>>     spin_lock(&dev->lock);
>>
>> rescan:
>>     stat = readl(&regs->int_status) & dev->int_enable;
>>          if (!stat)
>>         goto done;
>>     dev->irqs++;
>>
>>     /* device-wide irqs */
>>     if (unlikely(stat & INT_DEVWIDE)) {
>>         if (stat & INT_SYSERROR) {
>>             ERROR(dev, "system error\n");
>>             stop_activity(dev);
>>             stat = 0;
>>             handled = 1;
>>             // FIXME have a neater way to prevent re-enumeration
>>             dev->driver = NULL;
>>             goto done;
>>         }
>>
>> goto done leads to:
>>
>> done:
>>     (void)readl(&regs->int_enable);
>>     spin_unlock(&dev->lock);
>>
>> This unlocks dev->lock before setting dev->int_enable to zero, or calling writel(0, &regs->int_enable);
>> which could be problematic. Unless it called stop_activity(dev) four lines earlier. Which does
>> bot of:
>>
>>     writel(0, &regs->int_enable);
>>     dev->int_enable = 0;
>>
>> So, FWIW, we seem to be safe. Yet, there might be no harm in printing "(null)" rather
>> than having an NULL pointer dereference, it seems.
>>
>> Yet, there is another unprotected dereference of dev->driver:
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513
>>
>>     spin_unlock (&dev->lock);
>>     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>     spin_lock (&dev->lock);
>>
>> All others (in goku_udc.c, at least) have triple safeguards like:
>>
>>                 if (dev->gadget.speed != USB_SPEED_UNKNOWN
>>                         && dev->driver
>>                         && dev->driver->suspend) {
>>                     spin_unlock(&dev->lock);
>>                     dev->driver->suspend(&dev->gadget);
>>                     spin_lock(&dev->lock);
>>                 }
>>
>> So the above should maybe put to:
>>
>>     if (dev->driver && dev->driver->setup) {
>>         spin_unlock (&dev->lock);
>>         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>         spin_lock (&dev->lock);
>>     }
>>
>> instead to be completely certain.
>>
>> Forgive me for this uninspired rant. Thank you if you've read this far.
>> I hope this helps.
>>
>> My $0.02.
>>
>> Regards,
>> Mirsad
>>
> Thanks,
> 
> Anastasia

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-03-13 13:49           ` Mirsad Goran Todorovac
@ 2023-03-15 14:26             ` Anastasia Belova
  2023-03-15 16:26               ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 11+ messages in thread
From: Anastasia Belova @ 2023-03-15 14:26 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project, Greg Kroah-Hartman


13.03.2023 16:49, Mirsad Goran Todorovac пишет:
> On 13.3.2023. 13:19, Anastasia Belova wrote:
>>
>> 11.03.2023 06:29, Mirsad Goran Todorovac пишет:
>>> On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
>>>> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>>>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>>>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>>>>> Before dereferencing dev->driver check it for NULL.
>>>>>>>
>>>>>>> If an interrupt handler is called after assigning
>>>>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>>>>> NULL-pointer will be dereferenced.
>>>>>>>
>>>>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>>>>
>>>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>>>>> ---
>>>>>>>    drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c 
>>>>>>> b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void 
>>>>>>> *_dev)
>>>>>>>    pm_next:
>>>>>>>            if (stat & INT_USBRESET) {        /* hub reset done */
>>>>>>>                ACK(INT_USBRESET);
>>>>>>> -            INFO(dev, "USB reset done, gadget %s\n",
>>>>>>> -                dev->driver->driver.name);
>>>>>>> +            if (dev->driver)
>>>>>>> +                INFO(dev, "USB reset done, gadget %s\n",
>>>>>>> +                    dev->driver->driver.name);
>>>>>> How can this ever happen?  Can you trigger this somehow?  If not, I
>>>>>> don't think this is going to be possible (also what's up with printk
>>>>>> from an irq handler???)
>>>>> Unfortunately, I can't find the way to trigger this at the moment.
>>>> Then the change should not be made.
>>>>
>>>>> What about printk, should trace_printk be used instead?
>>>> Why?
>>>>
>>>>>> Odds are, no one actually has this hardware anymore, right?
>>>>> Despite of this, such vulnerability should be fixed because
>>>>> there is a possibility to exploit it.
>>>> How can this be "exploited" if it can not ever be triggered?
>>>>
>>>> Also, this would cause a NULL dereference in an irq handler, how 
>>>> can you
>>>> "exploit" that?
>>>>
>>>> Please only submit patches that actually do something.  It is getting
>>>> very hard to want to even review patches from this "project" based on
>>>> the recent submissions.
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>> Hi Greg, Anastasia,
>>
>> Hi Misrad,
>>
>>> Without any pros or cons, or taking sides, there appears to be a 
>>> similar check
>>> when using dev->driver->driver.name in
>>>
>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158 
>>>
>>>
>>>     seq_printf(m,
>>>            "%s - %s\n"
>>>            "%s version: %s %s\n"
>>>            "Gadget driver: %s\n"
>>>            "Host %s, %s\n"
>>>            "\n",
>>>            pci_name(dev->pdev), driver_desc,
>>>            driver_name, DRIVER_VERSION, dmastr(),
>>>            dev->driver ? dev->driver->driver.name : "(none)",
>>>            is_usb_connected
>>>                ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
>>>                : "disconnected",
>>>            udc_ep_state(dev->ep0state));
>>>
>>> On the other hand, where could dev->drivre be reset without 
>>> resetting dev->int_enable?
>>>
>>> dev->driver = NULL appears here:
>>>
>>> static int goku_udc_stop(struct usb_gadget *g)
>>> {
>>>     struct goku_udc    *dev = to_goku_udc(g);
>>>     unsigned long    flags;
>>>
>>>     spin_lock_irqsave(&dev->lock, flags);
>>>     dev->driver = NULL;
>>>     stop_activity(dev);
>>>     spin_unlock_irqrestore(&dev->lock, flags);
>>>
>>>     return 0;
>>> }
>>>
>>> it is followed by stop_activity() calling udc_reset():
>>>
>>> static void udc_reset(struct goku_udc *dev)
>>> {
>>>     struct goku_udc_regs __iomem    *regs = dev->regs;
>>>
>>>     writel(0, &regs->power_detect);
>>>     writel(0, &regs->int_enable);
>>>     readl(&regs->int_enable);
>>>     dev->int_enable = 0;
>>> .
>>> .
>>> .
>>>
>>> ... but this happens in between spin_lock_irqsave() and 
>>> spin_unlock_irqsave(),
>>> which appears like a correct way to do it.
>>
>> Are you sure that spin_lock_irqsave makes the code safe? This 
>> function disables interrupts on
>> local processor only (Linux Device Drivers, Third Edition). So it 
>> doesn't seem to be
>> absolutely safe on multiprocessor systems.
>
> Hi, Anastasia,
>
> Looking from the Second Edition or the book and the source, I see that
> spin_lock_irqsave() expands to:
>
> static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
> {
>     unsigned long flags;
>
>     local_irq_save(flags);
>     preempt_disable();
>     spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>     LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
>     return flags;
> }
>
> if the multiple threads on multiple cores/SMTs contend for the same lock,
> that with preempt_disable() should assure mutual exclusion.
>
> Can you please quote from the Third Edition of Linux Device Drivers where
> it says otherwise?
>

Hi, Mirsad,


If I get it right, preempt_disable blocks interrupts on all processors,

correct? This statement seems to make the code safe, but there is a quote

from Linux Device Drivers, Third Edition, CHAPTER 5, Concurrency and Race

Conditions: "...spin_lock_irqsave disables interrupts (on the local 
processor

only) before taking the spinlock...". These thoughts contradict, don't they?


> BTW, please also consider reading this article:
>
> https://docs.kernel.org/driver-api/io_ordering.html
>
> I saw they were using this readl() after writel() for synchronisation, so
> please see if this clears your doubts. It says "readl() should flush 
> any pending
> writes".
>
> But I certainly see no harm in your proposal of guarding against NULL 
> pointer
> dereference of dev->driver, both in
>
> > -            INFO(dev, "USB reset done, gadget %s\n",
> > -                dev->driver->driver.name);
> > +            if (dev->driver)
> > +                INFO(dev, "USB reset done, gadget %s\n",
> > +                    dev->driver->driver.name);
>
> or use the construct as the one in another line of the driver:
>
> > -            INFO(dev, "USB reset done, gadget %s\n",
> > -                dev->driver->driver.name);
> > +            INFO(dev, "USB reset done, gadget %s\n",
> > +                dev->driver ? dev->driver->driver.name : "(none)");
>
> (This would IMHO enable detecting and logging when dev->driver 
> unexpectedly becomes NULL
> in a race condition, rather that just silently skipping and ignoring 
> the situation.)
>
Agree, the second construct looks better.
> but additionally also in:
>
> > -     spin_unlock (&dev->lock);
> > -     tmp = dev->driver->setup(&dev->gadget, &ctrl);
> > -     spin_lock (&dev->lock);
> > +     if (dev->driver && dev->driver->setup) {
> > +         spin_unlock (&dev->lock);
> > +         tmp = dev->driver->setup(&dev->gadget, &ctrl);
> > +         spin_lock (&dev->lock);
> > +     }
>
> for completeness and robustness sake.
>
> The author agrees that the race conditions in the device drivers are 
> very hard
> to reproduce:
>
> https://www.oreilly.com/library/view/linux-device-drivers/0596000081/ch09s08.html 
>
>
> But I am really not able to analyse all possible scenarios ATM.
> Maybe some ideas come after getting some oxygen.
>
> (This is not an authoritative answer on the matter, just an attempt on 
> analysis.)
>
> Regards,
> Mirsad
>
Regards,

Anastasia

>>> But second appearance is here:
>>>
>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559 
>>>
>>>
>>>     spin_lock(&dev->lock);
>>>
>>> rescan:
>>>     stat = readl(&regs->int_status) & dev->int_enable;
>>>          if (!stat)
>>>         goto done;
>>>     dev->irqs++;
>>>
>>>     /* device-wide irqs */
>>>     if (unlikely(stat & INT_DEVWIDE)) {
>>>         if (stat & INT_SYSERROR) {
>>>             ERROR(dev, "system error\n");
>>>             stop_activity(dev);
>>>             stat = 0;
>>>             handled = 1;
>>>             // FIXME have a neater way to prevent re-enumeration
>>>             dev->driver = NULL;
>>>             goto done;
>>>         }
>>>
>>> goto done leads to:
>>>
>>> done:
>>>     (void)readl(&regs->int_enable);
>>>     spin_unlock(&dev->lock);
>>>
>>> This unlocks dev->lock before setting dev->int_enable to zero, or 
>>> calling writel(0, &regs->int_enable);
>>> which could be problematic. Unless it called stop_activity(dev) four 
>>> lines earlier. Which does
>>> bot of:
>>>
>>>     writel(0, &regs->int_enable);
>>>     dev->int_enable = 0;
>>>
>>> So, FWIW, we seem to be safe. Yet, there might be no harm in 
>>> printing "(null)" rather
>>> than having an NULL pointer dereference, it seems.
>>>
>>> Yet, there is another unprotected dereference of dev->driver:
>>>
>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513 
>>>
>>>
>>>     spin_unlock (&dev->lock);
>>>     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>     spin_lock (&dev->lock);
>>>
>>> All others (in goku_udc.c, at least) have triple safeguards like:
>>>
>>>                 if (dev->gadget.speed != USB_SPEED_UNKNOWN
>>>                         && dev->driver
>>>                         && dev->driver->suspend) {
>>>                     spin_unlock(&dev->lock);
>>> dev->driver->suspend(&dev->gadget);
>>>                     spin_lock(&dev->lock);
>>>                 }
>>>
>>> So the above should maybe put to:
>>>
>>>     if (dev->driver && dev->driver->setup) {
>>>         spin_unlock (&dev->lock);
>>>         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>         spin_lock (&dev->lock);
>>>     }
>>>
>>> instead to be completely certain.
>>>
>>> Forgive me for this uninspired rant. Thank you if you've read this far.
>>> I hope this helps.
>>>
>>> My $0.02.
>>>
>>> Regards,
>>> Mirsad
>>>
>> Thanks,
>>
>> Anastasia
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-03-15 14:26             ` Anastasia Belova
@ 2023-03-15 16:26               ` Mirsad Goran Todorovac
  2023-03-31 14:07                 ` Anastasia Belova
  0 siblings, 1 reply; 11+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-15 16:26 UTC (permalink / raw)
  To: Anastasia Belova
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project, Greg Kroah-Hartman

On 15.3.2023. 15:26, Anastasia Belova wrote:
> 
> 13.03.2023 16:49, Mirsad Goran Todorovac пишет:
>> On 13.3.2023. 13:19, Anastasia Belova wrote:
>>>
>>> 11.03.2023 06:29, Mirsad Goran Todorovac пишет:
>>>> On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
>>>>> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>>>>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>>>>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>>>>>> Before dereferencing dev->driver check it for NULL.
>>>>>>>>
>>>>>>>> If an interrupt handler is called after assigning
>>>>>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>>>>>> NULL-pointer will be dereferenced.
>>>>>>>>
>>>>>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>>>>>
>>>>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>>>>>> ---
>>>>>>>>    drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>>>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, void *_dev)
>>>>>>>>    pm_next:
>>>>>>>>            if (stat & INT_USBRESET) {        /* hub reset done */
>>>>>>>>                ACK(INT_USBRESET);
>>>>>>>> -            INFO(dev, "USB reset done, gadget %s\n",
>>>>>>>> -                dev->driver->driver.name);
>>>>>>>> +            if (dev->driver)
>>>>>>>> +                INFO(dev, "USB reset done, gadget %s\n",
>>>>>>>> +                    dev->driver->driver.name);
>>>>>>> How can this ever happen?  Can you trigger this somehow?  If not, I
>>>>>>> don't think this is going to be possible (also what's up with printk
>>>>>>> from an irq handler???)
>>>>>> Unfortunately, I can't find the way to trigger this at the moment.
>>>>> Then the change should not be made.
>>>>>
>>>>>> What about printk, should trace_printk be used instead?
>>>>> Why?
>>>>>
>>>>>>> Odds are, no one actually has this hardware anymore, right?
>>>>>> Despite of this, such vulnerability should be fixed because
>>>>>> there is a possibility to exploit it.
>>>>> How can this be "exploited" if it can not ever be triggered?
>>>>>
>>>>> Also, this would cause a NULL dereference in an irq handler, how can you
>>>>> "exploit" that?
>>>>>
>>>>> Please only submit patches that actually do something.  It is getting
>>>>> very hard to want to even review patches from this "project" based on
>>>>> the recent submissions.
>>>>>
>>>>> thanks,
>>>>>
>>>>> greg k-h
>>>> Hi Greg, Anastasia,
>>>
>>> Hi Misrad,
>>>
>>>> Without any pros or cons, or taking sides, there appears to be a similar check
>>>> when using dev->driver->driver.name in
>>>>
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158
>>>>
>>>>     seq_printf(m,
>>>>            "%s - %s\n"
>>>>            "%s version: %s %s\n"
>>>>            "Gadget driver: %s\n"
>>>>            "Host %s, %s\n"
>>>>            "\n",
>>>>            pci_name(dev->pdev), driver_desc,
>>>>            driver_name, DRIVER_VERSION, dmastr(),
>>>>            dev->driver ? dev->driver->driver.name : "(none)",
>>>>            is_usb_connected
>>>>                ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
>>>>                : "disconnected",
>>>>            udc_ep_state(dev->ep0state));
>>>>
>>>> On the other hand, where could dev->drivre be reset without resetting dev->int_enable?
>>>>
>>>> dev->driver = NULL appears here:
>>>>
>>>> static int goku_udc_stop(struct usb_gadget *g)
>>>> {
>>>>     struct goku_udc    *dev = to_goku_udc(g);
>>>>     unsigned long    flags;
>>>>
>>>>     spin_lock_irqsave(&dev->lock, flags);
>>>>     dev->driver = NULL;
>>>>     stop_activity(dev);
>>>>     spin_unlock_irqrestore(&dev->lock, flags);
>>>>
>>>>     return 0;
>>>> }
>>>>
>>>> it is followed by stop_activity() calling udc_reset():
>>>>
>>>> static void udc_reset(struct goku_udc *dev)
>>>> {
>>>>     struct goku_udc_regs __iomem    *regs = dev->regs;
>>>>
>>>>     writel(0, &regs->power_detect);
>>>>     writel(0, &regs->int_enable);
>>>>     readl(&regs->int_enable);
>>>>     dev->int_enable = 0;
>>>> .
>>>> .
>>>> .
>>>>
>>>> ... but this happens in between spin_lock_irqsave() and spin_unlock_irqsave(),
>>>> which appears like a correct way to do it.
>>>
>>> Are you sure that spin_lock_irqsave makes the code safe? This function disables interrupts on
>>> local processor only (Linux Device Drivers, Third Edition). So it doesn't seem to be
>>> absolutely safe on multiprocessor systems.
>>
>> Hi, Anastasia,
>>
>> Looking from the Second Edition or the book and the source, I see that
>> spin_lock_irqsave() expands to:
>>
>> static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
>> {
>>     unsigned long flags;
>>
>>     local_irq_save(flags);
>>     preempt_disable();
>>     spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>>     LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
>>     return flags;
>> }
>>
>> if the multiple threads on multiple cores/SMTs contend for the same lock,
>> that with preempt_disable() should assure mutual exclusion.
>>
>> Can you please quote from the Third Edition of Linux Device Drivers where
>> it says otherwise?
>>
> 
> Hi, Mirsad,
> 
> 
> If I get it right, preempt_disable blocks interrupts on all processors,
> correct? This statement seems to make the code safe, but there is a quote
> from Linux Device Drivers, Third Edition, CHAPTER 5, Concurrency and Race
> Conditions: "...spin_lock_irqsave disables interrupts (on the local processor
> only) before taking the spinlock...". These thoughts contradict, don't they?

Hi, Anastasia,

To quote the book:

https://static.lwn.net/images/pdf/LDD3/ch05.pdf page 119:

 > We have already seen how spin_lock works. spin_lock_irqsave disables interrupts (on
 > the local processor only) before taking the spinlock; the previous interrupt state is
 > stored in flags.

This is consistent with the source. To repeat, spin_lock_irqsave() eventually
expands to:

https://elixir.bootlin.com/linux/v6.3-rc2/source/include/linux/spinlock_api_smp.h#L97

static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
{
	unsigned long flags;

	local_irq_save(flags);
	preempt_disable();
	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
	return flags;
}

preempt_disable() disables interrupt only on local CPU, but this is necessary
so IRQ wouldn't attempt another asynchronous lock attempt on the lock currently being
acquired, which would of course block forever this CPU, core or SMT.

The magic is in LOCK_CONTENTED, which eventually atomically sets a global semaphore
and executes a barrier() call, to ensure no compiler optimisation would mess with that
before all writes are synchronised to RAM.

Additional info on barrier() is here:

https://elixir.bootlin.com/linux/v6.3-rc2/source/include/linux/compiler.h#L83

This guarantees that the next read from the same or another CPU, core or SMT will
see lock (synchronised and "taken"), so the mutual exclusion between the cores is
OK. NOTE: the lock must be global, not local to thread for this to work.

So, the key is to go to the Source ;-)

This of course doesn't invalidate driver hardening against "impossible" situations.
In userland, assert() is good for this, but it is impractical for kernel to die on
each bug. Rather log the bug or oops?

Best regards,
Mirsad

>> BTW, please also consider reading this article:
>>
>> https://docs.kernel.org/driver-api/io_ordering.html
>>
>> I saw they were using this readl() after writel() for synchronisation, so
>> please see if this clears your doubts. It says "readl() should flush any pending
>> writes".
>>
>> But I certainly see no harm in your proposal of guarding against NULL pointer
>> dereference of dev->driver, both in
>>
>> > -            INFO(dev, "USB reset done, gadget %s\n",
>> > -                dev->driver->driver.name);
>> > +            if (dev->driver)
>> > +                INFO(dev, "USB reset done, gadget %s\n",
>> > +                    dev->driver->driver.name);
>>
>> or use the construct as the one in another line of the driver:
>>
>> > -            INFO(dev, "USB reset done, gadget %s\n",
>> > -                dev->driver->driver.name);
>> > +            INFO(dev, "USB reset done, gadget %s\n",
>> > +                dev->driver ? dev->driver->driver.name : "(none)");
>>
>> (This would IMHO enable detecting and logging when dev->driver unexpectedly becomes NULL
>> in a race condition, rather that just silently skipping and ignoring the situation.)
>>
> Agree, the second construct looks better.
>> but additionally also in:
>>
>> > -     spin_unlock (&dev->lock);
>> > -     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>> > -     spin_lock (&dev->lock);
>> > +     if (dev->driver && dev->driver->setup) {
>> > +         spin_unlock (&dev->lock);
>> > +         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>> > +         spin_lock (&dev->lock);
>> > +     }
>>
>> for completeness and robustness sake.
>>
>> The author agrees that the race conditions in the device drivers are very hard
>> to reproduce:
>>
>> https://www.oreilly.com/library/view/linux-device-drivers/0596000081/ch09s08.html
>>
>> But I am really not able to analyse all possible scenarios ATM.
>> Maybe some ideas come after getting some oxygen.
>>
>> (This is not an authoritative answer on the matter, just an attempt on analysis.)
>>
>> Regards,
>> Mirsad
>>
> Regards,
> 
> Anastasia
> 
>>>> But second appearance is here:
>>>>
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559
>>>>
>>>>     spin_lock(&dev->lock);
>>>>
>>>> rescan:
>>>>     stat = readl(&regs->int_status) & dev->int_enable;
>>>>          if (!stat)
>>>>         goto done;
>>>>     dev->irqs++;
>>>>
>>>>     /* device-wide irqs */
>>>>     if (unlikely(stat & INT_DEVWIDE)) {
>>>>         if (stat & INT_SYSERROR) {
>>>>             ERROR(dev, "system error\n");
>>>>             stop_activity(dev);
>>>>             stat = 0;
>>>>             handled = 1;
>>>>             // FIXME have a neater way to prevent re-enumeration
>>>>             dev->driver = NULL;
>>>>             goto done;
>>>>         }
>>>>
>>>> goto done leads to:
>>>>
>>>> done:
>>>>     (void)readl(&regs->int_enable);
>>>>     spin_unlock(&dev->lock);
>>>>
>>>> This unlocks dev->lock before setting dev->int_enable to zero, or calling writel(0, &regs->int_enable);
>>>> which could be problematic. Unless it called stop_activity(dev) four lines earlier. Which does
>>>> bot of:
>>>>
>>>>     writel(0, &regs->int_enable);
>>>>     dev->int_enable = 0;
>>>>
>>>> So, FWIW, we seem to be safe. Yet, there might be no harm in printing "(null)" rather
>>>> than having an NULL pointer dereference, it seems.
>>>>
>>>> Yet, there is another unprotected dereference of dev->driver:
>>>>
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513
>>>>
>>>>     spin_unlock (&dev->lock);
>>>>     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>>     spin_lock (&dev->lock);
>>>>
>>>> All others (in goku_udc.c, at least) have triple safeguards like:
>>>>
>>>>                 if (dev->gadget.speed != USB_SPEED_UNKNOWN
>>>>                         && dev->driver
>>>>                         && dev->driver->suspend) {
>>>>                     spin_unlock(&dev->lock);
>>>> dev->driver->suspend(&dev->gadget);
>>>>                     spin_lock(&dev->lock);
>>>>                 }
>>>>
>>>> So the above should maybe put to:
>>>>
>>>>     if (dev->driver && dev->driver->setup) {
>>>>         spin_unlock (&dev->lock);
>>>>         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>>         spin_lock (&dev->lock);
>>>>     }
>>>>
>>>> instead to be completely certain.
>>>>
>>>> Forgive me for this uninspired rant. Thank you if you've read this far.
>>>> I hope this helps.
>>>>
>>>> My $0.02.
>>>>
>>>> Regards,
>>>> Mirsad
>>>>
>>> Thanks,
>>>
>>> Anastasia
>>

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] goku_udc: Add check for NULL in goku_irq
  2023-03-15 16:26               ` Mirsad Goran Todorovac
@ 2023-03-31 14:07                 ` Anastasia Belova
  0 siblings, 0 replies; 11+ messages in thread
From: Anastasia Belova @ 2023-03-31 14:07 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Jakob Koschel, linux-usb, linux-kernel, lvc-project, Greg Kroah-Hartman


15.03.2023 19:26, Mirsad Goran Todorovac пишет:
> On 15.3.2023. 15:26, Anastasia Belova wrote:
>>
>> 13.03.2023 16:49, Mirsad Goran Todorovac пишет:
>>> On 13.3.2023. 13:19, Anastasia Belova wrote:
>>>>
>>>> 11.03.2023 06:29, Mirsad Goran Todorovac пишет:
>>>>> On 15. 02. 2023. 14:48, Greg Kroah-Hartman wrote:
>>>>>> On Wed, Feb 15, 2023 at 04:39:56PM +0300, Анастасия Белова wrote:
>>>>>>> 03.02.2023 13:45, Greg Kroah-Hartman пишет:
>>>>>>>> On Fri, Feb 03, 2023 at 01:18:28PM +0300, Anastasia Belova wrote:
>>>>>>>>> Before dereferencing dev->driver check it for NULL.
>>>>>>>>>
>>>>>>>>> If an interrupt handler is called after assigning
>>>>>>>>> NULL to dev->driver, but before resetting dev->int_enable,
>>>>>>>>> NULL-pointer will be dereferenced.
>>>>>>>>>
>>>>>>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>>>>>>
>>>>>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>>>>>>>>> Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
>>>>>>>>> ---
>>>>>>>>>    drivers/usb/gadget/udc/goku_udc.c | 5 +++--
>>>>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/usb/gadget/udc/goku_udc.c 
>>>>>>>>> b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>>> index bdc56b24b5c9..896bba8b47f1 100644
>>>>>>>>> --- a/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>>> +++ b/drivers/usb/gadget/udc/goku_udc.c
>>>>>>>>> @@ -1616,8 +1616,9 @@ static irqreturn_t goku_irq(int irq, 
>>>>>>>>> void *_dev)
>>>>>>>>>    pm_next:
>>>>>>>>>            if (stat & INT_USBRESET) { /* hub reset done */
>>>>>>>>>                ACK(INT_USBRESET);
>>>>>>>>> -            INFO(dev, "USB reset done, gadget %s\n",
>>>>>>>>> -                dev->driver->driver.name);
>>>>>>>>> +            if (dev->driver)
>>>>>>>>> +                INFO(dev, "USB reset done, gadget %s\n",
>>>>>>>>> + dev->driver->driver.name);
>>>>>>>> How can this ever happen?  Can you trigger this somehow?  If 
>>>>>>>> not, I
>>>>>>>> don't think this is going to be possible (also what's up with 
>>>>>>>> printk
>>>>>>>> from an irq handler???)
>>>>>>> Unfortunately, I can't find the way to trigger this at the moment.
>>>>>> Then the change should not be made.
>>>>>>
>>>>>>> What about printk, should trace_printk be used instead?
>>>>>> Why?
>>>>>>
>>>>>>>> Odds are, no one actually has this hardware anymore, right?
>>>>>>> Despite of this, such vulnerability should be fixed because
>>>>>>> there is a possibility to exploit it.
>>>>>> How can this be "exploited" if it can not ever be triggered?
>>>>>>
>>>>>> Also, this would cause a NULL dereference in an irq handler, how 
>>>>>> can you
>>>>>> "exploit" that?
>>>>>>
>>>>>> Please only submit patches that actually do something. It is getting
>>>>>> very hard to want to even review patches from this "project" 
>>>>>> based on
>>>>>> the recent submissions.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> greg k-h
>>>>> Hi Greg, Anastasia,
>>>>
>>>> Hi Misrad,
>>>>
>>>>> Without any pros or cons, or taking sides, there appears to be a 
>>>>> similar check
>>>>> when using dev->driver->driver.name in
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1158 
>>>>>
>>>>>
>>>>>     seq_printf(m,
>>>>>            "%s - %s\n"
>>>>>            "%s version: %s %s\n"
>>>>>            "Gadget driver: %s\n"
>>>>>            "Host %s, %s\n"
>>>>>            "\n",
>>>>>            pci_name(dev->pdev), driver_desc,
>>>>>            driver_name, DRIVER_VERSION, dmastr(),
>>>>>            dev->driver ? dev->driver->driver.name : "(none)",
>>>>>            is_usb_connected
>>>>>                ? ((tmp & PW_PULLUP) ? "full speed" : "powered")
>>>>>                : "disconnected",
>>>>>            udc_ep_state(dev->ep0state));
>>>>>
>>>>> On the other hand, where could dev->drivre be reset without 
>>>>> resetting dev->int_enable?
>>>>>
>>>>> dev->driver = NULL appears here:
>>>>>
>>>>> static int goku_udc_stop(struct usb_gadget *g)
>>>>> {
>>>>>     struct goku_udc    *dev = to_goku_udc(g);
>>>>>     unsigned long    flags;
>>>>>
>>>>>     spin_lock_irqsave(&dev->lock, flags);
>>>>>     dev->driver = NULL;
>>>>>     stop_activity(dev);
>>>>>     spin_unlock_irqrestore(&dev->lock, flags);
>>>>>
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> it is followed by stop_activity() calling udc_reset():
>>>>>
>>>>> static void udc_reset(struct goku_udc *dev)
>>>>> {
>>>>>     struct goku_udc_regs __iomem    *regs = dev->regs;
>>>>>
>>>>>     writel(0, &regs->power_detect);
>>>>>     writel(0, &regs->int_enable);
>>>>>     readl(&regs->int_enable);
>>>>>     dev->int_enable = 0;
>>>>> .
>>>>> .
>>>>> .
>>>>>
>>>>> ... but this happens in between spin_lock_irqsave() and 
>>>>> spin_unlock_irqsave(),
>>>>> which appears like a correct way to do it.
>>>>
>>>> Are you sure that spin_lock_irqsave makes the code safe? This 
>>>> function disables interrupts on
>>>> local processor only (Linux Device Drivers, Third Edition). So it 
>>>> doesn't seem to be
>>>> absolutely safe on multiprocessor systems.
>>>
>>> Hi, Anastasia,
>>>
>>> Looking from the Second Edition or the book and the source, I see that
>>> spin_lock_irqsave() expands to:
>>>
>>> static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t 
>>> *lock)
>>> {
>>>     unsigned long flags;
>>>
>>>     local_irq_save(flags);
>>>     preempt_disable();
>>>     spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>>>     LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
>>>     return flags;
>>> }
>>>
>>> if the multiple threads on multiple cores/SMTs contend for the same 
>>> lock,
>>> that with preempt_disable() should assure mutual exclusion.
>>>
>>> Can you please quote from the Third Edition of Linux Device Drivers 
>>> where
>>> it says otherwise?
>>>
>>
>> Hi, Mirsad,
>>
>>
>> If I get it right, preempt_disable blocks interrupts on all processors,
>> correct? This statement seems to make the code safe, but there is a 
>> quote
>> from Linux Device Drivers, Third Edition, CHAPTER 5, Concurrency and 
>> Race
>> Conditions: "...spin_lock_irqsave disables interrupts (on the local 
>> processor
>> only) before taking the spinlock...". These thoughts contradict, 
>> don't they?
>
> Hi, Anastasia,
>
> To quote the book:
>
> https://static.lwn.net/images/pdf/LDD3/ch05.pdf page 119:
>
> > We have already seen how spin_lock works. spin_lock_irqsave disables 
> interrupts (on
> > the local processor only) before taking the spinlock; the previous 
> interrupt state is
> > stored in flags.
>
> This is consistent with the source. To repeat, spin_lock_irqsave() 
> eventually
> expands to:
>
> https://elixir.bootlin.com/linux/v6.3-rc2/source/include/linux/spinlock_api_smp.h#L97 
>
>
> static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
> {
>     unsigned long flags;
>
>     local_irq_save(flags);
>     preempt_disable();
>     spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>     LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
>     return flags;
> }
>
> preempt_disable() disables interrupt only on local CPU, but this is 
> necessary
> so IRQ wouldn't attempt another asynchronous lock attempt on the lock 
> currently being
> acquired, which would of course block forever this CPU, core or SMT.
>
> The magic is in LOCK_CONTENTED, which eventually atomically sets a 
> global semaphore
> and executes a barrier() call, to ensure no compiler optimisation 
> would mess with that
> before all writes are synchronised to RAM.
>
> Additional info on barrier() is here:
>
> https://elixir.bootlin.com/linux/v6.3-rc2/source/include/linux/compiler.h#L83 
>
>
> This guarantees that the next read from the same or another CPU, core 
> or SMT will
> see lock (synchronised and "taken"), so the mutual exclusion between 
> the cores is
> OK. NOTE: the lock must be global, not local to thread for this to work.
>
> So, the key is to go to the Source ;-)
>
Hi, Mirsad,


Thank you so much for our discussion. This case is clearer for me now.


Best regards,

Anastasia

> This of course doesn't invalidate driver hardening against 
> "impossible" situations.
> In userland, assert() is good for this, but it is impractical for 
> kernel to die on
> each bug. Rather log the bug or oops?
>
> Best regards,
> Mirsad
>
>>> BTW, please also consider reading this article:
>>>
>>> https://docs.kernel.org/driver-api/io_ordering.html
>>>
>>> I saw they were using this readl() after writel() for 
>>> synchronisation, so
>>> please see if this clears your doubts. It says "readl() should flush 
>>> any pending
>>> writes".
>>>
>>> But I certainly see no harm in your proposal of guarding against 
>>> NULL pointer
>>> dereference of dev->driver, both in
>>>
>>> > -            INFO(dev, "USB reset done, gadget %s\n",
>>> > -                dev->driver->driver.name);
>>> > +            if (dev->driver)
>>> > +                INFO(dev, "USB reset done, gadget %s\n",
>>> > +                    dev->driver->driver.name);
>>>
>>> or use the construct as the one in another line of the driver:
>>>
>>> > -            INFO(dev, "USB reset done, gadget %s\n",
>>> > -                dev->driver->driver.name);
>>> > +            INFO(dev, "USB reset done, gadget %s\n",
>>> > +                dev->driver ? dev->driver->driver.name : "(none)");
>>>
>>> (This would IMHO enable detecting and logging when dev->driver 
>>> unexpectedly becomes NULL
>>> in a race condition, rather that just silently skipping and ignoring 
>>> the situation.)
>>>
>> Agree, the second construct looks better.
>>> but additionally also in:
>>>
>>> > -     spin_unlock (&dev->lock);
>>> > -     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>> > -     spin_lock (&dev->lock);
>>> > +     if (dev->driver && dev->driver->setup) {
>>> > +         spin_unlock (&dev->lock);
>>> > +         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>> > +         spin_lock (&dev->lock);
>>> > +     }
>>>
>>> for completeness and robustness sake.
>>>
>>> The author agrees that the race conditions in the device drivers are 
>>> very hard
>>> to reproduce:
>>>
>>> https://www.oreilly.com/library/view/linux-device-drivers/0596000081/ch09s08.html 
>>>
>>>
>>> But I am really not able to analyse all possible scenarios ATM.
>>> Maybe some ideas come after getting some oxygen.
>>>
>>> (This is not an authoritative answer on the matter, just an attempt 
>>> on analysis.)
>>>
>>> Regards,
>>> Mirsad
>>>
>> Regards,
>>
>> Anastasia
>>
>>>>> But second appearance is here:
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1559 
>>>>>
>>>>>
>>>>>     spin_lock(&dev->lock);
>>>>>
>>>>> rescan:
>>>>>     stat = readl(&regs->int_status) & dev->int_enable;
>>>>>          if (!stat)
>>>>>         goto done;
>>>>>     dev->irqs++;
>>>>>
>>>>>     /* device-wide irqs */
>>>>>     if (unlikely(stat & INT_DEVWIDE)) {
>>>>>         if (stat & INT_SYSERROR) {
>>>>>             ERROR(dev, "system error\n");
>>>>>             stop_activity(dev);
>>>>>             stat = 0;
>>>>>             handled = 1;
>>>>>             // FIXME have a neater way to prevent re-enumeration
>>>>>             dev->driver = NULL;
>>>>>             goto done;
>>>>>         }
>>>>>
>>>>> goto done leads to:
>>>>>
>>>>> done:
>>>>>     (void)readl(&regs->int_enable);
>>>>>     spin_unlock(&dev->lock);
>>>>>
>>>>> This unlocks dev->lock before setting dev->int_enable to zero, or 
>>>>> calling writel(0, &regs->int_enable);
>>>>> which could be problematic. Unless it called stop_activity(dev) 
>>>>> four lines earlier. Which does
>>>>> bot of:
>>>>>
>>>>>     writel(0, &regs->int_enable);
>>>>>     dev->int_enable = 0;
>>>>>
>>>>> So, FWIW, we seem to be safe. Yet, there might be no harm in 
>>>>> printing "(null)" rather
>>>>> than having an NULL pointer dereference, it seems.
>>>>>
>>>>> Yet, there is another unprotected dereference of dev->driver:
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/usb/gadget/udc/goku_udc.c#L1513 
>>>>>
>>>>>
>>>>>     spin_unlock (&dev->lock);
>>>>>     tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>>>     spin_lock (&dev->lock);
>>>>>
>>>>> All others (in goku_udc.c, at least) have triple safeguards like:
>>>>>
>>>>>                 if (dev->gadget.speed != USB_SPEED_UNKNOWN
>>>>>                         && dev->driver
>>>>>                         && dev->driver->suspend) {
>>>>>                     spin_unlock(&dev->lock);
>>>>> dev->driver->suspend(&dev->gadget);
>>>>>                     spin_lock(&dev->lock);
>>>>>                 }
>>>>>
>>>>> So the above should maybe put to:
>>>>>
>>>>>     if (dev->driver && dev->driver->setup) {
>>>>>         spin_unlock (&dev->lock);
>>>>>         tmp = dev->driver->setup(&dev->gadget, &ctrl);
>>>>>         spin_lock (&dev->lock);
>>>>>     }
>>>>>
>>>>> instead to be completely certain.
>>>>>
>>>>> Forgive me for this uninspired rant. Thank you if you've read this 
>>>>> far.
>>>>> I hope this helps.
>>>>>
>>>>> My $0.02.
>>>>>
>>>>> Regards,
>>>>> Mirsad
>>>>>
>>>> Thanks,
>>>>
>>>> Anastasia
>>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-03-31 14:14 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-03 10:18 [PATCH] goku_udc: Add check for NULL in goku_irq Anastasia Belova
2023-02-03 10:45 ` Greg Kroah-Hartman
2023-02-15 13:39   ` Анастасия Белова
2023-02-15 13:48     ` Greg Kroah-Hartman
2023-03-11  3:29       ` Mirsad Goran Todorovac
2023-03-13 12:19         ` Anastasia Belova
2023-03-13 13:49           ` Mirsad Goran Todorovac
2023-03-15 14:26             ` Anastasia Belova
2023-03-15 16:26               ` Mirsad Goran Todorovac
2023-03-31 14:07                 ` Anastasia Belova
2023-02-03 14:43 ` Alan Stern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.