All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Holler <holler@ahsoftware.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-fbdev@vger.kernel.org,
	Florian Tobias Schandinat <FlorianSchandinat@gmx.de>,
	Bernie Thompson <bernie@plugable.com>,
	Steve Glendinning <steve.glendinning@shawell.net>,
	stable@vger.kernel.org
Subject: Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
Date: Tue, 29 Jan 2013 11:35:25 +0100	[thread overview]
Message-ID: <5107A5ED.7020009@ahsoftware.de> (raw)
In-Reply-To: <51071E21.9030008@ahsoftware.de>

Am 29.01.2013 01:56, schrieb Alexander Holler:
> Am 29.01.2013 01:22, schrieb Andrew Morton:
>> On Fri, 25 Jan 2013 19:49:27 +0100
>> Alexander Holler <holler@ahsoftware.de> wrote:
>>
>>> When a device was disconnected the driver may hang at waiting for urbs it never
>>> will get. Fix this by using a timeout while waiting for the used semaphore.
>>>
>>> There is still a memory leak if a timeout happens, but at least the driver
>>> now continues his disconnect routine.
>>>
>>> ...
>>>
>>> --- a/drivers/video/udlfb.c
>>> +++ b/drivers/video/udlfb.c
>>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
>>>   	/* keep waiting and freeing, until we've got 'em all */
>>>   	while (count--) {
>>>
>>> -		/* Getting interrupted means a leak, but ok at disconnect */
>>> -		ret = down_interruptible(&dev->urbs.limit_sem);
>>> +		/* Timeout likely occurs at disconnect (resulting in a leak) */
>>> +		ret = down_timeout_killable(&dev->urbs.limit_sem,
>>> +						FREE_URB_TIMEOUT);
>>>   		if (ret)
>>>   			break;
>>
>> This is rather a hack.  Do you have an understanding of the underlying
>> bug?  Why is the driver waiting for things which will never happen?

To add a bit more explanation:

I've experienced that bug after moving the fb-damage-handling into a 
workqueue (to make the driver usable as console). This likely has 
increased the possibility that an urb gets missed when the usb-stack 
calls the (usb-)disconnect function of the driver. But I don't know as I 
couldn't use the driver before (as fbcon) so I don't really have a 
comparison.

What currently happens here is something like that:

fb -> damage -> workload which sends urb and waits for answer
device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the 
above urb)

I don't know why the disconnect waits for all urbs. The code looks like 
it does that just to free the allocated memory. As I'm not very familiar 
with the usb-stack, I would have to read up about the urb-handling to 
find out how to free the memory otherwise.

As the previous comment in the code suggests that urbs already got 
missed (on shutdown) before, I assume that even without my patch, which 
moved the damage into a workqueue, the problem could occur which then 
prevents a shutdown as there is no timeout. As I've experienced that 
problem not only on disconnect, but on shutdown too (no shutdown was 
possible), I have to assume, that the previous used down_interruptible() 
didn't get a signal on shutdown (if the driver is used as fbcon), 
therefor I consider the timeout as necessary.

Regards,

Alexander


WARNING: multiple messages have this Message-ID (diff)
From: Alexander Holler <holler@ahsoftware.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-fbdev@vger.kernel.org,
	Florian Tobias Schandinat <FlorianSchandinat@gmx.de>,
	Bernie Thompson <bernie@plugable.com>,
	Steve Glendinning <steve.glendinning@shawell.net>,
	stable@vger.kernel.org
Subject: Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
Date: Tue, 29 Jan 2013 10:35:25 +0000	[thread overview]
Message-ID: <5107A5ED.7020009@ahsoftware.de> (raw)
In-Reply-To: <51071E21.9030008@ahsoftware.de>

Am 29.01.2013 01:56, schrieb Alexander Holler:
> Am 29.01.2013 01:22, schrieb Andrew Morton:
>> On Fri, 25 Jan 2013 19:49:27 +0100
>> Alexander Holler <holler@ahsoftware.de> wrote:
>>
>>> When a device was disconnected the driver may hang at waiting for urbs it never
>>> will get. Fix this by using a timeout while waiting for the used semaphore.
>>>
>>> There is still a memory leak if a timeout happens, but at least the driver
>>> now continues his disconnect routine.
>>>
>>> ...
>>>
>>> --- a/drivers/video/udlfb.c
>>> +++ b/drivers/video/udlfb.c
>>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
>>>   	/* keep waiting and freeing, until we've got 'em all */
>>>   	while (count--) {
>>>
>>> -		/* Getting interrupted means a leak, but ok at disconnect */
>>> -		ret = down_interruptible(&dev->urbs.limit_sem);
>>> +		/* Timeout likely occurs at disconnect (resulting in a leak) */
>>> +		ret = down_timeout_killable(&dev->urbs.limit_sem,
>>> +						FREE_URB_TIMEOUT);
>>>   		if (ret)
>>>   			break;
>>
>> This is rather a hack.  Do you have an understanding of the underlying
>> bug?  Why is the driver waiting for things which will never happen?

To add a bit more explanation:

I've experienced that bug after moving the fb-damage-handling into a 
workqueue (to make the driver usable as console). This likely has 
increased the possibility that an urb gets missed when the usb-stack 
calls the (usb-)disconnect function of the driver. But I don't know as I 
couldn't use the driver before (as fbcon) so I don't really have a 
comparison.

What currently happens here is something like that:

fb -> damage -> workload which sends urb and waits for answer
device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the 
above urb)

I don't know why the disconnect waits for all urbs. The code looks like 
it does that just to free the allocated memory. As I'm not very familiar 
with the usb-stack, I would have to read up about the urb-handling to 
find out how to free the memory otherwise.

As the previous comment in the code suggests that urbs already got 
missed (on shutdown) before, I assume that even without my patch, which 
moved the damage into a workqueue, the problem could occur which then 
prevents a shutdown as there is no timeout. As I've experienced that 
problem not only on disconnect, but on shutdown too (no shutdown was 
possible), I have to assume, that the previous used down_interruptible() 
didn't get a signal on shutdown (if the driver is used as fbcon), 
therefor I consider the timeout as necessary.

Regards,

Alexander


  reply	other threads:[~2013-01-29 10:36 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-12 13:20 [PATCH] fb: udlfb: fix hang at disconnect Alexander Holler
2013-01-12 13:20 ` Alexander Holler
2013-01-12 22:22 ` Bernie Thompson
2013-01-12 22:22   ` Bernie Thompson
2013-01-13 12:05   ` Alexander Holler
2013-01-13 12:05     ` Alexander Holler
2013-01-13 12:24     ` Alexander Holler
2013-01-13 12:24       ` Alexander Holler
2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
2013-01-25 18:49       ` Alexander Holler
2013-01-25 18:49       ` [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect Alexander Holler
2013-01-25 18:49         ` Alexander Holler
2013-01-29  0:22         ` Andrew Morton
2013-01-29  0:22           ` Andrew Morton
2013-01-29  0:56           ` Alexander Holler
2013-01-29  0:56             ` Alexander Holler
2013-01-29 10:35             ` Alexander Holler [this message]
2013-01-29 10:35               ` Alexander Holler
2013-01-29 11:11               ` Alexander Holler
2013-01-29 11:11                 ` Alexander Holler
2013-01-29 15:51                 ` Alexander Holler
2013-01-29 15:51                   ` Alexander Holler
2013-01-29 20:35                   ` Alexander Holler
2013-01-29 20:35                     ` Alexander Holler
2013-01-29 20:56                     ` Alexander Holler
2013-01-29 20:56                       ` Alexander Holler
2013-02-04  1:14                     ` Greg KH
2013-02-04  1:14                       ` Greg KH
2013-02-04 12:05                       ` Alexander Holler
2013-02-04 12:05                         ` Alexander Holler
2013-02-04 19:17                         ` Alexander Holler
2013-02-04 19:17                           ` Alexander Holler
2013-02-04 19:25                           ` Greg KH
2013-02-04 19:25                             ` Greg KH
2013-02-05  7:08                             ` Alexander Holler
2013-02-05  7:08                               ` Alexander Holler
2013-02-05 17:22                               ` Greg KH
2013-02-05 17:22                                 ` Greg KH
2013-02-05 17:36                                 ` Alexander Holler
2013-02-05 17:36                                   ` Alexander Holler
2013-02-08  4:07                                   ` Dave Airlie
2013-02-08  4:07                                     ` Dave Airlie
2013-02-08  9:53                                     ` Alexander Holler
2013-02-08  9:53                                       ` Alexander Holler
2013-01-25 18:49       ` [PATCH 3/3] fb: smscufx: " Alexander Holler
2013-01-25 18:49         ` Alexander Holler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5107A5ED.7020009@ahsoftware.de \
    --to=holler@ahsoftware.de \
    --cc=FlorianSchandinat@gmx.de \
    --cc=akpm@linux-foundation.org \
    --cc=bernie@plugable.com \
    --cc=linux-fbdev@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=steve.glendinning@shawell.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.