linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fb: udlfb: fix hang at disconnect
@ 2013-01-12 13:20 Alexander Holler
  2013-01-12 22:22 ` Bernie Thompson
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-12 13:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, Bernie Thompson, Florian Tobias Schandinat,
	Alan Cox, Steve Glendinning, Dave Airlie, Alexander Holler,
	stable

When a device was disconnected the driver may hang at waiting for urbs it never
will get. Fix this by using a timeout while waiting for the used semaphore.

There is still a memory leak if a timeout happens, but at least the driver
now continues his disconnect routine.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 drivers/video/udlfb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c
index 86d449e..cc4a8d1 100644
--- a/drivers/video/udlfb.c
+++ b/drivers/video/udlfb.c
@@ -1832,8 +1832,8 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
 
-		/* Getting interrupted means a leak, but ok at disconnect */
-		ret = down_interruptible(&dev->urbs.limit_sem);
+		/* Timeout likely occurs at disconnect (resulting in a leak) */
+		ret = down_timeout(&dev->urbs.limit_sem, GET_URB_TIMEOUT);
 		if (ret)
 			break;
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] fb: udlfb: fix hang at disconnect
  2013-01-12 13:20 [PATCH] fb: udlfb: fix hang at disconnect Alexander Holler
@ 2013-01-12 22:22 ` Bernie Thompson
  2013-01-13 12:05   ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Bernie Thompson @ 2013-01-12 22:22 UTC (permalink / raw)
  To: Alexander Holler
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat, Alan Cox,
	Steve Glendinning, Dave Airlie, stable

Hi Alexander,

On Sat, Jan 12, 2013 at 5:20 AM, Alexander Holler <holler@ahsoftware.de> wrote:
> When a device was disconnected the driver may hang at waiting for urbs it never
> will get. Fix this by using a timeout while waiting for the used semaphore.

The code used to be this way, but it used to cause nasty shutdown hangs:
http://git.plugable.com/gitphp/index.php?p=udlfb&a=commitdiff&h=1dd39a65001deb5a84088dfabb788d3274fbb6b6

Which is why the code is the way it is today.

Can you say under what situations you're hitting hangs on device
disconnect?  Have you tested extensively to confirm no shutdown hangs
with your patch?

Stepping back, there was another recent patch from the community to
udlfb to work around issues of sleeping in the wrong context. The fix
involved introducing another scheduled workitem. This slows everything
down when it's in the main path, and isn't really desirable if we can
avoid it.

Another option to eliminate all these problems -- long considered but
never implemented -- is to get rid of all semaphores and potential
sleeps in udlfb entirely.  That would require a strategy to throttle
rendering in some way other than by waiting in kernel (without some
throttling strategy, the USB bus can be a bottleneck which can flood
the system with rendered but untransmitted pixels).

Options might be:

1) When transfer buffers are full, keep track of dirty rectangles for
the rest and pick up where we left off the next time we're entered
(avoiding flooding by potentially having pixels in the dirty regions
be written over multiple times before we get to rendering them once)

2 ) If we "bet" on page-fault-based defio dirty pixel detection, we
could allocate buffers dynamically but increase the scheduling time to
transfer as our outstanding buffer count grows, and reduce the latency
only when the buffer count goes down (again, pixels will be
potentially rendered many times before being transfered once, avoiding
flooding).

Any other ideas on the specific or general case are welcome.  Also
note that udlfb is being largely superceeded by the udl DRM driver -
so any decisions here should also be considered in that codebase.

In any case, thanks for giving the DisplayLink USB 2.0 graphics
drivers attention - it's much appreciated!

Bernie Thompson
http://plugable.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fb: udlfb: fix hang at disconnect
  2013-01-12 22:22 ` Bernie Thompson
@ 2013-01-13 12:05   ` Alexander Holler
  2013-01-13 12:24     ` Alexander Holler
  2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
  0 siblings, 2 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-13 12:05 UTC (permalink / raw)
  To: Bernie Thompson
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat, Alan Cox,
	Steve Glendinning, Dave Airlie, stable

Am 12.01.2013 23:22, schrieb Bernie Thompson:
> Hi Alexander,
>
> On Sat, Jan 12, 2013 at 5:20 AM, Alexander Holler <holler@ahsoftware.de> wrote:
>> When a device was disconnected the driver may hang at waiting for urbs it never
>> will get. Fix this by using a timeout while waiting for the used semaphore.
>
> The code used to be this way, but it used to cause nasty shutdown hangs:
> http://git.plugable.com/gitphp/index.php?p=udlfb&a=commitdiff&h=1dd39a65001deb5a84088dfabb788d3274fbb6b6
>
> Which is why the code is the way it is today.
>
> Can you say under what situations you're hitting hangs on device
> disconnect?  Have you tested extensively to confirm no shutdown hangs
> with your patch?
>

The driver almost always (2/3) hangs here when the device gets 
disconnected. It is easy to see when the device gets attached again as 
nothing will happen if the driver (already) hangs (in addition that a 
shutdown isn't possible).

I didn't test it extensively, but without the patch the driver isn't 
usable here. Maybe my previous patch which moves damages to a workqueue 
is the reason that it's more likely that urbs get missing, but the 
problem already existed because an urb might get missed on disconnect. I 
don't know what problems existed before, maybe people just had a problem 
with the BUG_ON(ret). If that _interrupted_ is really needed, it could 
make sense to implement a down_timeout_interruptible() for semaphores.

> Stepping back, there was another recent patch from the community to
> udlfb to work around issues of sleeping in the wrong context. The fix
> involved introducing another scheduled workitem. This slows everything
> down when it's in the main path, and isn't really desirable if we can
> avoid it.

Do you mean the one I've recently posted? It is needed, at least for 3.7 
(I don't know since when those "schedule while atomic" messages appear).
It might slow down refreshes, but it is needed, at least until someone 
gets around those semaphores or removes the spinlocks in upper layers 
(as Alan Cox suggested with the "I am crap" helper for printk).

Maybe using a WQ_HIGHPRI for the workqueue with the damages will speed 
up things.

More optimizations might be doable too (e.g. combining multiple queued 
damages).

> Another option to eliminate all these problems -- long considered but
> never implemented -- is to get rid of all semaphores and potential
> sleeps in udlfb entirely.  That would require a strategy to throttle
> rendering in some way other than by waiting in kernel (without some
> throttling strategy, the USB bus can be a bottleneck which can flood
> the system with rendered but untransmitted pixels).
>
> Options might be:
>
> 1) When transfer buffers are full, keep track of dirty rectangles for
> the rest and pick up where we left off the next time we're entered
> (avoiding flooding by potentially having pixels in the dirty regions
> be written over multiple times before we get to rendering them once)
>
> 2 ) If we "bet" on page-fault-based defio dirty pixel detection, we
> could allocate buffers dynamically but increase the scheduling time to
> transfer as our outstanding buffer count grows, and reduce the latency
> only when the buffer count goes down (again, pixels will be
> potentially rendered many times before being transfered once, avoiding
> flooding).
>
> Any other ideas on the specific or general case are welcome.  Also
> note that udlfb is being largely superceeded by the udl DRM driver -
> so any decisions here should also be considered in that codebase.
>
> In any case, thanks for giving the DisplayLink USB 2.0 graphics
> drivers attention - it's much appreciated!

Thanks for the sugestions, but I don't feel the need to spend a lot of 
time here. I just wanted to use the console with the device and a kernel 
3.7.x and neither udlfb nor udl currently worked (and I'm pretty sure 
I've used one of them some time before, likely udlfb).

Btw, to see the console again after a disconnect and connect, I'm 
currently using the following (necessary) quick&dirty hack:

---------
         /* if clients still have us open, will be freed on last close */
-       if (dev->fb_count == 0)
+//     if (dev->fb_count == 0)
                 schedule_delayed_work(&dev->free_framebuffer_work, 0);
---------

Without that the framebuffer will never get unregistered (because just 
unlinking it doesn't remove the fb-console which counts for one client) 
with the result that the new one (after connecting the device again) 
will not get the console.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fb: udlfb: fix hang at disconnect
  2013-01-13 12:05   ` Alexander Holler
@ 2013-01-13 12:24     ` Alexander Holler
  2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
  1 sibling, 0 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-13 12:24 UTC (permalink / raw)
  To: Bernie Thompson
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat, Alan Cox,
	Steve Glendinning, Dave Airlie, stable

Am 13.01.2013 13:05, schrieb Alexander Holler:
> Am 12.01.2013 23:22, schrieb Bernie Thompson:

> I didn't test it extensively, but without the patch the driver isn't
> usable here. Maybe my previous patch which moves damages to a workqueue

To add some more explanations, I'm currently only testing it with a 
statically linked udlfb (for fbcon) as that is what I'm mainly using the 
device for (with otherwise headless boxes). When udlfb is a module, I 
don't see those "schedule while atomic" messages (I don't know why), but 
having a console only after the modules got loaded isn't always an option.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/3] semaphore: introduce down_timeout_killable()
  2013-01-13 12:05   ` Alexander Holler
  2013-01-13 12:24     ` Alexander Holler
@ 2013-01-25 18:49     ` Alexander Holler
  2013-01-25 18:49       ` [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect Alexander Holler
  2013-01-25 18:49       ` [PATCH 3/3] fb: smscufx: " Alexander Holler
  1 sibling, 2 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-25 18:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, Florian Tobias Schandinat, Bernie Thompson,
	Steve Glendinning, Alexander Holler

The name says it all, it's like down_timeout() but returns on fatal signals
too.

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 include/linux/semaphore.h |  2 ++
 kernel/semaphore.c        | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
index dc368b8..b508d4d 100644
--- a/include/linux/semaphore.h
+++ b/include/linux/semaphore.h
@@ -41,6 +41,8 @@ extern int __must_check down_interruptible(struct semaphore *sem);
 extern int __must_check down_killable(struct semaphore *sem);
 extern int __must_check down_trylock(struct semaphore *sem);
 extern int __must_check down_timeout(struct semaphore *sem, long jiffies);
+extern int __must_check down_timeout_killable(struct semaphore *sem,
+						long jiffies);
 extern void up(struct semaphore *sem);
 
 #endif /* __LINUX_SEMAPHORE_H */
diff --git a/kernel/semaphore.c b/kernel/semaphore.c
index 4567fc0..4a5e823 100644
--- a/kernel/semaphore.c
+++ b/kernel/semaphore.c
@@ -37,6 +37,8 @@ static noinline void __down(struct semaphore *sem);
 static noinline int __down_interruptible(struct semaphore *sem);
 static noinline int __down_killable(struct semaphore *sem);
 static noinline int __down_timeout(struct semaphore *sem, long jiffies);
+static noinline int __down_timeout_killable(struct semaphore *sem,
+						long jiffies);
 static noinline void __up(struct semaphore *sem);
 
 /**
@@ -169,6 +171,35 @@ int down_timeout(struct semaphore *sem, long jiffies)
 EXPORT_SYMBOL(down_timeout);
 
 /**
+ * down_timeout_killable - acquire the semaphore within a specified time or
+ * until a fatal signal arrives.
+ * @sem: the semaphore to be acquired
+ * @jiffies: how long to wait before failing
+ *
+ * Attempts to acquire the semaphore.  If no more tasks are allowed to
+ * acquire the semaphore, calling this function will put the task to sleep.
+ * If the semaphore is not released within the specified number of jiffies,
+ * this function returns -ETIME. If the sleep is interrupted by a fatal signal,
+ * this function will return -EINTR. If the semaphore is successfully acquired,
+ * this function returns 0.
+ */
+int down_timeout_killable(struct semaphore *sem, long jiffies)
+{
+	unsigned long flags;
+	int result = 0;
+
+	raw_spin_lock_irqsave(&sem->lock, flags);
+	if (likely(sem->count > 0))
+		sem->count--;
+	else
+		result = __down_timeout_killable(sem, jiffies);
+	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	return result;
+}
+EXPORT_SYMBOL(down_timeout_killable);
+
+/**
  * up - release the semaphore
  * @sem: the semaphore to release
  *
@@ -253,6 +284,12 @@ static noinline int __sched __down_timeout(struct semaphore *sem, long jiffies)
 	return __down_common(sem, TASK_UNINTERRUPTIBLE, jiffies);
 }
 
+static noinline int __sched __down_timeout_killable(struct semaphore *sem,
+							long jiffies)
+{
+	return __down_common(sem, TASK_KILLABLE, jiffies);
+}
+
 static noinline void __sched __up(struct semaphore *sem)
 {
 	struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list,
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
@ 2013-01-25 18:49       ` Alexander Holler
  2013-01-29  0:22         ` Andrew Morton
  2013-01-25 18:49       ` [PATCH 3/3] fb: smscufx: " Alexander Holler
  1 sibling, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-25 18:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, Florian Tobias Schandinat, Bernie Thompson,
	Steve Glendinning, Alexander Holler, stable

When a device was disconnected the driver may hang at waiting for urbs it never
will get. Fix this by using a timeout while waiting for the used semaphore.

There is still a memory leak if a timeout happens, but at least the driver
now continues his disconnect routine.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 drivers/video/udlfb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c
index 86d449e..db6cc66 100644
--- a/drivers/video/udlfb.c
+++ b/drivers/video/udlfb.c
@@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
 
-		/* Getting interrupted means a leak, but ok at disconnect */
-		ret = down_interruptible(&dev->urbs.limit_sem);
+		/* Timeout likely occurs at disconnect (resulting in a leak) */
+		ret = down_timeout_killable(&dev->urbs.limit_sem,
+						FREE_URB_TIMEOUT);
 		if (ret)
 			break;
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/3] fb: smscufx: fix hang at disconnect
  2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
  2013-01-25 18:49       ` [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect Alexander Holler
@ 2013-01-25 18:49       ` Alexander Holler
  1 sibling, 0 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-25 18:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, Florian Tobias Schandinat, Bernie Thompson,
	Steve Glendinning, Alexander Holler, stable

When a device was disconnected the driver may hang at waiting for urbs it never
will get. Fix this by using a timeout while waiting for the used semaphore.

There is still a memory leak if a timeout happens, but at least the driver
now continues his disconnect routine.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 drivers/video/smscufx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/video/smscufx.c b/drivers/video/smscufx.c
index 97bd662..f26fa4f 100644
--- a/drivers/video/smscufx.c
+++ b/drivers/video/smscufx.c
@@ -1838,8 +1838,9 @@ static void ufx_free_urb_list(struct ufx_data *dev)
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
-		/* Getting interrupted means a leak, but ok at shutdown*/
-		ret = down_interruptible(&dev->urbs.limit_sem);
+		/* Timeout likely occurs at disconnect (resulting in a leak) */
+		ret = down_timeout_killable(&dev->urbs.limit_sem,
+						FREE_URB_TIMEOUT);
 		if (ret)
 			break;
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-25 18:49       ` [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect Alexander Holler
@ 2013-01-29  0:22         ` Andrew Morton
  2013-01-29  0:56           ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2013-01-29  0:22 UTC (permalink / raw)
  To: Alexander Holler
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

On Fri, 25 Jan 2013 19:49:27 +0100
Alexander Holler <holler@ahsoftware.de> wrote:

> When a device was disconnected the driver may hang at waiting for urbs it never
> will get. Fix this by using a timeout while waiting for the used semaphore.
> 
> There is still a memory leak if a timeout happens, but at least the driver
> now continues his disconnect routine.
> 
> ...
>
> --- a/drivers/video/udlfb.c
> +++ b/drivers/video/udlfb.c
> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
>  	/* keep waiting and freeing, until we've got 'em all */
>  	while (count--) {
>  
> -		/* Getting interrupted means a leak, but ok at disconnect */
> -		ret = down_interruptible(&dev->urbs.limit_sem);
> +		/* Timeout likely occurs at disconnect (resulting in a leak) */
> +		ret = down_timeout_killable(&dev->urbs.limit_sem,
> +						FREE_URB_TIMEOUT);
>  		if (ret)
>  			break;

This is rather a hack.  Do you have an understanding of the underlying
bug?  Why is the driver waiting for things which will never happen?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29  0:22         ` Andrew Morton
@ 2013-01-29  0:56           ` Alexander Holler
  2013-01-29 10:35             ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-29  0:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 01:22, schrieb Andrew Morton:
> On Fri, 25 Jan 2013 19:49:27 +0100
> Alexander Holler <holler@ahsoftware.de> wrote:
> 
>> When a device was disconnected the driver may hang at waiting for urbs it never
>> will get. Fix this by using a timeout while waiting for the used semaphore.
>>
>> There is still a memory leak if a timeout happens, but at least the driver
>> now continues his disconnect routine.
>>
>> ...
>>
>> --- a/drivers/video/udlfb.c
>> +++ b/drivers/video/udlfb.c
>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
>>  	/* keep waiting and freeing, until we've got 'em all */
>>  	while (count--) {
>>  
>> -		/* Getting interrupted means a leak, but ok at disconnect */
>> -		ret = down_interruptible(&dev->urbs.limit_sem);
>> +		/* Timeout likely occurs at disconnect (resulting in a leak) */
>> +		ret = down_timeout_killable(&dev->urbs.limit_sem,
>> +						FREE_URB_TIMEOUT);
>>  		if (ret)
>>  			break;
> 
> This is rather a hack.  Do you have an understanding of the underlying
> bug?  Why is the driver waiting for things which will never happen?

It is a hack, but I don't want to rewrite the whole driver. There is
already an attempt to do so, udl. The hack is a quick solution which
doesn't make something worse, just better. The only drawback might be
the few additional bytes for the implementation of
down_timeout_killable(), but I thought such should be already available,
and wondered it wasn't.

Fixing the underlying problem (including removing the leaks) would just
end up in another rewrite and I prefer to have a look at udl, maybe
fixing the current problems to use such a device as console there.

I've only posted this small patch series, because some (longterm) stable
kernels don't have udl, and smscufx, which is in large parts identical
to udlfb might want that patch too.

Just in case: I don't know anything about smscufx and have discovered
the similarities between those drivers only when I did a git grep to
search for something I fixed with a previous patch.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29  0:56           ` Alexander Holler
@ 2013-01-29 10:35             ` Alexander Holler
  2013-01-29 11:11               ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-29 10:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 01:56, schrieb Alexander Holler:
> Am 29.01.2013 01:22, schrieb Andrew Morton:
>> On Fri, 25 Jan 2013 19:49:27 +0100
>> Alexander Holler <holler@ahsoftware.de> wrote:
>>
>>> When a device was disconnected the driver may hang at waiting for urbs it never
>>> will get. Fix this by using a timeout while waiting for the used semaphore.
>>>
>>> There is still a memory leak if a timeout happens, but at least the driver
>>> now continues his disconnect routine.
>>>
>>> ...
>>>
>>> --- a/drivers/video/udlfb.c
>>> +++ b/drivers/video/udlfb.c
>>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
>>>   	/* keep waiting and freeing, until we've got 'em all */
>>>   	while (count--) {
>>>
>>> -		/* Getting interrupted means a leak, but ok at disconnect */
>>> -		ret = down_interruptible(&dev->urbs.limit_sem);
>>> +		/* Timeout likely occurs at disconnect (resulting in a leak) */
>>> +		ret = down_timeout_killable(&dev->urbs.limit_sem,
>>> +						FREE_URB_TIMEOUT);
>>>   		if (ret)
>>>   			break;
>>
>> This is rather a hack.  Do you have an understanding of the underlying
>> bug?  Why is the driver waiting for things which will never happen?

To add a bit more explanation:

I've experienced that bug after moving the fb-damage-handling into a 
workqueue (to make the driver usable as console). This likely has 
increased the possibility that an urb gets missed when the usb-stack 
calls the (usb-)disconnect function of the driver. But I don't know as I 
couldn't use the driver before (as fbcon) so I don't really have a 
comparison.

What currently happens here is something like that:

fb -> damage -> workload which sends urb and waits for answer
device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the 
above urb)

I don't know why the disconnect waits for all urbs. The code looks like 
it does that just to free the allocated memory. As I'm not very familiar 
with the usb-stack, I would have to read up about the urb-handling to 
find out how to free the memory otherwise.

As the previous comment in the code suggests that urbs already got 
missed (on shutdown) before, I assume that even without my patch, which 
moved the damage into a workqueue, the problem could occur which then 
prevents a shutdown as there is no timeout. As I've experienced that 
problem not only on disconnect, but on shutdown too (no shutdown was 
possible), I have to assume, that the previous used down_interruptible() 
didn't get a signal on shutdown (if the driver is used as fbcon), 
therefor I consider the timeout as necessary.

Regards,

Alexander


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29 10:35             ` Alexander Holler
@ 2013-01-29 11:11               ` Alexander Holler
  2013-01-29 15:51                 ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-29 11:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 11:35, schrieb Alexander Holler:
> Am 29.01.2013 01:56, schrieb Alexander Holler:
>> Am 29.01.2013 01:22, schrieb Andrew Morton:
>>> On Fri, 25 Jan 2013 19:49:27 +0100
>>> Alexander Holler <holler@ahsoftware.de> wrote:
>>>
>>>> When a device was disconnected the driver may hang at waiting for
>>>> urbs it never
>>>> will get. Fix this by using a timeout while waiting for the used
>>>> semaphore.
>>>>
>>>> There is still a memory leak if a timeout happens, but at least the
>>>> driver
>>>> now continues his disconnect routine.
>>>>
>>>> ...
>>>>
>>>> --- a/drivers/video/udlfb.c
>>>> +++ b/drivers/video/udlfb.c
>>>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct
>>>> dlfb_data *dev)
>>>>       /* keep waiting and freeing, until we've got 'em all */
>>>>       while (count--) {
>>>>
>>>> -        /* Getting interrupted means a leak, but ok at disconnect */
>>>> -        ret = down_interruptible(&dev->urbs.limit_sem);
>>>> +        /* Timeout likely occurs at disconnect (resulting in a
>>>> leak) */
>>>> +        ret = down_timeout_killable(&dev->urbs.limit_sem,
>>>> +                        FREE_URB_TIMEOUT);
>>>>           if (ret)
>>>>               break;
>>>
>>> This is rather a hack.  Do you have an understanding of the underlying
>>> bug?  Why is the driver waiting for things which will never happen?
>
> To add a bit more explanation:
>
> I've experienced that bug after moving the fb-damage-handling into a
> workqueue (to make the driver usable as console). This likely has
> increased the possibility that an urb gets missed when the usb-stack
> calls the (usb-)disconnect function of the driver. But I don't know as I
> couldn't use the driver before (as fbcon) so I don't really have a
> comparison.
>
> What currently happens here is something like that:
>
> fb -> damage -> workload which sends urb and waits for answer
> device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the
> above urb)
>
> I don't know why the disconnect waits for all urbs. The code looks like
> it does that just to free the allocated memory. As I'm not very familiar
> with the usb-stack, I would have to read up about the urb-handling to
> find out how to free the memory otherwise.
>
> As the previous comment in the code suggests that urbs already got
> missed (on shutdown) before, I assume that even without my patch, which
> moved the damage into a workqueue, the problem could occur which then
> prevents a shutdown as there is no timeout. As I've experienced that
> problem not only on disconnect, but on shutdown too (no shutdown was
> possible), I have to assume, that the previous used down_interruptible()
> didn't get a signal on shutdown (if the driver is used as fbcon),
> therefor I consider the timeout as necessary.

To explain the problem on shutdown a bit further, I think the following 
happens (usb and driver are statically linked and started by the kernel):

shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever) 
for a kill or an urb which it doesn't get.

Maybe the sequence is different if the usb-stack and udlfb are used as a 
module and/or udlfb is used only for X/fb. I'm not sure what actually 
does shut down the usb-stack in such a case, but maybe more than one 
kill signal might be thrown around.

Regards,

Alexander


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29 11:11               ` Alexander Holler
@ 2013-01-29 15:51                 ` Alexander Holler
  2013-01-29 20:35                   ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-01-29 15:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 12:11, schrieb Alexander Holler:

>
> To explain the problem on shutdown a bit further, I think the following
> happens (usb and driver are statically linked and started by the kernel):
>
> shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever)
> for a kill or an urb which it doesn't get.

Having a second look at what I've written above, I'm not even sure if 
the kernel sends one or more fatal signals on shutdown at all. I've just 
assumed it because otherwise down_interruptible() wouldn't have worked 
before (it would have stalled on shutdown too (if an urb got missed), 
not only on disconnect).

Sounds like an interesting question I should read about (if/when fatal 
signals are issued by the kernel). ;)

> Maybe the sequence is different if the usb-stack and udlfb are used as a
> module and/or udlfb is used only for X/fb. I'm not sure what actually
> does shut down the usb-stack in such a case, but maybe more than one
> kill signal might be thrown around.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29 15:51                 ` Alexander Holler
@ 2013-01-29 20:35                   ` Alexander Holler
  2013-01-29 20:56                     ` Alexander Holler
  2013-02-04  1:14                     ` Greg KH
  0 siblings, 2 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-29 20:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 16:51, schrieb Alexander Holler:
> Am 29.01.2013 12:11, schrieb Alexander Holler:
>
>>
>> To explain the problem on shutdown a bit further, I think the following
>> happens (usb and driver are statically linked and started by the kernel):
>>
>> shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever)
>> for a kill or an urb which it doesn't get.
>
> Having a second look at what I've written above, I'm not even sure if
> the kernel sends one or more fatal signals on shutdown at all. I've just
> assumed it because otherwise down_interruptible() wouldn't have worked
> before (it would have stalled on shutdown too (if an urb got missed),
> not only on disconnect).
>
> Sounds like an interesting question I should read about (if/when fatal
> signals are issued by the kernel). ;)
>
>> Maybe the sequence is different if the usb-stack and udlfb are used as a
>> module and/or udlfb is used only for X/fb. I'm not sure what actually
>> does shut down the usb-stack in such a case, but maybe more than one
>> kill signal might be thrown around.

If anyone still follows my monologue: The question was interesting
enough that I couldn't resist for long. ;)

(all pasted => format broken)

In drivers/tty/sysrq.c there is

------
static void send_sig_all(int sig)
{
         struct task_struct *p;

         read_lock(&tasklist_lock);
         for_each_process(p) {
                 if (p->flags & PF_KTHREAD)
                         continue;
                 if (is_global_init(p))
                         continue;

                 do_send_sig_info(sig, SEND_SIG_FORCED, p, true);
         }
         read_unlock(&tasklist_lock);
}

static void sysrq_handle_term(int key)
{
         send_sig_all(SIGTERM);
         console_loglevel = 8;
}

(...)

static void sysrq_handle_kill(int key)
{
         send_sig_all(SIGKILL);
         console_loglevel = 8;
}
------

Now I've done some learning by doing (kernel 3.7.5 + some patches):

------
diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c
index df249f3..db8a86c 100644
--- a/drivers/video/udlfb.c
+++ b/drivers/video/udlfb.c
@@ -1876,14 +1876,18 @@ static void dlfb_free_urb_list(struct dlfb_data
*dev)
         unsigned long flags;

         pr_notice("Freeing all render urbs\n");
+       if (current->flags & PF_KTHREAD)
+               pr_info("AHO: I'm a kernel thread\n");

         /* keep waiting and freeing, until we've got 'em all */
         while (count--) {

                 /* Timeout likely occurs at disconnect (resulting in a
leak) */
                 ret = down_timeout_killable(&dev->urbs.limit_sem,
FREE_URB_TIMEOUT);
-               if (ret)
+               if (ret) {
+                       pr_info("AHO: ret %d\n", ret);
                         break;
+               }

                 spin_lock_irqsave(&dev->urbs.lock, flags);
------

Now I've disconnected the display. And, as send_sig_all() already 
suggests, the result was (besides discovering an oops in 
call_timer_fn.isra (once)):

------
[  120.963010] udlfb: AHO: I'm a kernel thread
[  122.957024] udlfb: AHO: ret -62
------
(-62 is -ETIME)

So, if the above down_timeout_killable() is only down_interruptible(), 
as in kernel 3.7.5, the  box would not shutdown afterwards, because on 
shutdown no signal would be send to that kernel-thread which called 
dlfb_free_urb_list().

A last note: dlfb_usb_disconnect() (thus dlfb_free_urb_list()) isn't 
called on shutdown if the device would still be connected. So the 
problem only might happen, if the screen will be disconnected before 
shutdown (and an urb gets missed). So the subject of my patch is correct. ;)

</monologue>

Regards,

Alexander

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29 20:35                   ` Alexander Holler
@ 2013-01-29 20:56                     ` Alexander Holler
  2013-02-04  1:14                     ` Greg KH
  1 sibling, 0 replies; 23+ messages in thread
From: Alexander Holler @ 2013-01-29 20:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fbdev, Florian Tobias Schandinat,
	Bernie Thompson, Steve Glendinning, stable

Am 29.01.2013 21:35, schrieb Alexander Holler:
>
> So, if the above down_timeout_killable() is only down_interruptible(),
> as in kernel 3.7.5, the  box would not shutdown afterwards, because on
> shutdown no signal would be send to that kernel-thread which called
> dlfb_free_urb_list().
>
> A last note: dlfb_usb_disconnect() (thus dlfb_free_urb_list()) isn't
> called on shutdown if the device would still be connected. So the
> problem only might happen, if the screen will be disconnected before
> shutdown (and an urb gets missed). So the subject of my patch is
> correct. ;)
>
> </monologue>

Hmm, wrong, sorry, I still have something to add: As no signal arrives 
at all, v1 of that patch is enough and the implementation of 
down_timeout_killable() isn't necessary at all.

If there is a chance that the patch would be Acked-by by someone, I 
would made a v3, replacing

+		ret = down_timeout(&dev->urbs.limit_sem, GET_URB_TIMEOUT);

in v1 with

+		ret = down_timeout(&dev->urbs.limit_sem, FREE_URB_TIMEOUT);

as this seems to be what it should be.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-01-29 20:35                   ` Alexander Holler
  2013-01-29 20:56                     ` Alexander Holler
@ 2013-02-04  1:14                     ` Greg KH
  2013-02-04 12:05                       ` Alexander Holler
  1 sibling, 1 reply; 23+ messages in thread
From: Greg KH @ 2013-02-04  1:14 UTC (permalink / raw)
  To: Alexander Holler
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

On Tue, Jan 29, 2013 at 09:35:42PM +0100, Alexander Holler wrote:
> Am 29.01.2013 16:51, schrieb Alexander Holler:
> >Am 29.01.2013 12:11, schrieb Alexander Holler:
> >
> >>
> >>To explain the problem on shutdown a bit further, I think the following
> >>happens (usb and driver are statically linked and started by the kernel):
> >>
> >>shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever)
> >>for a kill or an urb which it doesn't get.
> >
> >Having a second look at what I've written above, I'm not even sure if
> >the kernel sends one or more fatal signals on shutdown at all. I've just
> >assumed it because otherwise down_interruptible() wouldn't have worked
> >before (it would have stalled on shutdown too (if an urb got missed),
> >not only on disconnect).
> >
> >Sounds like an interesting question I should read about (if/when fatal
> >signals are issued by the kernel). ;)
> >
> >>Maybe the sequence is different if the usb-stack and udlfb are used as a
> >>module and/or udlfb is used only for X/fb. I'm not sure what actually
> >>does shut down the usb-stack in such a case, but maybe more than one
> >>kill signal might be thrown around.
> 
> If anyone still follows my monologue: The question was interesting
> enough that I couldn't resist for long. ;)
> 
> (all pasted => format broken)
> 
> In drivers/tty/sysrq.c there is
> 
> ------
> static void send_sig_all(int sig)
> {
>         struct task_struct *p;
> 
>         read_lock(&tasklist_lock);
>         for_each_process(p) {
>                 if (p->flags & PF_KTHREAD)
>                         continue;
>                 if (is_global_init(p))
>                         continue;
> 
>                 do_send_sig_info(sig, SEND_SIG_FORCED, p, true);
>         }
>         read_unlock(&tasklist_lock);
> }
> 
> static void sysrq_handle_term(int key)
> {
>         send_sig_all(SIGTERM);
>         console_loglevel = 8;
> }
> 
> (...)
> 
> static void sysrq_handle_kill(int key)
> {
>         send_sig_all(SIGKILL);
>         console_loglevel = 8;
> }
> ------
> 
> Now I've done some learning by doing (kernel 3.7.5 + some patches):
> 
> ------
> diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c
> index df249f3..db8a86c 100644
> --- a/drivers/video/udlfb.c
> +++ b/drivers/video/udlfb.c
> @@ -1876,14 +1876,18 @@ static void dlfb_free_urb_list(struct dlfb_data
> *dev)
>         unsigned long flags;
> 
>         pr_notice("Freeing all render urbs\n");
> +       if (current->flags & PF_KTHREAD)
> +               pr_info("AHO: I'm a kernel thread\n");
> 
>         /* keep waiting and freeing, until we've got 'em all */
>         while (count--) {
> 
>                 /* Timeout likely occurs at disconnect (resulting in a
> leak) */
>                 ret = down_timeout_killable(&dev->urbs.limit_sem,
> FREE_URB_TIMEOUT);
> -               if (ret)
> +               if (ret) {
> +                       pr_info("AHO: ret %d\n", ret);
>                         break;
> +               }
> 
>                 spin_lock_irqsave(&dev->urbs.lock, flags);
> ------
> 
> Now I've disconnected the display. And, as send_sig_all() already
> suggests, the result was (besides discovering an oops in
> call_timer_fn.isra (once)):
> 
> ------
> [  120.963010] udlfb: AHO: I'm a kernel thread
> [  122.957024] udlfb: AHO: ret -62
> ------
> (-62 is -ETIME)
> 
> So, if the above down_timeout_killable() is only
> down_interruptible(), as in kernel 3.7.5, the  box would not
> shutdown afterwards, because on shutdown no signal would be send to
> that kernel-thread which called dlfb_free_urb_list().
> 
> A last note: dlfb_usb_disconnect() (thus dlfb_free_urb_list()) isn't
> called on shutdown if the device would still be connected. So the
> problem only might happen, if the screen will be disconnected before
> shutdown (and an urb gets missed). So the subject of my patch is
> correct. ;)

Yes, we don't disconnect all devices from the USB bus on shutdown
because, I think, we didn't tear down all of the PCI devices originally,
so your USB bus never knew it was going to be shutdown.

This is how things have always worked, and shutting down PCI devices in
the past have been known to cause problems.  I think.  I vaguely
remember some issues when I tried to do this 10+ years or so ago in the
2.5 kernel days, but I could be totally wrong given that I can't
remember what I was working on last month at times...

So you are right in that your driver will wait for forever for a
disconnect() to happen, as it will never be called.  I don't understand
the problem that this is causing when it happens.  What's wrong with
udlfb that having the cpu suddently reset as the powerdown happened
without it knowing about it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-04  1:14                     ` Greg KH
@ 2013-02-04 12:05                       ` Alexander Holler
  2013-02-04 19:17                         ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-02-04 12:05 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

Am 04.02.2013 02:14, schrieb Greg KH:

> So you are right in that your driver will wait for forever for a
> disconnect() to happen, as it will never be called.  I don't understand
> the problem that this is causing when it happens.  What's wrong with
> udlfb that having the cpu suddently reset as the powerdown happened
> without it knowing about it?

There is nothing wrong with that. I've just explained why a problem 
doesn't occur on shutdown but on disconnect (of the device).

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-04 12:05                       ` Alexander Holler
@ 2013-02-04 19:17                         ` Alexander Holler
  2013-02-04 19:25                           ` Greg KH
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-02-04 19:17 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

Am 04.02.2013 13:05, schrieb Alexander Holler:
> Am 04.02.2013 02:14, schrieb Greg KH:
>
>> So you are right in that your driver will wait for forever for a
>> disconnect() to happen, as it will never be called.  I don't understand
>> the problem that this is causing when it happens.  What's wrong with
>> udlfb that having the cpu suddently reset as the powerdown happened
>> without it knowing about it?
>
> There is nothing wrong with that. I've just explained why a problem
> doesn't occur on shutdown but on disconnect (of the device).

Maybe my explanation before was just to long and I try to explain it a 
bit shorter:

If a device gets disconnected, the disconnect in udlfb might wait 
forever in down_interruptible() (because it waits for an urb it never 
receives). This even prevents a shutdown afterwards, because that 
down_interruptible() never receives a signal (at shutdown, because 
kernel threads don't get such).

So the change from down_timeout() to down_interruptible() in 
dlfb_free_urb_list() with commit 
33077b8d3042e01da61924973e372abe589ba297 only results in that the 
following code (thus the break there) will never be reached if an urb 
got missed (because of a disconnect).

And the accompanying comment (... at shutdown) is misleading, because on 
shutdown, the kernel thread which calls dlfb_free_urb_list() never gets 
a signal, so the interruption just never happens.

As I've experienced the "missing urb on disconnect" problem quiet often, 
I've changed that down_interruptible() to down_timeout() (in v1 and in 
v2 to down_timeout_interruptible, because I wasn't aware that no signal 
arrives on shutdown).

Hmm, ok, that explanation isn't much shorter. ;)

Regards,

Alexander

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-04 19:17                         ` Alexander Holler
@ 2013-02-04 19:25                           ` Greg KH
  2013-02-05  7:08                             ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Greg KH @ 2013-02-04 19:25 UTC (permalink / raw)
  To: Alexander Holler
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

On Mon, Feb 04, 2013 at 08:17:04PM +0100, Alexander Holler wrote:
> Am 04.02.2013 13:05, schrieb Alexander Holler:
> >Am 04.02.2013 02:14, schrieb Greg KH:
> >
> >>So you are right in that your driver will wait for forever for a
> >>disconnect() to happen, as it will never be called.  I don't understand
> >>the problem that this is causing when it happens.  What's wrong with
> >>udlfb that having the cpu suddently reset as the powerdown happened
> >>without it knowing about it?
> >
> >There is nothing wrong with that. I've just explained why a problem
> >doesn't occur on shutdown but on disconnect (of the device).
> 
> Maybe my explanation before was just to long and I try to explain it
> a bit shorter:
> 
> If a device gets disconnected, the disconnect in udlfb might wait
> forever in down_interruptible() (because it waits for an urb it
> never receives). This even prevents a shutdown afterwards, because
> that down_interruptible() never receives a signal (at shutdown,
> because kernel threads don't get such).

Where was that urb when the disconnect happened?  The USB core should
call your urb callback for any outstanding urbs at that point in time,
with the proper error flag being set, are you handling that properly?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-04 19:25                           ` Greg KH
@ 2013-02-05  7:08                             ` Alexander Holler
  2013-02-05 17:22                               ` Greg KH
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-02-05  7:08 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

Am 04.02.2013 20:25, schrieb Greg KH:
> On Mon, Feb 04, 2013 at 08:17:04PM +0100, Alexander Holler wrote:
>> Am 04.02.2013 13:05, schrieb Alexander Holler:
>>> Am 04.02.2013 02:14, schrieb Greg KH:
>>>
>>>> So you are right in that your driver will wait for forever for a
>>>> disconnect() to happen, as it will never be called.  I don't understand
>>>> the problem that this is causing when it happens.  What's wrong with
>>>> udlfb that having the cpu suddently reset as the powerdown happened
>>>> without it knowing about it?
>>>
>>> There is nothing wrong with that. I've just explained why a problem
>>> doesn't occur on shutdown but on disconnect (of the device).
>>
>> Maybe my explanation before was just to long and I try to explain it
>> a bit shorter:
>>
>> If a device gets disconnected, the disconnect in udlfb might wait
>> forever in down_interruptible() (because it waits for an urb it
>> never receives). This even prevents a shutdown afterwards, because
>> that down_interruptible() never receives a signal (at shutdown,
>> because kernel threads don't get such).
> 
> Where was that urb when the disconnect happened?  The USB core should
> call your urb callback for any outstanding urbs at that point in time,
> with the proper error flag being set, are you handling that properly?

I don't know where that urb is as I don't handle it. I just know that
_interruptible doesn't make any sense and _timeout is necessary here.
But as nobody else seems to have a problem, nobody else see seems to see
the problem there and I seem to be unable to explain it, just ignore
that patch.

Regards,

Alexander


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-05  7:08                             ` Alexander Holler
@ 2013-02-05 17:22                               ` Greg KH
  2013-02-05 17:36                                 ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Greg KH @ 2013-02-05 17:22 UTC (permalink / raw)
  To: Alexander Holler
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

On Tue, Feb 05, 2013 at 08:08:28AM +0100, Alexander Holler wrote:
> Am 04.02.2013 20:25, schrieb Greg KH:
> > On Mon, Feb 04, 2013 at 08:17:04PM +0100, Alexander Holler wrote:
> >> Am 04.02.2013 13:05, schrieb Alexander Holler:
> >>> Am 04.02.2013 02:14, schrieb Greg KH:
> >>>
> >>>> So you are right in that your driver will wait for forever for a
> >>>> disconnect() to happen, as it will never be called.  I don't understand
> >>>> the problem that this is causing when it happens.  What's wrong with
> >>>> udlfb that having the cpu suddently reset as the powerdown happened
> >>>> without it knowing about it?
> >>>
> >>> There is nothing wrong with that. I've just explained why a problem
> >>> doesn't occur on shutdown but on disconnect (of the device).
> >>
> >> Maybe my explanation before was just to long and I try to explain it
> >> a bit shorter:
> >>
> >> If a device gets disconnected, the disconnect in udlfb might wait
> >> forever in down_interruptible() (because it waits for an urb it
> >> never receives). This even prevents a shutdown afterwards, because
> >> that down_interruptible() never receives a signal (at shutdown,
> >> because kernel threads don't get such).
> > 
> > Where was that urb when the disconnect happened?  The USB core should
> > call your urb callback for any outstanding urbs at that point in time,
> > with the proper error flag being set, are you handling that properly?
> 
> I don't know where that urb is as I don't handle it.

What do you mean by that?  The urb is being sent back to your driver,
right?  If not, that's a bug, but please be sure that your urb callback
isn't really being called.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-05 17:22                               ` Greg KH
@ 2013-02-05 17:36                                 ` Alexander Holler
  2013-02-08  4:07                                   ` Dave Airlie
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Holler @ 2013-02-05 17:36 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

Am 05.02.2013 18:22, schrieb Greg KH:
> On Tue, Feb 05, 2013 at 08:08:28AM +0100, Alexander Holler wrote:
>> Am 04.02.2013 20:25, schrieb Greg KH:

>>> Where was that urb when the disconnect happened?  The USB core should
>>> call your urb callback for any outstanding urbs at that point in time,
>>> with the proper error flag being set, are you handling that properly?
>>
>> I don't know where that urb is as I don't handle it.
>
> What do you mean by that?  The urb is being sent back to your driver,
> right?  If not, that's a bug, but please be sure that your urb callback
> isn't really being called.

I meant it isn't _my_ driver. ;)

I'm just trying to add some würgarounds without having the need to 
rewrite the whole driver.

In regard to that "urb missing problem", I think I've just named it 
wrong and the actual problem is a race-condition between the semaphore 
handling (which is used to keep track of the urbs) and the urb handling 
inside the driver.

But I've just switched to udl (instead of udlfb) and will see if I can 
fix the bugs there to make it usable as a console. udl is a rewrite of 
udlfb with some additional features (e.g. drm), so hopefully fixing the 
remaining problems there will require less work.

Regards,

Alexander


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-05 17:36                                 ` Alexander Holler
@ 2013-02-08  4:07                                   ` Dave Airlie
  2013-02-08  9:53                                     ` Alexander Holler
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Airlie @ 2013-02-08  4:07 UTC (permalink / raw)
  To: Alexander Holler
  Cc: Greg KH, Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

>
> But I've just switched to udl (instead of udlfb) and will see if I can fix
> the bugs there to make it usable as a console. udl is a rewrite of udlfb
> with some additional features (e.g. drm), so hopefully fixing the remaining
> problems there will require less work.

I may have fixed the major udl problem with being a console, patch was
posted to dri-devel yesterday, and I've pushed it into -next.

Dave.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect
  2013-02-08  4:07                                   ` Dave Airlie
@ 2013-02-08  9:53                                     ` Alexander Holler
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Holler @ 2013-02-08  9:53 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Greg KH, Andrew Morton, linux-kernel, linux-fbdev,
	Florian Tobias Schandinat, Bernie Thompson, Steve Glendinning,
	stable

Am 08.02.2013 05:07, schrieb Dave Airlie:
>>
>> But I've just switched to udl (instead of udlfb) and will see if I can fix
>> the bugs there to make it usable as a console. udl is a rewrite of udlfb
>> with some additional features (e.g. drm), so hopefully fixing the remaining
>> problems there will require less work.
>
> I may have fixed the major udl problem with being a console, patch was
> posted to dri-devel yesterday, and I've pushed it into -next.

Thanks a lot. I will check it.

Regards,

Alexander


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2013-02-08  9:54 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-12 13:20 [PATCH] fb: udlfb: fix hang at disconnect Alexander Holler
2013-01-12 22:22 ` Bernie Thompson
2013-01-13 12:05   ` Alexander Holler
2013-01-13 12:24     ` Alexander Holler
2013-01-25 18:49     ` [PATCH 1/3] semaphore: introduce down_timeout_killable() Alexander Holler
2013-01-25 18:49       ` [PATCH 2/3 v2] fb: udlfb: fix hang at disconnect Alexander Holler
2013-01-29  0:22         ` Andrew Morton
2013-01-29  0:56           ` Alexander Holler
2013-01-29 10:35             ` Alexander Holler
2013-01-29 11:11               ` Alexander Holler
2013-01-29 15:51                 ` Alexander Holler
2013-01-29 20:35                   ` Alexander Holler
2013-01-29 20:56                     ` Alexander Holler
2013-02-04  1:14                     ` Greg KH
2013-02-04 12:05                       ` Alexander Holler
2013-02-04 19:17                         ` Alexander Holler
2013-02-04 19:25                           ` Greg KH
2013-02-05  7:08                             ` Alexander Holler
2013-02-05 17:22                               ` Greg KH
2013-02-05 17:36                                 ` Alexander Holler
2013-02-08  4:07                                   ` Dave Airlie
2013-02-08  9:53                                     ` Alexander Holler
2013-01-25 18:49       ` [PATCH 3/3] fb: smscufx: " Alexander Holler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).