* [PATCH] fbcon: fix race condition between console lock and cursor timer
@ 2012-08-21  6:40 ` Dave Airlie
  0 siblings, 0 replies; 23+ messages in thread
From: Dave Airlie @ 2012-08-21  6:40 UTC (permalink / raw)
  To: linux-fbdev
  Cc: dri-devel, linux-kernel, Linus, Alan Cox, Randy Dunlap,
	Josh Boyer, Dave Airlie

So we've had a fair few reports of fbcon handover breakage between
efi/vesafb and i915 surface recently, so I dedicated a couple of
days to finding the problem.

Essentially the last thing we saw was the conflicting framebuffer
message and that was all.

So after much tracing with direct netconsole writes (printks
under console_lock not so useful), I think I found the race.

Thread A (driver load)    Thread B (timer thread)
  unbind_con_driver ->              |
  bind_con_driver ->                |
  vc->vc_sw->con_deinit ->          |
  fbcon_deinit ->                   |
  console_lock()                    |
      |                             |
      |                       fbcon_flashcursor timer fires
      |                       console_lock() <- blocked for A
      |
      |
fbcon_del_cursor_timer ->
  del_timer_sync
  (BOOM)

Of course, because all of this happens under the console lock, we never
see anything; and since we have also just unbound the active console,
guess what, we never see anything.

Hopefully this fixes the problem for anyone seeing breakage on
vesafb->kms driver handoff.

Signed-off-by: David Airlie <airlied@redhat.com>
---
 drivers/video/console/fbcon.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 2e471c2..f8a79fc 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -372,8 +372,12 @@ static void fb_flashcursor(struct work_struct *work)
 	struct vc_data *vc = NULL;
 	int c;
 	int mode;
+	int ret;
+
+	ret = console_trylock();
+	if (ret == 0)
+		return;
 
-	console_lock();
 	if (ops && ops->currcon != -1)
 		vc = vc_cons[ops->currcon].d;
 
-- 
1.7.10.2
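
The race in the diagram above can be modelled in userspace. What follows is
only a rough sketch under stated assumptions, not kernel code: a pthread
mutex stands in for the console lock, a short-lived thread stands in for the
cursor callback, and pthread_join() stands in for del_timer_sync() waiting
for that callback to finish. The names console_mutex, cursor_blinked,
flashcursor_trylock and unbind_while_callback_fires are made up for
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel console lock (hypothetical userspace model). */
static pthread_mutex_t console_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Set by the callback only when it got the lock and did its work. */
static bool cursor_blinked;

/*
 * Models fb_flashcursor() after the patch: if the lock is busy, skip this
 * blink instead of sleeping on the lock. Note the inverted convention:
 * pthread_mutex_trylock() returns 0 on success, whereas the kernel's
 * console_trylock() returns nonzero on success.
 */
static void *flashcursor_trylock(void *unused)
{
	(void)unused;
	if (pthread_mutex_trylock(&console_mutex) != 0)
		return NULL;		/* lock held elsewhere: give up */
	cursor_blinked = true;
	pthread_mutex_unlock(&console_mutex);
	return NULL;
}

/*
 * Models thread A: take the lock, let the callback fire, then wait for the
 * callback to finish while still holding the lock. Returns true if the wait
 * completed and the callback skipped its work.
 */
static bool unbind_while_callback_fires(void)
{
	pthread_t cb;
	int rc;

	cursor_blinked = false;
	pthread_mutex_lock(&console_mutex);
	if (pthread_create(&cb, NULL, flashcursor_trylock, NULL) != 0) {
		pthread_mutex_unlock(&console_mutex);
		return false;
	}
	/* If the callback used a blocking lock, this join would never return. */
	rc = pthread_join(cb, NULL);
	assert(rc == 0);
	pthread_mutex_unlock(&console_mutex);
	return !cursor_blinked;
}
```

Calling unbind_while_callback_fires() from a small main() (built with
-pthread) returns instead of hanging; swapping the trylock for
pthread_mutex_lock() in the callback reproduces the hang in this model. The
cost of the trylock approach is a skipped cursor blink whenever the lock is
contended.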


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] fbcon: fix race condition between console lock and cursor timer
  2012-08-21  6:40 ` Dave Airlie
@ 2012-08-21  9:15   ` Alan Cox
  -1 siblings, 0 replies; 23+ messages in thread
From: Alan Cox @ 2012-08-21  9:15 UTC (permalink / raw)
  To: Dave Airlie
  Cc: linux-fbdev, dri-devel, linux-kernel, Linus, Randy Dunlap, Josh Boyer

> So after much tracing with direct netconsole writes (printks
> under console_lock not so useful), I think I found the race.

Direct netconsole write would be a useful patch to have in mainline, I think
8)

> Hopefully this fixes the problem for anyone seeing vesafb->kms
> driver handoff.

Not really the proper fix, but it's clear and is probably the best thing to
go in initially with a cc: stable. Can you at least stick a large

+ /* FIXME: we should sort out the unbind locking instead */

on the patch however.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fbcon: fix race condition between console lock and cursor timer
  2012-08-21  6:40 ` Dave Airlie
@ 2012-08-21 12:14   ` Peter Zijlstra
  -1 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-08-21 12:14 UTC (permalink / raw)
  To: Dave Airlie
  Cc: linux-fbdev, dri-devel, linux-kernel, Linus, Alan Cox,
	Randy Dunlap, Josh Boyer

On Tue, 2012-08-21 at 16:40 +1000, Dave Airlie wrote:
> So after much tracing with direct netconsole writes (printks
> under console_lock not so useful) 

I always use earlyprintk on serial...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fbcon: fix race condition between console lock and cursor timer
  2012-08-21  9:15   ` Alan Cox
@ 2012-08-21 12:15     ` Peter Zijlstra
  -1 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-08-21 12:15 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dave Airlie, linux-fbdev, dri-devel, linux-kernel, Linus,
	Randy Dunlap, Josh Boyer

On Tue, 2012-08-21 at 10:15 +0100, Alan Cox wrote:
> > So after much tracing with direct netconsole writes (printks
> > under console_lock not so useful), I think I found the race.
> 
> Direct netconsole write would be a useful patch to have mainline I think
> 8) 

Could we make that use the earlyprintk infrastructure?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fbcon: fix race condition between console lock and cursor timer
  2012-08-21  6:40 ` Dave Airlie
@ 2012-08-21 13:19   ` Josh Boyer
  -1 siblings, 0 replies; 23+ messages in thread
From: Josh Boyer @ 2012-08-21 13:19 UTC (permalink / raw)
  To: Dave Airlie
  Cc: linux-fbdev, dri-devel, linux-kernel, Linus, Alan Cox,
	Randy Dunlap, kernel-team

On Tue, Aug 21, 2012 at 2:40 AM, Dave Airlie <airlied@redhat.com> wrote:
> So we've had a fair few reports of fbcon handover breakage between
> efi/vesafb and i915 surface recently, so I dedicated a couple of
> days to finding the problem.
>
> Essentially the last thing we saw was the conflicting framebuffer
> message and that was all.
>
> So after much tracing with direct netconsole writes (printks
> under console_lock not so useful), I think I found the race.
>
> Thread A (driver load)    Thread B (timer thread)
>   unbind_con_driver ->              |
>   bind_con_driver ->                |
>   vc->vc_sw->con_deinit ->          |
>   fbcon_deinit ->                   |
>   console_lock()                    |
>       |                             |
>       |                       fbcon_flashcursor timer fires
>       |                       console_lock() <- blocked for A
>       |
>       |
> fbcon_del_cursor_timer ->
>   del_timer_sync
>   (BOOM)
>
> Of course because all of this is under the console lock,
> we never see anything, also since we also just unbound the active
> console guess what we never see anything.
>
> Hopefully this fixes the problem for anyone seeing vesafb->kms
> driver handoff.
>
> Signed-off-by: David Airlie <airlied@redhat.com>
> ---
>  drivers/video/console/fbcon.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 2e471c2..f8a79fc 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -372,8 +372,12 @@ static void fb_flashcursor(struct work_struct *work)
>         struct vc_data *vc = NULL;
>         int c;
>         int mode;
> +       int ret;
> +
> +       ret = console_trylock();
> +       if (ret == 0)
> +               return;
>
> -       console_lock();
>         if (ops && ops->currcon != -1)
>                 vc = vc_cons[ops->currcon].d;
>

I have a Dell XPS 8300 machine with a Radeon card in it that started
showing this problem yesterday with 3.6-rc2 kernels.  I tested this
patch on top of v3.6-rc2-206-g10c63c9 this morning and the problem
seems to have been cleared up for me.  That includes making sure the
grub2 file has the gfxterm set, etc.

I know we've been seeing this quite a bit more on Fedora 17, so we'll
want to have some people test a 3.5 build with it but things are
looking better.

josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] fbcon: fix race condition between console lock and cursor timer
  2012-08-21  9:15   ` Alan Cox
  (?)
@ 2012-08-22  4:00     ` Dave Airlie
  -1 siblings, 0 replies; 23+ messages in thread
From: Dave Airlie @ 2012-08-22  4:00 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dave Airlie, linux-fbdev, dri-devel, linux-kernel, Linus,
	Randy Dunlap, Josh Boyer

On Tue, Aug 21, 2012 at 7:15 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> So after much tracing with direct netconsole writes (printks
>> under console_lock not so useful), I think I found the race.
>
> Direct netconsole write would be a useful patch to have mainline I think
> 8)

Well, I used a one-line wrapper around the netconsole write_msg, which just
passed NULL as the first arg, then sprinkled netconsole_write_msg around the
place. Not having printf-style formatting could be an annoyance for some
people; for this it didn't matter.
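
For anyone wanting to try the same trick, the wrapper described here might
look roughly like the sketch below. It is a userspace mock-up, not the code
that was actually used: write_msg() is a stub that appends to a buffer
instead of sending UDP packets, its signature is assumed to match the static
function in drivers/net/netconsole.c, and the capture buffer names sent and
sent_len are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct console;			/* opaque here; only ever passed as NULL */

/* Capture buffer so the stub's output can be inspected (illustrative). */
static char sent[256];
static size_t sent_len;

/*
 * Stub with the assumed shape of netconsole's write_msg(). The real
 * function transmits the bytes over the network; this one stores them.
 */
static void write_msg(struct console *con, const char *msg, unsigned int len)
{
	size_t room = sizeof(sent) - 1 - sent_len;

	(void)con;
	if (len > room)
		len = (unsigned int)room;	/* truncate rather than overflow */
	memcpy(sent + sent_len, msg, len);
	sent_len += len;
	sent[sent_len] = '\0';
}

/* The one-line wrapper: fixed string in, no printf-style formatting. */
static void netconsole_write_msg(const char *msg)
{
	assert(msg != NULL);
	write_msg(NULL, msg, (unsigned int)strlen(msg));
}
```

In the kernel the wrapper would have to live next to write_msg() and be made
visible to fbcon, since write_msg() is static; that plumbing is left out
here. A call such as netconsole_write_msg("fbcon_deinit\n") at each point of
interest gives a trace that does not depend on the console lock.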

Peter I wish I had a serial port to work with :-)

>
> Not really the proper fix but its clear and is probably the best thing to
> go in initially with a cc: stable. Can you at least stick a large
>
> + /* FIXME: we should sort out the unbind locking instead */

Done, and cc'd stable. I'll send this to Linus via my tree as it's fairly
urgent from my pov.

Dave.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] fbcon: fix race condition between console lock and cursor timer
@ 2012-08-21  6:40 Dave Airlie
  0 siblings, 0 replies; 23+ messages in thread
From: Dave Airlie @ 2012-08-21  6:40 UTC (permalink / raw)
  To: linux-fbdev
  Cc: linux-kernel, Randy Dunlap, dri-devel, Dave Airlie, Josh Boyer,
	Linus, Alan Cox

So we've had a fair few reports of fbcon handover breakage between
efi/vesafb and i915 surface recently, so I dedicated a couple of
days to finding the problem.

Essentially the last thing we saw was the conflicting framebuffer
message and that was all.

So after much tracing with direct netconsole writes (printks
under console_lock not so useful), I think I found the race.

Thread A (driver load)    Thread B (timer thread)
  unbind_con_driver ->              |
  bind_con_driver ->                |
  vc->vc_sw->con_deinit ->          |
  fbcon_deinit ->                   |
  console_lock()                    |
      |                             |
      |                       fb_flashcursor timer fires
      |                       console_lock() <- blocked for A
      |
      |
fbcon_del_cursor_timer ->
  del_timer_sync
  (BOOM)

Of course, because all of this happens under the console lock, we never
see anything; and since we have also just unbound the active console,
guess what, we still never see anything.

Hopefully this fixes the problem for anyone seeing vesafb->kms
driver handoff.

Signed-off-by: David Airlie <airlied@redhat.com>
---
 drivers/video/console/fbcon.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 2e471c2..f8a79fc 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -372,8 +372,12 @@ static void fb_flashcursor(struct work_struct *work)
 	struct vc_data *vc = NULL;
 	int c;
 	int mode;
+	int ret;
+
+	ret = console_trylock();
+	if (ret == 0)
+		return;
 
-	console_lock();
 	if (ops && ops->currcon != -1)
 		vc = vc_cons[ops->currcon].d;
 
-- 
1.7.10.2
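
The locking change in the diff above can be modelled in userspace as a rough
sketch. This is not kernel code: console_mutex, cursor_visible and
flash_cursor_tick() are hypothetical names, and a pthread mutex stands in for
the console lock. The point it illustrates is that the periodic callback no
longer blocks on a lock that the teardown path holds while it waits for that
same callback to finish.

```c
#include <pthread.h>

/* Stand-in for the console lock (hypothetical name). */
static pthread_mutex_t console_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the cursor state toggled on each blink. */
static int cursor_visible;

/*
 * Models the patched cursor callback. Returns 1 if the cursor was
 * toggled, 0 if the lock was busy and this tick was skipped.
 *
 * Note the inverted convention: pthread_mutex_trylock() returns 0 on
 * success, whereas console_trylock() in the diff returns 0 on failure.
 */
int flash_cursor_tick(void)
{
	if (pthread_mutex_trylock(&console_mutex) != 0)
		return 0; /* someone else holds the lock: give up, do not wait */

	cursor_visible = !cursor_visible;
	pthread_mutex_unlock(&console_mutex);
	return 1;
}
```

Skipping a tick only costs one missed cursor blink, which is why giving up is
acceptable here. As noted in the review, this works around the problem rather
than sorting out the unbind locking itself.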


^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-08-22  4:00 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-21  6:40 [PATCH] fbcon: fix race condition between console lock and cursor timer Dave Airlie
2012-08-21  6:40 ` Dave Airlie
2012-08-21  9:15 ` Alan Cox
2012-08-21  9:15 ` Alan Cox
2012-08-21  9:15 ` Alan Cox
2012-08-21  9:15   ` Alan Cox
2012-08-21 12:15   ` Peter Zijlstra
2012-08-21 12:15   ` Peter Zijlstra
2012-08-21 12:15   ` Peter Zijlstra
2012-08-21 12:15     ` Peter Zijlstra
2012-08-22  4:00   ` Dave Airlie
2012-08-22  4:00     ` Dave Airlie
2012-08-22  4:00     ` Dave Airlie
2012-08-21 12:14 ` Peter Zijlstra
2012-08-21 12:14 ` Peter Zijlstra
2012-08-21 12:14   ` Peter Zijlstra
2012-08-21 12:14 ` Peter Zijlstra
2012-08-21 13:19 ` Josh Boyer
2012-08-21 13:19 ` Josh Boyer
2012-08-21 13:19   ` Josh Boyer
2012-08-21 13:19 ` Josh Boyer
  -- strict thread matches above, loose matches on Subject: below --
2012-08-21  6:40 Dave Airlie
2012-08-21  6:40 Dave Airlie
