linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
@ 2016-05-17 18:41 David Daney
  2016-05-17 20:49 ` Pavel Machek
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: David Daney @ 2016-05-17 18:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley, Ming Lei,
	Dann Frazier, Scot Doyle, David Airlie, dri-devel,
	Radha.Chintakuntla, Pavel Machek
  Cc: linux-kernel, David Daney, stable

From: David Daney <david.daney@cavium.com>

We are getting somewhat random soft lockups with this signature:

[   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
[   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
[   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
[   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
[   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
[   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
[   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
[   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190

This is caused by the vt visual_init() function calling into
fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
transient condition, as it is later set to a non-zero value.  But, if
the timer happens to expire while the blink rate is zero, it goes into
an endless loop, and we get soft lockup.

The fix is to initialize vc_cur_blink_ms before calling the con_init()
function.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: stable@vger.kernel.org
---
 drivers/tty/vt/vt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 3e3c757..eef5c36 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -750,6 +750,7 @@ static void visual_init(struct vc_data *vc, int num, int init)
 	vc->vc_complement_mask = 0;
 	vc->vc_can_do_color = 0;
 	vc->vc_panic_force_write = false;
+	vc->vc_cur_blink_ms = DEFAULT_CURSOR_BLINK_MS;
 	vc->vc_sw->con_init(vc, init);
 	if (!vc->vc_complement_mask)
 		vc->vc_complement_mask = vc->vc_can_do_color ? 0x7700 : 0x0800;
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-17 18:41 [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer David Daney
@ 2016-05-17 20:49 ` Pavel Machek
  2016-05-18  0:42   ` Ming Lei
  2016-05-19 22:35 ` Scot Doyle
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Pavel Machek @ 2016-05-17 20:49 UTC (permalink / raw)
  To: David Daney
  Cc: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley, Ming Lei,
	Dann Frazier, Scot Doyle, David Airlie, dri-devel,
	Radha.Chintakuntla, linux-kernel, David Daney, stable

On Tue 2016-05-17 11:41:04, David Daney wrote:
> From: David Daney <david.daney@cavium.com>
> 
> We are getting somewhat random soft lockups with this signature:
> 
> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> 
> This is caused by the vt visual_init() function calling into
> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> transient condition, as it is later set to a non-zero value.  But, if
> the timer happens to expire while the blink rate is zero, it goes into
> an endless loop, and we get soft lockup.
> 
> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> function.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>
> Cc: stable@vger.kernel.org

Acked-by: Pavel Machek <pavel@ucw.cz>

(And it is amazing how many problems configurable blink speed caused).

Thanks!
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-17 20:49 ` Pavel Machek
@ 2016-05-18  0:42   ` Ming Lei
  2016-05-18 20:24     ` Scot Doyle
  0 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2016-05-18  0:42 UTC (permalink / raw)
  To: Pavel Machek
  Cc: David Daney, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley,
	Dann Frazier, Scot Doyle, David Airlie, dri-devel, Chintakuntla,
	Radha, Linux Kernel Mailing List, David Daney, stable

On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@ucw.cz> wrote:
> On Tue 2016-05-17 11:41:04, David Daney wrote:
>> From: David Daney <david.daney@cavium.com>
>>
>> We are getting somewhat random soft lockups with this signature:
>>
>> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
>> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
>> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
>> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
>> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
>> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
>> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
>> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
>>
>> This is caused by the vt visual_init() function calling into
>> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
>> transient condition, as it is later set to a non-zero value.  But, if
>> the timer happens to expire while the blink rate is zero, it goes into
>> an endless loop, and we get soft lockup.
>>
>> The fix is to initialize vc_cur_blink_ms before calling the con_init()
>> function.
>>
>> Signed-off-by: David Daney <david.daney@cavium.com>
>> Cc: stable@vger.kernel.org
>
> Acked-by: Pavel Machek <pavel@ucw.cz>

Tested-by: Ming Lei <ming.lei@canonical.com>

Thanks David and Pavel for making it work!

>
> (And it is amazing how many problems configurable blink speed caused).
>
> Thanks!
>                                                                 Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-18  0:42   ` Ming Lei
@ 2016-05-18 20:24     ` Scot Doyle
  2016-05-18 20:38       ` David Daney
  2016-05-19  0:27       ` [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer Ming Lei
  0 siblings, 2 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-18 20:24 UTC (permalink / raw)
  To: David Daney, Ming Lei, Dann Frazier
  Cc: Pavel Machek, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley,
	David Airlie, dri-devel, Chintakuntla, Radha, David Daney,
	Linux Kernel Mailing List, stable

On Wed, 18 May 2016, Ming Lei wrote:
> On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@ucw.cz> wrote:
> > On Tue 2016-05-17 11:41:04, David Daney wrote:
> >> From: David Daney <david.daney@cavium.com>
> >>
> >> We are getting somewhat random soft lockups with this signature:
> >>
> >> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> >> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> >> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> >> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> >> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> >> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> >> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> >> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> >>
> >> This is caused by the vt visual_init() function calling into
> >> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> >> transient condition, as it is later set to a non-zero value.  But, if
> >> the timer happens to expire while the blink rate is zero, it goes into
> >> an endless loop, and we get soft lockup.
> >>
> >> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> >> function.
> >>
> >> Signed-off-by: David Daney <david.daney@cavium.com>
> >> Cc: stable@vger.kernel.org
> >
> > Acked-by: Pavel Machek <pavel@ucw.cz>
> 
> Tested-by: Ming Lei <ming.lei@canonical.com>
> 
> Thanks David and Pavel for making it work!
> 
> >
> > (And it is amazing how many problems configurable blink speed caused).
> >
> > Thanks!
> >                                                                 Pavel
> >


Dann, Ming and David, thank you so much for all of your effort.

There were three other reports in the past year, each leading to their own 
patch, of boot lockups occuring when the cursor flash timer was set using 
an ops->cur_blink_jiffies value of 0.  I plan to propose a patch within 
the next day that will prevent this for all code paths.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-18 20:24     ` Scot Doyle
@ 2016-05-18 20:38       ` David Daney
  2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
  2016-05-19  0:27       ` [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer Ming Lei
  1 sibling, 1 reply; 30+ messages in thread
From: David Daney @ 2016-05-18 20:38 UTC (permalink / raw)
  To: Scot Doyle, David Daney, Ming Lei, Dann Frazier
  Cc: Pavel Machek, Greg Kroah-Hartman, Jiri Slaby, Peter Hurley,
	David Airlie, dri-devel, Chintakuntla, Radha, David Daney,
	Linux Kernel Mailing List, stable



On 05/18/2016 01:24 PM, Scot Doyle wrote:
> On Wed, 18 May 2016, Ming Lei wrote:
>> On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@ucw.cz> wrote:
>>> On Tue 2016-05-17 11:41:04, David Daney wrote:
>>>> From: David Daney <david.daney@cavium.com>
>>>>
>>>> We are getting somewhat random soft lockups with this signature:
>>>>
>>>> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
>>>> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
>>>> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
>>>> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
>>>> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
>>>> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
>>>> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
>>>> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
>>>>
>>>> This is caused by the vt visual_init() function calling into
>>>> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
>>>> transient condition, as it is later set to a non-zero value.  But, if
>>>> the timer happens to expire while the blink rate is zero, it goes into
>>>> an endless loop, and we get soft lockup.
>>>>
>>>> The fix is to initialize vc_cur_blink_ms before calling the con_init()
>>>> function.
>>>>
>>>> Signed-off-by: David Daney <david.daney@cavium.com>
>>>> Cc: stable@vger.kernel.org
>>>
>>> Acked-by: Pavel Machek <pavel@ucw.cz>
>>
>> Tested-by: Ming Lei <ming.lei@canonical.com>
>>
>> Thanks David and Pavel for making it work!
>>
>>>
>>> (And it is amazing how many problems configurable blink speed caused).
>>>
>>> Thanks!
>>>                                                                 Pavel
>>>
>
>
> Dann, Ming and David, thank you so much for all of your effort.
>
> There were three other reports in the past year, each leading to their own
> patch,

Why were none of the patches merged?  Were they obviously incorrect, or 
is there some other problem that blocks the resolution of this issue?

> of boot lockups occuring when the cursor flash timer was set using
> an ops->cur_blink_jiffies value of 0.  I plan to propose a patch within
> the next day that will prevent this for all code paths.
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-18 20:24     ` Scot Doyle
  2016-05-18 20:38       ` David Daney
@ 2016-05-19  0:27       ` Ming Lei
  2016-05-19  7:08         ` Pavel Machek
  1 sibling, 1 reply; 30+ messages in thread
From: Ming Lei @ 2016-05-19  0:27 UTC (permalink / raw)
  To: Scot Doyle
  Cc: David Daney, Dann Frazier, Pavel Machek, Greg Kroah-Hartman,
	Jiri Slaby, Peter Hurley, David Airlie, dri-devel, Chintakuntla,
	Radha, David Daney, Linux Kernel Mailing List, stable

On Thu, May 19, 2016 at 4:24 AM, Scot Doyle <lkml14@scotdoyle.com> wrote:
> On Wed, 18 May 2016, Ming Lei wrote:
>> On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@ucw.cz> wrote:
>> > On Tue 2016-05-17 11:41:04, David Daney wrote:
>> >> From: David Daney <david.daney@cavium.com>
>> >>
>> >> We are getting somewhat random soft lockups with this signature:
>> >>
>> >> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
>> >> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
>> >> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
>> >> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
>> >> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
>> >> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
>> >> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
>> >> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
>> >>
>> >> This is caused by the vt visual_init() function calling into
>> >> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
>> >> transient condition, as it is later set to a non-zero value.  But, if
>> >> the timer happens to expire while the blink rate is zero, it goes into
>> >> an endless loop, and we get soft lockup.
>> >>
>> >> The fix is to initialize vc_cur_blink_ms before calling the con_init()
>> >> function.
>> >>
>> >> Signed-off-by: David Daney <david.daney@cavium.com>
>> >> Cc: stable@vger.kernel.org
>> >
>> > Acked-by: Pavel Machek <pavel@ucw.cz>
>>
>> Tested-by: Ming Lei <ming.lei@canonical.com>
>>
>> Thanks David and Pavel for making it work!
>>
>> >
>> > (And it is amazing how many problems configurable blink speed caused).
>> >
>> > Thanks!
>> >                                                                 Pavel
>> >
>
>
> Dann, Ming and David, thank you so much for all of your effort.
>
> There were three other reports in the past year, each leading to their own
> patch, of boot lockups occuring when the cursor flash timer was set using
> an ops->cur_blink_jiffies value of 0.  I plan to propose a patch within
> the next day that will prevent this for all code paths.

Given this issue caues system unusable, I suggest to merge David's
oneline patch first, then you can think and try to figure out 'perfect' solution
for addressing all this kind of reports from last year.

Does it make sense?


Thanks,
Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-18 20:38       ` David Daney
@ 2016-05-19  4:21         ` Scot Doyle
  2016-05-19  7:25           ` Jeremy Kerr
                             ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-19  4:21 UTC (permalink / raw)
  To: Tomi Valkeinen, Jean-Christophe Plagniol-Villard
  Cc: David Daney, Ming Lei, Dann Frazier, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, lkml14, dri-devel, linux-fbdev,
	linux-kernel, stable

Two current [1] and three previous [2] systems locked during boot
because the cursor flash timer was set using an ops->cur_blink_jiffies
value of 0. Previous patches attempted to solve the problem by moving
variable initialization earlier in the setup sequence [2].

Use the normal cursor blink default interval of 200 ms if
ops->cur_blink_jiffies is not in the range specified in commit
bd63364caa8d. Since invalid values are not used, specific system
initialization timings should not cause lockups.

[1] https://bugs.launchpad.net/bugs/1574814
[2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Cc: <stable@vger.kernel.org> [v4.2]
---
 drivers/video/console/fbcon.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..da61d87 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
 	console_unlock();
 }
 
+static int cursor_blink_jiffies(int candidate)
+{
+	if (candidate >= msecs_to_jiffies(50) &&
+	    candidate <= msecs_to_jiffies(USHRT_MAX))
+		return candidate;
+	else
+		return HZ / 5;
+}
+
 static void cursor_timer_handler(unsigned long dev_addr)
 {
 	struct fb_info *info = (struct fb_info *) dev_addr;
 	struct fbcon_ops *ops = info->fbcon_par;
 
 	queue_work(system_power_efficient_wq, &info->queue);
-	mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
+	mod_timer(&ops->cursor_timer, jiffies +
+	    cursor_blink_jiffies(ops->cur_blink_jiffies));
 }
 
 static void fbcon_add_cursor_timer(struct fb_info *info)
@@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
 
 		init_timer(&ops->cursor_timer);
 		ops->cursor_timer.function = cursor_timer_handler;
-		ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
+		ops->cursor_timer.expires = jiffies +
+		    cursor_blink_jiffies(ops->cur_blink_jiffies);
 		ops->cursor_timer.data = (unsigned long ) info;
 		add_timer(&ops->cursor_timer);
 		ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
@@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, struct fb_info *info,
 	}
 
 	if (!err) {
-		ops->cur_blink_jiffies = HZ / 5;
 		info->fbcon_par = ops;
 
 		if (vc)
@@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
 	ops->currcon = -1;
 	ops->graphics = 1;
 	ops->cur_rotate = -1;
-	ops->cur_blink_jiffies = HZ / 5;
 	info->fbcon_par = ops;
 	p->con_rotate = initial_rotation;
 	set_blitting_type(vc, info);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-19  0:27       ` [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer Ming Lei
@ 2016-05-19  7:08         ` Pavel Machek
  0 siblings, 0 replies; 30+ messages in thread
From: Pavel Machek @ 2016-05-19  7:08 UTC (permalink / raw)
  To: Ming Lei
  Cc: Scot Doyle, David Daney, Dann Frazier, Greg Kroah-Hartman,
	Jiri Slaby, Peter Hurley, David Airlie, dri-devel, Chintakuntla,
	Radha, David Daney, Linux Kernel Mailing List, stable

On Thu 2016-05-19 08:27:37, Ming Lei wrote:
> On Thu, May 19, 2016 at 4:24 AM, Scot Doyle <lkml14@scotdoyle.com> wrote:
> > On Wed, 18 May 2016, Ming Lei wrote:
> >> On Wed, May 18, 2016 at 4:49 AM, Pavel Machek <pavel@ucw.cz> wrote:
> >> > On Tue 2016-05-17 11:41:04, David Daney wrote:
> >> >> From: David Daney <david.daney@cavium.com>
> >> >>
> >> >> We are getting somewhat random soft lockups with this signature:
> >> >>
> >> >> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> >> >> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> >> >> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> >> >> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> >> >> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> >> >> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> >> >> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> >> >> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> >> >>
> >> >> This is caused by the vt visual_init() function calling into
> >> >> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> >> >> transient condition, as it is later set to a non-zero value.  But, if
> >> >> the timer happens to expire while the blink rate is zero, it goes into
> >> >> an endless loop, and we get soft lockup.
> >> >>
> >> >> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> >> >> function.
> >> >>
> >> >> Signed-off-by: David Daney <david.daney@cavium.com>
> >> >> Cc: stable@vger.kernel.org
> >> >
> >> > Acked-by: Pavel Machek <pavel@ucw.cz>
> >>
> >> Tested-by: Ming Lei <ming.lei@canonical.com>
> >>
> >> Thanks David and Pavel for making it work!
> >>
> >> >
> >> > (And it is amazing how many problems configurable blink speed caused).
> >> >
> >> > Thanks!
> >> >                                                                 Pavel
> >> >
> >
> >
> > Dann, Ming and David, thank you so much for all of your effort.
> >
> > There were three other reports in the past year, each leading to their own
> > patch, of boot lockups occuring when the cursor flash timer was set using
> > an ops->cur_blink_jiffies value of 0.  I plan to propose a patch within
> > the next day that will prevent this for all code paths.
> 
> Given this issue caues system unusable, I suggest to merge David's
> oneline patch first, then you can think and try to figure out 'perfect' solution
> for addressing all this kind of reports from last year.

Actually, I'd merge

[PATCH] fbcon: use default if cursor blink interval is not valid

first. That one is obviously safe. Nice big overkill, but safe. Then
nicer solution can be attempted...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
@ 2016-05-19  7:25           ` Jeremy Kerr
  2016-05-19  8:29           ` Ming Lei
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: Jeremy Kerr @ 2016-05-19  7:25 UTC (permalink / raw)
  To: Scot Doyle, Tomi Valkeinen, Jean-Christophe Plagniol-Villard
  Cc: David Daney, Ming Lei, Dann Frazier, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, dri-devel, linux-fbdev, linux-kernel,
	stable

Hi Scot,

> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.

This fixes an issue we're seeing with the ast driver on OpenPOWER
machines, thanks!

Acked-by: Jeremy Kerr <jk@ozlabs.org>

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
  2016-05-19  7:25           ` Jeremy Kerr
@ 2016-05-19  8:29           ` Ming Lei
  2016-05-19  9:01           ` Pavel Machek
  2016-05-19 15:59           ` David Daney
  3 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2016-05-19  8:29 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Dann Frazier, Peter Hurley, Pavel Machek, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Greg Kroah-Hartman, Jiri Slaby, David Airlie, David Daney,
	dri-devel, linux-fbdev, Linux Kernel Mailing List, stable

On Thu, May 19, 2016 at 12:21 PM, Scot Doyle <lkml14@scotdoyle.com> wrote:
> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
>
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
>
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>
> Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
> Cc: <stable@vger.kernel.org> [v4.2]

Tested-by: Ming Lei <ming.lei@canonical.com>

> ---
>  drivers/video/console/fbcon.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..da61d87 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
>         console_unlock();
>  }
>
> +static int cursor_blink_jiffies(int candidate)
> +{
> +       if (candidate >= msecs_to_jiffies(50) &&
> +           candidate <= msecs_to_jiffies(USHRT_MAX))
> +               return candidate;
> +       else
> +               return HZ / 5;
> +}
> +
>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
>         struct fb_info *info = (struct fb_info *) dev_addr;
>         struct fbcon_ops *ops = info->fbcon_par;
>
>         queue_work(system_power_efficient_wq, &info->queue);
> -       mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
> +       mod_timer(&ops->cursor_timer, jiffies +
> +           cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>
>  static void fbcon_add_cursor_timer(struct fb_info *info)
> @@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
>
>                 init_timer(&ops->cursor_timer);
>                 ops->cursor_timer.function = cursor_timer_handler;
> -               ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
> +               ops->cursor_timer.expires = jiffies +
> +                   cursor_blink_jiffies(ops->cur_blink_jiffies);
>                 ops->cursor_timer.data = (unsigned long ) info;
>                 add_timer(&ops->cursor_timer);
>                 ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
> @@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, struct fb_info *info,
>         }
>
>         if (!err) {
> -               ops->cur_blink_jiffies = HZ / 5;
>                 info->fbcon_par = ops;
>
>                 if (vc)
> @@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
>         ops->currcon = -1;
>         ops->graphics = 1;
>         ops->cur_rotate = -1;
> -       ops->cur_blink_jiffies = HZ / 5;
>         info->fbcon_par = ops;
>         p->con_rotate = initial_rotation;
>         set_blitting_type(vc, info);
> --
> 2.1.4
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
  2016-05-19  7:25           ` Jeremy Kerr
  2016-05-19  8:29           ` Ming Lei
@ 2016-05-19  9:01           ` Pavel Machek
  2016-05-19 14:22             ` Scot Doyle
  2016-05-19 15:59           ` David Daney
  3 siblings, 1 reply; 30+ messages in thread
From: Pavel Machek @ 2016-05-19  9:01 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Peter Hurley, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Greg Kroah-Hartman, Jiri Slaby, David Airlie, ddaney.cavm,
	dri-devel, linux-fbdev, linux-kernel, stable

Hi!

> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
> 
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
> 
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Acked-by: Pavel Machek <pavel@ucw.cz>

>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
>  	struct fb_info *info = (struct fb_info *) dev_addr;
>  	struct fbcon_ops *ops = info->fbcon_par;
>  
>  	queue_work(system_power_efficient_wq, &info->queue);
> -	mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
> +	mod_timer(&ops->cursor_timer, jiffies +
> +	    cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>  
>  static void fbcon_add_cursor_timer(struct fb_info *info)

And actually... perhaps mod_timer should have some check for too low
timeouts..?

WARN_ON?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19  9:01           ` Pavel Machek
@ 2016-05-19 14:22             ` Scot Doyle
  2016-05-19 15:31               ` Ming Lei
  0 siblings, 1 reply; 30+ messages in thread
From: Scot Doyle @ 2016-05-19 14:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Jeremy Kerr, Peter Hurley, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Greg Kroah-Hartman, Jiri Slaby, David Airlie, dri-devel,
	linux-fbdev, linux-kernel

On Thu, 19 May 2016, Pavel Machek wrote:
> Hi!
> 
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> > [1] https://bugs.launchpad.net/bugs/1574814
> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
> 
> Acked-by: Pavel Machek <pavel@ucw.cz>
> 
> >  static void cursor_timer_handler(unsigned long dev_addr)
> >  {
> >  	struct fb_info *info = (struct fb_info *) dev_addr;
> >  	struct fbcon_ops *ops = info->fbcon_par;
> >  
> >  	queue_work(system_power_efficient_wq, &info->queue);
> > -	mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
> > +	mod_timer(&ops->cursor_timer, jiffies +
> > +	    cursor_blink_jiffies(ops->cur_blink_jiffies));
> >  }
> >  
> >  static void fbcon_add_cursor_timer(struct fb_info *info)
> 
> And actually... perhaps mod_timer should have some check for too low
> timeouts..?
> 
> WARN_ON?
> 									Pavel


Interesting idea. I applied this patch to a couple systems and 
receive the same warning on both:

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 73164c3..f6c0b69 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(!timer->function);
+	WARN_ONCE(expires == jiffies, "timer should expire in the future");
 
 	base = lock_timer_base(timer, &flags);
 
------

[    2.060474] ------------[ cut here ]------------
[    2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 mod_timer+0x233/0x240
[    2.062740] timer should expire in the future
[    2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
[    2.065870] Hardware name: Toshiba Leon, BIOS          12/04/2013
[    2.067828] Workqueue: events_power_efficient hub_init_func3
[    2.069762]  0000000000000000 ffff88007443bbb8 ffffffff8139932b ffff88007443bc08
[    2.071701]  0000000000000000 ffff88007443bbf8 ffffffff8112e57c 0000031700000000
[    2.073655]  ffff88007486a0b0 00000000fffea2da ffff88007486a000 0000000000000202
[    2.075594] Call Trace:
[    2.077503]  [<ffffffff8139932b>] dump_stack+0x4d/0x72
[    2.079426]  [<ffffffff8112e57c>] __warn+0xcc/0xf0
[    2.081325]  [<ffffffff8112e5ef>] warn_slowpath_fmt+0x4f/0x60
[    2.083212]  [<ffffffff813ad5e5>] ? find_next_bit+0x15/0x20
[    2.085022]  [<ffffffff8139914f>] ? cpumask_next_and+0x2f/0x40
[    2.086696]  [<ffffffff81188a93>] mod_timer+0x233/0x240
[    2.088362]  [<ffffffff815fff02>] usb_hcd_submit_urb+0x3f2/0x8c0
[    2.090026]  [<ffffffff81601dc4>] ? urb_destroy+0x24/0x30
[    2.091698]  [<ffffffff81142ba8>] ? insert_work+0x58/0xb0
[    2.093349]  [<ffffffff81602297>] usb_submit_urb+0x287/0x530
[    2.094985]  [<ffffffff815f986d>] hub_activate+0x1fd/0x5d0
[    2.096625]  [<ffffffff81150188>] ? finish_task_switch+0x78/0x1f0
[    2.098268]  [<ffffffff815f9cca>] hub_init_func3+0x1a/0x20
[    2.099908]  [<ffffffff811438e0>] process_one_work+0x140/0x3e0
[    2.101539]  [<ffffffff81143bce>] worker_thread+0x4e/0x480
[    2.103173]  [<ffffffff81143b80>] ? process_one_work+0x3e0/0x3e0
[    2.104790]  [<ffffffff81143b80>] ? process_one_work+0x3e0/0x3e0
[    2.106259]  [<ffffffff81149829>] kthread+0xc9/0xe0
[    2.107731]  [<ffffffff81856152>] ret_from_fork+0x22/0x40
[    2.109215]  [<ffffffff81149760>] ? __kthread_parkme+0x70/0x70
[    2.110704] ---[ end trace 3519886a1a990d99 ]---

mod_timer is called from over a thousand places. Should timers always 
expire in the future?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19 14:22             ` Scot Doyle
@ 2016-05-19 15:31               ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2016-05-19 15:31 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Pavel Machek, Tomi Valkeinen, Jean-Christophe Plagniol-Villard,
	David Daney, Dann Frazier, Jeremy Kerr, Peter Hurley,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, dri-devel, linux-fbdev, Linux Kernel Mailing List

On Thu, May 19, 2016 at 10:22 PM, Scot Doyle <lkml14@scotdoyle.com> wrote:
> On Thu, 19 May 2016, Pavel Machek wrote:
>> Hi!
>>
>> > Two current [1] and three previous [2] systems locked during boot
>> > because the cursor flash timer was set using an ops->cur_blink_jiffies
>> > value of 0. Previous patches attempted to solve the problem by moving
>> > variable initialization earlier in the setup sequence [2].
>> >
>> > Use the normal cursor blink default interval of 200 ms if
>> > ops->cur_blink_jiffies is not in the range specified in commit
>> > bd63364caa8d. Since invalid values are not used, specific system
>> > initialization timings should not cause lockups.
>> >
>> > [1] https://bugs.launchpad.net/bugs/1574814
>> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>>
>> Acked-by: Pavel Machek <pavel@ucw.cz>
>>
>> >  static void cursor_timer_handler(unsigned long dev_addr)
>> >  {
>> >     struct fb_info *info = (struct fb_info *) dev_addr;
>> >     struct fbcon_ops *ops = info->fbcon_par;
>> >
>> >     queue_work(system_power_efficient_wq, &info->queue);
>> > -   mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
>> > +   mod_timer(&ops->cursor_timer, jiffies +
>> > +       cursor_blink_jiffies(ops->cur_blink_jiffies));
>> >  }
>> >
>> >  static void fbcon_add_cursor_timer(struct fb_info *info)
>>
>> And actually... perhaps mod_timer should have some check for too low
>> timeouts..?
>>
>> WARN_ON?
>>                                                                       Pavel
>
>
> Interesting idea. I applied this patch to a couple systems and
> receive the same warning on both:

If 'jiffies' is passed to mod_timer() for same timer unusually OR
mod_timer() isn't called from the timer handler, it shoudln't cause
soft lockup.

In the case of fbcon, 'jiffies' is always passed to mod_timer() and
mod_timer() is called from the timer handler meantime, that is a
real lockup.

>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 73164c3..f6c0b69 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
>
>         timer_stats_timer_set_start_info(timer);
>         BUG_ON(!timer->function);
> +       WARN_ONCE(expires == jiffies, "timer should expire in the future");
>
>         base = lock_timer_base(timer, &flags);
>
> ------
>
> [    2.060474] ------------[ cut here ]------------
> [    2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 mod_timer+0x233/0x240
> [    2.062740] timer should expire in the future
> [    2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
> [    2.065870] Hardware name: Toshiba Leon, BIOS          12/04/2013
> [    2.067828] Workqueue: events_power_efficient hub_init_func3
> [    2.069762]  0000000000000000 ffff88007443bbb8 ffffffff8139932b ffff88007443bc08
> [    2.071701]  0000000000000000 ffff88007443bbf8 ffffffff8112e57c 0000031700000000
> [    2.073655]  ffff88007486a0b0 00000000fffea2da ffff88007486a000 0000000000000202
> [    2.075594] Call Trace:
> [    2.077503]  [<ffffffff8139932b>] dump_stack+0x4d/0x72
> [    2.079426]  [<ffffffff8112e57c>] __warn+0xcc/0xf0
> [    2.081325]  [<ffffffff8112e5ef>] warn_slowpath_fmt+0x4f/0x60
> [    2.083212]  [<ffffffff813ad5e5>] ? find_next_bit+0x15/0x20
> [    2.085022]  [<ffffffff8139914f>] ? cpumask_next_and+0x2f/0x40
> [    2.086696]  [<ffffffff81188a93>] mod_timer+0x233/0x240
> [    2.088362]  [<ffffffff815fff02>] usb_hcd_submit_urb+0x3f2/0x8c0
> [    2.090026]  [<ffffffff81601dc4>] ? urb_destroy+0x24/0x30
> [    2.091698]  [<ffffffff81142ba8>] ? insert_work+0x58/0xb0
> [    2.093349]  [<ffffffff81602297>] usb_submit_urb+0x287/0x530
> [    2.094985]  [<ffffffff815f986d>] hub_activate+0x1fd/0x5d0
> [    2.096625]  [<ffffffff81150188>] ? finish_task_switch+0x78/0x1f0
> [    2.098268]  [<ffffffff815f9cca>] hub_init_func3+0x1a/0x20
> [    2.099908]  [<ffffffff811438e0>] process_one_work+0x140/0x3e0
> [    2.101539]  [<ffffffff81143bce>] worker_thread+0x4e/0x480
> [    2.103173]  [<ffffffff81143b80>] ? process_one_work+0x3e0/0x3e0
> [    2.104790]  [<ffffffff81143b80>] ? process_one_work+0x3e0/0x3e0
> [    2.106259]  [<ffffffff81149829>] kthread+0xc9/0xe0
> [    2.107731]  [<ffffffff81856152>] ret_from_fork+0x22/0x40
> [    2.109215]  [<ffffffff81149760>] ? __kthread_parkme+0x70/0x70
> [    2.110704] ---[ end trace 3519886a1a990d99 ]---
>
> mod_timer is called from over a thousand places. Should timers always
> expire in the future?
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
                             ` (2 preceding siblings ...)
  2016-05-19  9:01           ` Pavel Machek
@ 2016-05-19 15:59           ` David Daney
  2016-05-19 22:25             ` Scot Doyle
  3 siblings, 1 reply; 30+ messages in thread
From: David Daney @ 2016-05-19 15:59 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Peter Hurley, Pavel Machek, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Greg Kroah-Hartman, Jiri Slaby, David Airlie, ddaney.cavm,
	dri-devel, linux-fbdev, linux-kernel, stable

On 05/18/2016 09:21 PM, Scot Doyle wrote:
> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
>
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
>

This patch just papers over the problem that you yourself introduced in 
commit bd63364caa8d ("vt: add cursor blink interval escape sequence").

As you know, I have a patch that fixes the problem at the source:
https://lkml.org/lkml/2016/5/17/455

I don't like the idea of silently ignoring bad values passed in from 
other code (drivers/tty/vt/vt.c), and much less doing the check for bad 
values each time the timer expires rather than just once, where the bad 
value is first introduced.

I think it would be preferable to WARN() at the site the bad value is 
introduced, so that we can easily find the real source of the problem. 
Initialize cur_blink_jiffies to a sane default value, then if something 
attempts to set it to a value that would cause soft lockup, WARN and 
refuse to change it.

Also there is a stylistic issue...


> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>
> Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
> Cc: <stable@vger.kernel.org> [v4.2]
> ---
>   drivers/video/console/fbcon.c | 17 +++++++++++++----
>   1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..da61d87 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
>   	console_unlock();
>   }
>
> +static int cursor_blink_jiffies(int candidate)
> +{
> +	if (candidate >= msecs_to_jiffies(50) &&
> +	    candidate <= msecs_to_jiffies(USHRT_MAX))
> +		return candidate;
> +	else
> +		return HZ / 5;

You use msecs_to_jiffies() is several places, then here open code the 
division.  Please use  msecs_to_jiffies(), that is it's intended job.


> +}
> +
>   static void cursor_timer_handler(unsigned long dev_addr)
>   {
>   	struct fb_info *info = (struct fb_info *) dev_addr;
>   	struct fbcon_ops *ops = info->fbcon_par;
>
>   	queue_work(system_power_efficient_wq, &info->queue);
> -	mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
> +	mod_timer(&ops->cursor_timer, jiffies +
> +	    cursor_blink_jiffies(ops->cur_blink_jiffies));
>   }
>
>   static void fbcon_add_cursor_timer(struct fb_info *info)
> @@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
>
>   		init_timer(&ops->cursor_timer);
>   		ops->cursor_timer.function = cursor_timer_handler;
> -		ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
> +		ops->cursor_timer.expires = jiffies +
> +		    cursor_blink_jiffies(ops->cur_blink_jiffies);
>   		ops->cursor_timer.data = (unsigned long ) info;
>   		add_timer(&ops->cursor_timer);
>   		ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
> @@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, struct fb_info *info,
>   	}
>
>   	if (!err) {
> -		ops->cur_blink_jiffies = HZ / 5;
>   		info->fbcon_par = ops;
>
>   		if (vc)
> @@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
>   	ops->currcon = -1;
>   	ops->graphics = 1;
>   	ops->cur_rotate = -1;
> -	ops->cur_blink_jiffies = HZ / 5;
>   	info->fbcon_par = ops;
>   	p->con_rotate = initial_rotation;
>   	set_blitting_type(vc, info);
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: use default if cursor blink interval is not valid
  2016-05-19 15:59           ` David Daney
@ 2016-05-19 22:25             ` Scot Doyle
  2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
  0 siblings, 1 reply; 30+ messages in thread
From: Scot Doyle @ 2016-05-19 22:25 UTC (permalink / raw)
  To: David Daney
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Jeremy Kerr, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, Scot Doyle, dri-devel, linux-fbdev,
	linux-kernel, stable

On Thu, 19 May 2016, David Daney wrote:
> On 05/18/2016 09:21 PM, Scot Doyle wrote:
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> 
> This patch just papers over the problem that you yourself introduced in commit
> bd63364caa8d ("vt: add cursor blink interval escape sequence").
> 
> As you know, I have a patch that fixes the problem at the source:
> https://lkml.org/lkml/2016/5/17/455
> 
> I don't like the idea of silently ignoring bad values passed in from other
> code (drivers/tty/vt/vt.c), and much less doing the check for bad values each
> time the timer expires rather than just once, where the bad value is first
> introduced.
> 
> I think it would be preferable to WARN() at the site the bad value is
> introduced, so that we can easily find the real source of the problem.
> Initialize cur_blink_jiffies to a sane default value, then if something
> attempts to set it to a value that would cause soft lockup, WARN and refuse to
> change it.

I agree this approach would be cleaner and am willing to give it a try
by submitting an alternative patch and ack'ing yours. Thanks for taking 
the time to critique my proposal.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-19 22:25             ` Scot Doyle
@ 2016-05-19 22:31               ` Scot Doyle
  2016-05-19 22:50                 ` David Daney
                                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-19 22:31 UTC (permalink / raw)
  To: Tomi Valkeinen, Jean-Christophe Plagniol-Villard
  Cc: David Daney, Ming Lei, Dann Frazier, Jeremy Kerr, Peter Hurley,
	Pavel Machek, Jonathan Liu, Alistair Popple,
	Jean-Philippe Brucker, Chintakuntla, Radha, Greg Kroah-Hartman,
	Jiri Slaby, David Airlie, ddaney.cavm, Scot Doyle, dri-devel,
	linux-fbdev, linux-kernel, stable

Two systems are locking on boot [1] because ops->cur_blink_jiffies
is set to zero from vc->vc_cur_blink_ms.

Ignore such invalid intervals and log a warning.

[1] https://bugs.launchpad.net/bugs/1574814

Suggested-by: David Daney <david.daney@cavium.com>
Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Cc: <stable@vger.kernel.org> [v4.2]
---
 drivers/video/console/fbcon.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..fad5b89 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1095,7 +1095,13 @@ static void fbcon_init(struct vc_data *vc, int init)
 		con_copy_unimap(vc, svc);
 
 	ops = info->fbcon_par;
-	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
+
+	if (vc->vc_cur_blink_ms >= 50)
+		ops->cur_blink_jiffies =
+		    msecs_to_jiffies(vc->vc_cur_blink_ms);
+	else
+		WARN_ONCE(1, "blink interval < 50 ms");
+
 	p->con_rotate = initial_rotation;
 	set_blitting_type(vc, info);
 
@@ -1309,7 +1315,11 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
 	int y;
  	int c = scr_readw((u16 *) vc->vc_pos);
 
-	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
+	if (vc->vc_cur_blink_ms >= 50)
+		ops->cur_blink_jiffies =
+		    msecs_to_jiffies(vc->vc_cur_blink_ms);
+	else
+		WARN_ONCE(1, "blink interval < 50 ms");
 
 	if (fbcon_is_inactive(vc, info) || vc->vc_deccm != 1)
 		return;
-- 
2.1.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-17 18:41 [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer David Daney
  2016-05-17 20:49 ` Pavel Machek
@ 2016-05-19 22:35 ` Scot Doyle
  2016-05-28 11:44 ` Henrique de Moraes Holschuh
  2016-06-14 16:22 ` Ping: " David Daney
  3 siblings, 0 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-19 22:35 UTC (permalink / raw)
  To: David Daney
  Cc: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley, Ming Lei,
	Dann Frazier, Scot Doyle, David Airlie, dri-devel,
	Radha.Chintakuntla, Pavel Machek, linux-kernel, David Daney,
	stable

On Tue, 17 May 2016, David Daney wrote:
> From: David Daney <david.daney@cavium.com>
> 
> We are getting somewhat random soft lockups with this signature:
> 
> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> 
> This is caused by the vt visual_init() function calling into
> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> transient condition, as it is later set to a non-zero value.  But, if
> the timer happens to expire while the blink rate is zero, it goes into
> an endless loop, and we get soft lockup.
> 
> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> function.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>
> Cc: stable@vger.kernel.org

Acked-by: Scot Doyle <lkml14@scotdoyle.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
@ 2016-05-19 22:50                 ` David Daney
  2016-05-20  1:21                 ` Jeremy Kerr
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: David Daney @ 2016-05-19 22:50 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Jeremy Kerr, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, dri-devel, linux-fbdev, linux-kernel,
	stable

On 05/19/2016 03:31 PM, Scot Doyle wrote:
> Two systems are locking on boot [1] because ops->cur_blink_jiffies
> is set to zero from vc->vc_cur_blink_ms.
>
> Ignore such invalid intervals and log a warning.
>
> [1] https://bugs.launchpad.net/bugs/1574814
>
> Suggested-by: David Daney <david.daney@cavium.com>
> Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
> Cc: <stable@vger.kernel.org> [v4.2]

This seems better. I didn't test it, but...

Acked-by: David Daney <david.daney@cavium.com>


> ---
>   drivers/video/console/fbcon.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..fad5b89 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -1095,7 +1095,13 @@ static void fbcon_init(struct vc_data *vc, int init)
>   		con_copy_unimap(vc, svc);
>
>   	ops = info->fbcon_par;
> -	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +
> +	if (vc->vc_cur_blink_ms >= 50)
> +		ops->cur_blink_jiffies =
> +		    msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	else
> +		WARN_ONCE(1, "blink interval < 50 ms");
> +
>   	p->con_rotate = initial_rotation;
>   	set_blitting_type(vc, info);
>
> @@ -1309,7 +1315,11 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
>   	int y;
>    	int c = scr_readw((u16 *) vc->vc_pos);
>
> -	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	if (vc->vc_cur_blink_ms >= 50)
> +		ops->cur_blink_jiffies =
> +		    msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	else
> +		WARN_ONCE(1, "blink interval < 50 ms");
>
>   	if (fbcon_is_inactive(vc, info) || vc->vc_deccm != 1)
>   		return;
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
  2016-05-19 22:50                 ` David Daney
@ 2016-05-20  1:21                 ` Jeremy Kerr
  2016-05-20  2:17                 ` Ming Lei
  2016-05-28 11:45                 ` Henrique de Moraes Holschuh
  3 siblings, 0 replies; 30+ messages in thread
From: Jeremy Kerr @ 2016-05-20  1:21 UTC (permalink / raw)
  To: Scot Doyle, Tomi Valkeinen, Jean-Christophe Plagniol-Villard
  Cc: David Daney, Ming Lei, Dann Frazier, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, dri-devel, linux-fbdev, linux-kernel,
	stable

Hi Scot,

> Two systems are locking on boot [1] because ops->cur_blink_jiffies
> is set to zero from vc->vc_cur_blink_ms.
> 
> Ignore such invalid intervals and log a warning.

This prevents a lockup on AST BMC machines, but (as expected) generates
a warning against the fbcon driver, which is a significantly better
result.

Tested-by: Jeremy Kerr <jk@ozlabs.org>

[now to sort out the issue in the ast driver...]

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
  2016-05-19 22:50                 ` David Daney
  2016-05-20  1:21                 ` Jeremy Kerr
@ 2016-05-20  2:17                 ` Ming Lei
  2016-05-20  2:26                   ` Jeremy Kerr
  2016-05-28 11:45                 ` Henrique de Moraes Holschuh
  3 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2016-05-20  2:17 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Dann Frazier, Jeremy Kerr, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, David Daney, dri-devel, linux-fbdev,
	Linux Kernel Mailing List, stable

On Fri, May 20, 2016 at 6:31 AM, Scot Doyle <lkml14@scotdoyle.com> wrote:
> Two systems are locking on boot [1] because ops->cur_blink_jiffies
> is set to zero from vc->vc_cur_blink_ms.
>
> Ignore such invalid intervals and log a warning.
>
> [1] https://bugs.launchpad.net/bugs/1574814
>
> Suggested-by: David Daney <david.daney@cavium.com>
> Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
> Cc: <stable@vger.kernel.org> [v4.2]

Not sure this one is needed for stable because it justs dumps
a warning, and not set a valid period to ops->cur_blink_jiffies.

So I guess other fix patch is still required for the soft lockup issue, right?

Thanks,

> ---
>  drivers/video/console/fbcon.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..fad5b89 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -1095,7 +1095,13 @@ static void fbcon_init(struct vc_data *vc, int init)
>                 con_copy_unimap(vc, svc);
>
>         ops = info->fbcon_par;
> -       ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +
> +       if (vc->vc_cur_blink_ms >= 50)
> +               ops->cur_blink_jiffies =
> +                   msecs_to_jiffies(vc->vc_cur_blink_ms);
> +       else
> +               WARN_ONCE(1, "blink interval < 50 ms");
> +
>         p->con_rotate = initial_rotation;
>         set_blitting_type(vc, info);
>
> @@ -1309,7 +1315,11 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
>         int y;
>         int c = scr_readw((u16 *) vc->vc_pos);
>
> -       ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +       if (vc->vc_cur_blink_ms >= 50)
> +               ops->cur_blink_jiffies =
> +                   msecs_to_jiffies(vc->vc_cur_blink_ms);
> +       else
> +               WARN_ONCE(1, "blink interval < 50 ms");
>
>         if (fbcon_is_inactive(vc, info) || vc->vc_deccm != 1)
>                 return;
> --
> 2.1.4
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20  2:17                 ` Ming Lei
@ 2016-05-20  2:26                   ` Jeremy Kerr
  2016-05-20  2:48                     ` Ming Lei
  0 siblings, 1 reply; 30+ messages in thread
From: Jeremy Kerr @ 2016-05-20  2:26 UTC (permalink / raw)
  To: Ming Lei, Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Dann Frazier, Peter Hurley, Pavel Machek, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Greg Kroah-Hartman, Jiri Slaby, David Airlie, David Daney,
	dri-devel, linux-fbdev, Linux Kernel Mailing List, stable

Hi Ming,

> Not sure this one is needed for stable because it justs dumps
> a warning, and not set a valid period to ops->cur_blink_jiffies.
> 
> So I guess other fix patch is still required for the soft lockup
> issue, right?

The main thing is that we don't set cur_blink_jiffies to the < 50ms
value. As far as I can tell, it means we'll still get the original
default, set in fbcon_startup():

	ops->cur_blink_jiffies = HZ / 5;

And so don't end up spinning on the timer expiry.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20  2:26                   ` Jeremy Kerr
@ 2016-05-20  2:48                     ` Ming Lei
  2016-05-20  5:04                       ` Jeremy Kerr
  0 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2016-05-20  2:48 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Scot Doyle, Tomi Valkeinen, Jean-Christophe Plagniol-Villard,
	David Daney, Dann Frazier, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, David Daney, dri-devel, linux-fbdev,
	Linux Kernel Mailing List, stable

On Fri, May 20, 2016 at 10:26 AM, Jeremy Kerr <jk@ozlabs.org> wrote:
> Hi Ming,
>
>> Not sure this one is needed for stable because it justs dumps
>> a warning, and not set a valid period to ops->cur_blink_jiffies.
>>
>> So I guess other fix patch is still required for the soft lockup
>> issue, right?
>
> The main thing is that we don't set cur_blink_jiffies to the < 50ms
> value. As far as I can tell, it means we'll still get the original
> default, set in fbcon_startup():
>
>         ops->cur_blink_jiffies = HZ / 5;
>
> And so don't end up spinning on the timer expiry.

Jeremy, your theory is correct, thanks for your clarification! And my test
just shows that this patch does fix the soft lockup too, so

      Tested-by: Ming Lei <ming.lei@canonical.com>

Then looks there are two fix patches acked & tested:

     - the patch in this thread
     - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
blink timer."
     https://lkml.org/lkml/2016/5/17/455

So which one will be pushed to linus?

Thanks,
Ming

>
> Cheers,
>
>
> Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20  2:48                     ` Ming Lei
@ 2016-05-20  5:04                       ` Jeremy Kerr
  2016-05-20 16:27                         ` Scot Doyle
  0 siblings, 1 reply; 30+ messages in thread
From: Jeremy Kerr @ 2016-05-20  5:04 UTC (permalink / raw)
  To: Ming Lei
  Cc: Scot Doyle, Tomi Valkeinen, Jean-Christophe Plagniol-Villard,
	David Daney, Dann Frazier, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, David Daney, dri-devel, linux-fbdev,
	Linux Kernel Mailing List, stable

Hi Ming,

>Then looks there are two fix patches acked & tested:
>
> - the patch in this thread
> - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
>blink timer."
> https://lkml.org/lkml/2016/5/17/455
>
>So which one will be pushed to linus?

Not that it's my call, but we may want both; the first as a safety
measure to prevent an invalid cur_blink_jiffies ever being set, and the
second one to actually fix the initialisation of vc_cur_blink_ms (and
address the warning introduced by the first).

I guess we could just go with the latter for stable...

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20  5:04                       ` Jeremy Kerr
@ 2016-05-20 16:27                         ` Scot Doyle
  2016-05-24  1:19                           ` Scot Doyle
  2016-05-28 11:48                           ` Henrique de Moraes Holschuh
  0 siblings, 2 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-20 16:27 UTC (permalink / raw)
  To: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, Greg Kroah-Hartman
  Cc: Jeremy Kerr, Ming Lei, David Daney, Dann Frazier, Peter Hurley,
	Pavel Machek, Jonathan Liu, Alistair Popple,
	Jean-Philippe Brucker, Chintakuntla, Radha, Jiri Slaby,
	David Airlie, David Daney, Scot Doyle, dri-devel, linux-fbdev,
	Linux Kernel Mailing List, stable

On Fri, 20 May 2016, Jeremy Kerr wrote:
> Hi Ming,
> 
> >Then looks there are two fix patches acked & tested:
> >
> > - the patch in this thread
> > - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
> >blink timer."
> > https://lkml.org/lkml/2016/5/17/455
> >
> >So which one will be pushed to linus?
> 
> Not that it's my call, but we may want both; the first as a safety
> measure to prevent an invalid cur_blink_jiffies ever being set, and the
> second one to actually fix the initialisation of vc_cur_blink_ms (and
> address the warning introduced by the first).

Tomi / Greg,

I'd suggest
- applying "tty: vt: Fix soft lockup in fbcon cursor blink timer." to 4.7 and stable[4.2]
- applying "fbcon: warn on invalid cursor blink intervals" to 4.7
- ignoring "fbcon: use default if cursor blink interval is not valid"

Note: the patches don't depend on each other


> I guess we could just go with the latter for stable...
>
> Cheers,
> 
> Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20 16:27                         ` Scot Doyle
@ 2016-05-24  1:19                           ` Scot Doyle
  2016-05-28 11:48                           ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 30+ messages in thread
From: Scot Doyle @ 2016-05-24  1:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tomi Valkeinen, Jean-Christophe Plagniol-Villard
  Cc: Jeremy Kerr, Ming Lei, David Daney, Dann Frazier, Peter Hurley,
	Pavel Machek, Jonathan Liu, Alistair Popple,
	Jean-Philippe Brucker, Chintakuntla, Radha, Jiri Slaby,
	David Airlie, David Daney, Scot Doyle, fengguang.wu, dri-devel,
	linux-fbdev, Linux Kernel Mailing List, stable

On Fri, 20 May 2016, Scot Doyle wrote:
> On Fri, 20 May 2016, Jeremy Kerr wrote:
> > Hi Ming,
> > 
> > >Then looks there are two fix patches acked & tested:
> > >
> > > - the patch in this thread
> > > - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
> > >blink timer."
> > > https://lkml.org/lkml/2016/5/17/455
> > >
> > >So which one will be pushed to linus?
> > 
> > Not that it's my call, but we may want both; the first as a safety
> > measure to prevent an invalid cur_blink_jiffies ever being set, and the
> > second one to actually fix the initialisation of vc_cur_blink_ms (and
> > address the warning introduced by the first).
> 
> Tomi / Greg,
> 
> I'd suggest
> - applying "tty: vt: Fix soft lockup in fbcon cursor blink timer." to 4.7 and stable[4.2]
> - applying "fbcon: warn on invalid cursor blink intervals" to 4.7
> - ignoring "fbcon: use default if cursor blink interval is not valid"
> 
> Note: the patches don't depend on each other

"tty: vt: Fix soft lockup..." should be applied first in order to avoid 
unnecessary reports due to the log warning in "fbcon: warn on invalid..."

> 
> > I guess we could just go with the latter for stable...
> >
> > Cheers,
> > 
> > Jeremy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-17 18:41 [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer David Daney
  2016-05-17 20:49 ` Pavel Machek
  2016-05-19 22:35 ` Scot Doyle
@ 2016-05-28 11:44 ` Henrique de Moraes Holschuh
  2016-06-14 16:22 ` Ping: " David Daney
  3 siblings, 0 replies; 30+ messages in thread
From: Henrique de Moraes Holschuh @ 2016-05-28 11:44 UTC (permalink / raw)
  To: David Daney
  Cc: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley, Ming Lei,
	Dann Frazier, Scot Doyle, David Airlie, dri-devel,
	Radha.Chintakuntla, Pavel Machek, linux-kernel, David Daney,
	stable

On Tue, 17 May 2016, David Daney wrote:
> From: David Daney <david.daney@cavium.com>
> We are getting somewhat random soft lockups with this signature:
> 
> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
> 
> This is caused by the vt visual_init() function calling into
> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> transient condition, as it is later set to a non-zero value.  But, if
> the timer happens to expire while the blink rate is zero, it goes into
> an endless loop, and we get soft lockup.
> 
> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> function.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>
> Cc: stable@vger.kernel.org

Tested-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> on top of 4.4.11.

> ---
>  drivers/tty/vt/vt.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index 3e3c757..eef5c36 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -750,6 +750,7 @@ static void visual_init(struct vc_data *vc, int num, int init)
>  	vc->vc_complement_mask = 0;
>  	vc->vc_can_do_color = 0;
>  	vc->vc_panic_force_write = false;
> +	vc->vc_cur_blink_ms = DEFAULT_CURSOR_BLINK_MS;
>  	vc->vc_sw->con_init(vc, init);
>  	if (!vc->vc_complement_mask)
>  		vc->vc_complement_mask = vc->vc_can_do_color ? 0x7700 : 0x0800;

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
                                   ` (2 preceding siblings ...)
  2016-05-20  2:17                 ` Ming Lei
@ 2016-05-28 11:45                 ` Henrique de Moraes Holschuh
  3 siblings, 0 replies; 30+ messages in thread
From: Henrique de Moraes Holschuh @ 2016-05-28 11:45 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard, David Daney,
	Ming Lei, Dann Frazier, Jeremy Kerr, Peter Hurley, Pavel Machek,
	Jonathan Liu, Alistair Popple, Jean-Philippe Brucker,
	Chintakuntla, Radha, Greg Kroah-Hartman, Jiri Slaby,
	David Airlie, ddaney.cavm, dri-devel, linux-fbdev, linux-kernel,
	stable

On Thu, 19 May 2016, Scot Doyle wrote:
> Two systems are locking on boot [1] because ops->cur_blink_jiffies
> is set to zero from vc->vc_cur_blink_ms.
> 
> Ignore such invalid intervals and log a warning.
> 
> [1] https://bugs.launchpad.net/bugs/1574814
> 
> Suggested-by: David Daney <david.daney@cavium.com>
> Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
> Cc: <stable@vger.kernel.org> [v4.2]

FWIW:
Tested-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> on top of 4.4.11.

And nothing caused it to issue warnings here, so far (with the other
recommended patch applied first).

> ---
>  drivers/video/console/fbcon.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..fad5b89 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -1095,7 +1095,13 @@ static void fbcon_init(struct vc_data *vc, int init)
>  		con_copy_unimap(vc, svc);
>  
>  	ops = info->fbcon_par;
> -	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +
> +	if (vc->vc_cur_blink_ms >= 50)
> +		ops->cur_blink_jiffies =
> +		    msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	else
> +		WARN_ONCE(1, "blink interval < 50 ms");
> +
>  	p->con_rotate = initial_rotation;
>  	set_blitting_type(vc, info);
>  
> @@ -1309,7 +1315,11 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
>  	int y;
>   	int c = scr_readw((u16 *) vc->vc_pos);
>  
> -	ops->cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	if (vc->vc_cur_blink_ms >= 50)
> +		ops->cur_blink_jiffies =
> +		    msecs_to_jiffies(vc->vc_cur_blink_ms);
> +	else
> +		WARN_ONCE(1, "blink interval < 50 ms");
>  
>  	if (fbcon_is_inactive(vc, info) || vc->vc_deccm != 1)
>  		return;

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-20 16:27                         ` Scot Doyle
  2016-05-24  1:19                           ` Scot Doyle
@ 2016-05-28 11:48                           ` Henrique de Moraes Holschuh
  2016-06-12  5:05                             ` Chintakuntla, Radha
  1 sibling, 1 reply; 30+ messages in thread
From: Henrique de Moraes Holschuh @ 2016-05-28 11:48 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard,
	Greg Kroah-Hartman, Jeremy Kerr, Ming Lei, David Daney,
	Dann Frazier, Peter Hurley, Pavel Machek, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Chintakuntla, Radha,
	Jiri Slaby, David Airlie, David Daney, dri-devel, linux-fbdev,
	Linux Kernel Mailing List, stable

On Fri, 20 May 2016, Scot Doyle wrote:
> On Fri, 20 May 2016, Jeremy Kerr wrote:
> > >Then looks there are two fix patches acked & tested:
> > >
> > > - the patch in this thread
> > > - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
> > >blink timer."
> > > https://lkml.org/lkml/2016/5/17/455
> > >
> > >So which one will be pushed to linus?
> > 
> > Not that it's my call, but we may want both; the first as a safety
> > measure to prevent an invalid cur_blink_jiffies ever being set, and the
> > second one to actually fix the initialisation of vc_cur_blink_ms (and
> > address the warning introduced by the first).
> 
> Tomi / Greg,
> 
> I'd suggest
> - applying "tty: vt: Fix soft lockup in fbcon cursor blink timer." to 4.7 and stable[4.2]
> - applying "fbcon: warn on invalid cursor blink intervals" to 4.7
> - ignoring "fbcon: use default if cursor blink interval is not valid"
> 
> Note: the patches don't depend on each other

I applied both recommended patches on top of 4.4.11 for testing, and they
made things a lot better here.

I suggest the second patch should be backported to stable too, might as well
fix this thing for good *and keep the door closed*.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH] fbcon: warn on invalid cursor blink intervals
  2016-05-28 11:48                           ` Henrique de Moraes Holschuh
@ 2016-06-12  5:05                             ` Chintakuntla, Radha
  0 siblings, 0 replies; 30+ messages in thread
From: Chintakuntla, Radha @ 2016-06-12  5:05 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh, Scot Doyle
  Cc: Tomi Valkeinen, Jean-Christophe Plagniol-Villard,
	Greg Kroah-Hartman, Jeremy Kerr, Ming Lei, Daney, David,
	Dann Frazier, Peter Hurley, Pavel Machek, Jonathan Liu,
	Alistair Popple, Jean-Philippe Brucker, Jiri Slaby, David Airlie,
	David Daney, dri-devel, linux-fbdev, Linux Kernel Mailing List,
	stable, Chintakuntla, Radha


> -----Original Message-----
> From: Henrique de Moraes Holschuh [mailto:hmh@hmh.eng.br]
> Sent: Saturday, May 28, 2016 4:49 AM
> To: Scot Doyle <lkml14@scotdoyle.com>
> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>; Jean-Christophe Plagniol-
> Villard <plagnioj@jcrosoft.com>; Greg Kroah-Hartman
> <gregkh@linuxfoundation.org>; Jeremy Kerr <jk@ozlabs.org>; Ming Lei
> <ming.lei@canonical.com>; Daney, David <David.Daney@cavium.com>;
> Dann Frazier <dann.frazier@canonical.com>; Peter Hurley
> <peter@hurleysoftware.com>; Pavel Machek <pavel@ucw.cz>; Jonathan Liu
> <net147@gmail.com>; Alistair Popple <alistair@popple.id.au>; Jean-Philippe
> Brucker <jean-philippe.brucker@arm.com>; Chintakuntla, Radha
> <Radha.Chintakuntla@cavium.com>; Jiri Slaby <jslaby@suse.com>; David
> Airlie <airlied@linux.ie>; David Daney <ddaney.cavm@gmail.com>; dri-
> devel@lists.freedesktop.org; linux-fbdev@vger.kernel.org; Linux Kernel
> Mailing List <linux-kernel@vger.kernel.org>; stable
> <stable@vger.kernel.org>
> Subject: Re: [PATCH] fbcon: warn on invalid cursor blink intervals
> 
> On Fri, 20 May 2016, Scot Doyle wrote:
> > On Fri, 20 May 2016, Jeremy Kerr wrote:
> > > >Then looks there are two fix patches acked & tested:
> > > >
> > > > - the patch in this thread
> > > > - another one "[PATCH] tty: vt: Fix soft lockup in fbcon cursor
> > > >blink timer."
> > > > https://lkml.org/lkml/2016/5/17/455
> > > >
> > > >So which one will be pushed to linus?
> > >
> > > Not that it's my call, but we may want both; the first as a safety
> > > measure to prevent an invalid cur_blink_jiffies ever being set, and the
> > > second one to actually fix the initialisation of vc_cur_blink_ms (and
> > > address the warning introduced by the first).
> >
> > Tomi / Greg,
> >
> > I'd suggest
> > - applying "tty: vt: Fix soft lockup in fbcon cursor blink timer." to 4.7 and
> stable[4.2]
> > - applying "fbcon: warn on invalid cursor blink intervals" to 4.7
> > - ignoring "fbcon: use default if cursor blink interval is not valid"
> >
> > Note: the patches don't depend on each other
> 
> I applied both recommended patches on top of 4.4.11 for testing, and they
> made things a lot better here.
> 
> I suggest the second patch should be backported to stable too, might as well
> fix this thing for good *and keep the door closed*.

Is this patch available on some tree so that I can point to ?
And hope it will make it to linux-next soon ?

> 
> --
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Ping: [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer.
  2016-05-17 18:41 [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer David Daney
                   ` (2 preceding siblings ...)
  2016-05-28 11:44 ` Henrique de Moraes Holschuh
@ 2016-06-14 16:22 ` David Daney
  3 siblings, 0 replies; 30+ messages in thread
From: David Daney @ 2016-06-14 16:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jiri Slaby, Peter Hurley, Ming Lei, Dann Frazier, Scot Doyle,
	David Airlie, dri-devel, Radha.Chintakuntla, Pavel Machek,
	linux-kernel, David Daney, stable

Greg et al.,

Now that the merge window is closed, please consider this patch.

Thanks in advance,
David Daney


On 05/17/2016 11:41 AM, David Daney wrote:
> From: David Daney <david.daney@cavium.com>
>
> We are getting somewhat random soft lockups with this signature:
>
> [   86.992215] [<fffffc00080935e0>] el1_irq+0xa0/0x10c
> [   86.997082] [<fffffc000841822c>] cursor_timer_handler+0x30/0x54
> [   87.002991] [<fffffc000810ec44>] call_timer_fn+0x54/0x1a8
> [   87.008378] [<fffffc000810ef88>] run_timer_softirq+0x1c4/0x2bc
> [   87.014200] [<fffffc000809077c>] __do_softirq+0x114/0x344
> [   87.019590] [<fffffc00080af45c>] irq_exit+0x74/0x98
> [   87.024458] [<fffffc00080fac20>] __handle_domain_irq+0x98/0xfc
> [   87.030278] [<fffffc000809056c>] gic_handle_irq+0x94/0x190
>
> This is caused by the vt visual_init() function calling into
> fbcon_init() with a vc_cur_blink_ms value of zero.  This is a
> transient condition, as it is later set to a non-zero value.  But, if
> the timer happens to expire while the blink rate is zero, it goes into
> an endless loop, and we get soft lockup.
>
> The fix is to initialize vc_cur_blink_ms before calling the con_init()
> function.
>
> Signed-off-by: David Daney <david.daney@cavium.com>
> Cc: stable@vger.kernel.org
> ---
>   drivers/tty/vt/vt.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index 3e3c757..eef5c36 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -750,6 +750,7 @@ static void visual_init(struct vc_data *vc, int num, int init)
>   	vc->vc_complement_mask = 0;
>   	vc->vc_can_do_color = 0;
>   	vc->vc_panic_force_write = false;
> +	vc->vc_cur_blink_ms = DEFAULT_CURSOR_BLINK_MS;
>   	vc->vc_sw->con_init(vc, init);
>   	if (!vc->vc_complement_mask)
>   		vc->vc_complement_mask = vc->vc_can_do_color ? 0x7700 : 0x0800;
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-06-14 16:22 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-17 18:41 [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer David Daney
2016-05-17 20:49 ` Pavel Machek
2016-05-18  0:42   ` Ming Lei
2016-05-18 20:24     ` Scot Doyle
2016-05-18 20:38       ` David Daney
2016-05-19  4:21         ` [PATCH] fbcon: use default if cursor blink interval is not valid Scot Doyle
2016-05-19  7:25           ` Jeremy Kerr
2016-05-19  8:29           ` Ming Lei
2016-05-19  9:01           ` Pavel Machek
2016-05-19 14:22             ` Scot Doyle
2016-05-19 15:31               ` Ming Lei
2016-05-19 15:59           ` David Daney
2016-05-19 22:25             ` Scot Doyle
2016-05-19 22:31               ` [PATCH] fbcon: warn on invalid cursor blink intervals Scot Doyle
2016-05-19 22:50                 ` David Daney
2016-05-20  1:21                 ` Jeremy Kerr
2016-05-20  2:17                 ` Ming Lei
2016-05-20  2:26                   ` Jeremy Kerr
2016-05-20  2:48                     ` Ming Lei
2016-05-20  5:04                       ` Jeremy Kerr
2016-05-20 16:27                         ` Scot Doyle
2016-05-24  1:19                           ` Scot Doyle
2016-05-28 11:48                           ` Henrique de Moraes Holschuh
2016-06-12  5:05                             ` Chintakuntla, Radha
2016-05-28 11:45                 ` Henrique de Moraes Holschuh
2016-05-19  0:27       ` [PATCH] tty: vt: Fix soft lockup in fbcon cursor blink timer Ming Lei
2016-05-19  7:08         ` Pavel Machek
2016-05-19 22:35 ` Scot Doyle
2016-05-28 11:44 ` Henrique de Moraes Holschuh
2016-06-14 16:22 ` Ping: " David Daney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).