From: "Viktor Jägersküpper" <viktor_jaegerskuepper@freenet.de>
To: Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: Kevin Shanahan <kevin@shanahan.id.au>,
	Siegfried Metz <frame@mailbox.org>,
	linux-kernel@vger.kernel.org, rafael.j.wysocki@intel.com,
	len.brown@intel.com, rjw@rjwysocki.net, diego.viola@gmail.com,
	rui.zhang@intel.com
Subject: Re: REGRESSION: boot stalls on several old dual core Intel CPUs
Date: Mon, 03 Sep 2018 11:30:00 +0000
Message-ID: <5177cb97-e5d9-018e-781a-fc98a24f4173@freenet.de>
In-Reply-To: <20180903093305.GC24142@hirez.programming.kicks-ass.net>

Peter Zijlstra:
> On Mon, Sep 03, 2018 at 10:54:23AM +0200, Peter Zijlstra wrote:
>> On Mon, Sep 03, 2018 at 09:38:15AM +0200, Thomas Gleixner wrote:
>>> On Mon, 3 Sep 2018, Peter Zijlstra wrote:
>>>> On Sat, Sep 01, 2018 at 11:51:26AM +0930, Kevin Shanahan wrote:
>>>>> commit 01548f4d3e8e94caf323a4f664eb347fd34a34ab
>>>>> Author: Martin Schwidefsky <schwidefsky@de.ibm.com>
>>>>> Date:   Tue Aug 18 17:09:42 2009 +0200
>>>>>
>>>>>     clocksource: Avoid clocksource watchdog circular locking dependency
>>>>>
>>>>>     stop_machine from a multithreaded workqueue is not allowed because
>>>>>     of a circular locking dependency between cpu_down and the workqueue
>>>>>     execution. Use a kernel thread to do the clocksource downgrade.
>>>>
>>>> I cannot find stop_machine usage there; either it went away or I need to
>>>> like wake up.
>>>
>>> timekeeping_notify() which is involved in switching clock source uses stomp
>>> machine.
>>
>> ARGH... OK, lemme see if I can come up with something other than
>> endlessly spawning that kthread.
>>
>> A special purpose kthread_worker would make more sense than that.
> 
> Can someone test this?
> 
> ---
>  kernel/time/clocksource.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index f74fb00d8064..898976d0082a 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -112,13 +112,28 @@ static int finished_booting;
>  static u64 suspend_start;
>  
>  #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
> -static void clocksource_watchdog_work(struct work_struct *work);
> +static void clocksource_watchdog_work(struct kthread_work *work);
>  static void clocksource_select(void);
>  
>  static LIST_HEAD(watchdog_list);
>  static struct clocksource *watchdog;
>  static struct timer_list watchdog_timer;
> -static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
> +
> +/*
> + * We must use a kthread_worker here, because:
> + *
> + *   clocksource_watchdog_work()
> + *     clocksource_select()
> + *       __clocksource_select()
> + *         timekeeping_notify()
> + *           stop_machine()
> + *
> + * cannot be called from a regular workqueue, because of deadlocks between
> + * workqueue and stopmachine.
> + */
> +static struct kthread_worker *watchdog_worker;
> +static DEFINE_KTHREAD_WORK(watchdog_work, clocksource_watchdog_work);
> +
>  static DEFINE_SPINLOCK(watchdog_lock);
>  static int watchdog_running;
>  static atomic_t watchdog_reset_pending;
> @@ -158,7 +173,7 @@ static void __clocksource_unstable(struct clocksource *cs)
>  
>  	/* kick clocksource_watchdog_work() */
>  	if (finished_booting)
> -		schedule_work(&watchdog_work);
> +		kthread_queue_work(watchdog_worker, &watchdog_work);
>  }
>  
>  /**
> @@ -199,7 +214,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>  		/* Clocksource already marked unstable? */
>  		if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
>  			if (finished_booting)
> -				schedule_work(&watchdog_work);
> +				kthread_queue_work(watchdog_worker, &watchdog_work);
>  			continue;
>  		}
>  
> @@ -269,7 +284,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>  			 */
>  			if (cs != curr_clocksource) {
>  				cs->flags |= CLOCK_SOURCE_RESELECT;
> -				schedule_work(&watchdog_work);
> +				kthread_queue_work(watchdog_worker, &watchdog_work);
>  			} else {
>  				tick_clock_notify();
>  			}
> @@ -418,7 +433,7 @@ static int __clocksource_watchdog_work(void)
>  	return select;
>  }
>  
> -static void clocksource_watchdog_work(struct work_struct *work)
> +static void clocksource_watchdog_work(struct kthread_work *work)
>  {
>  	mutex_lock(&clocksource_mutex);
>  	if (__clocksource_watchdog_work())
> @@ -806,6 +821,7 @@ static int __init clocksource_done_booting(void)
>  {
>  	mutex_lock(&clocksource_mutex);
>  	curr_clocksource = clocksource_default_clock();
> +	watchdog_worker = kthread_create_worker(0, "cs-watchdog");
>  	finished_booting = 1;
>  	/*
>  	 * Run the watchdog first to eliminate unstable clock sources
> 

Applied on mainline tag v4.19-rc2. I tested without additional kernel
parameters, with "quiet", and with "debug". My PC booted successfully in
all three cases, whereas before it almost always stalled in each of them.

Thanks!
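
For readers unfamiliar with the pattern the quoted patch relies on, here is a
rough userspace sketch of a dedicated worker thread that owns a single work
item, so that a handler which blocks for a long time (as stop_machine() can)
only ties up its own thread. This is an analogy built on pthreads, not the
kernel's kthread_worker API: struct worker, worker_start(), worker_queue(),
worker_flush(), worker_stop() and count_run() are all hypothetical names made
up for illustration.

```c
/* Userspace analogy only: pthreads stand in for the kernel's kthread_worker. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct worker {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t wake;	/* signalled when work is queued or stop is set */
	pthread_cond_t idle;	/* signalled when the handler has returned */
	void (*fn)(void);	/* handler of the single work item */
	bool pending, running, stop;
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	for (;;) {
		while (!w->pending && !w->stop)
			pthread_cond_wait(&w->wake, &w->lock);
		if (!w->pending)	/* stop requested, nothing left to run */
			break;
		w->pending = false;
		w->running = true;
		pthread_mutex_unlock(&w->lock);
		w->fn();		/* may block; only this thread waits */
		pthread_mutex_lock(&w->lock);
		w->running = false;
		pthread_cond_broadcast(&w->idle);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Returns 0 on success, or the pthread_create() error number. */
static int worker_start(struct worker *w, void (*fn)(void))
{
	assert(fn != NULL);
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wake, NULL);
	pthread_cond_init(&w->idle, NULL);
	w->fn = fn;
	w->pending = w->running = w->stop = false;
	return pthread_create(&w->thread, NULL, worker_main, w);
}

/* Returns true if newly queued, false if the work was already pending. */
static bool worker_queue(struct worker *w)
{
	bool queued;

	pthread_mutex_lock(&w->lock);
	queued = !w->pending;
	w->pending = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	return queued;
}

/* Wait until no work is pending and the handler is not running. */
static void worker_flush(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	while (w->pending || w->running)
		pthread_cond_wait(&w->idle, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

/* Runs any still-pending work, then joins the worker thread. */
static void worker_stop(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->stop = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	pthread_join(w->thread, NULL);
}

/* Hypothetical handler: counts how many times the work item ran. */
static int runs;
static void count_run(void) { runs++; }
```

Queueing while the item is already pending is a no-op, loosely mirroring how
the watchdog timer in the patch may kick the same work item repeatedly. Build
with -pthread.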

Thread overview: 15+ messages
2018-08-30 10:55 REGRESSION: boot stalls on several old dual core Intel CPUs Siegfried Metz
2018-08-30 13:04 ` Peter Zijlstra
2018-08-30 13:48   ` Peter Zijlstra
2018-09-01  2:21   ` Kevin Shanahan
2018-09-03  7:25     ` Peter Zijlstra
2018-09-03  7:38       ` Thomas Gleixner
2018-09-03  8:54         ` Peter Zijlstra
2018-09-03  9:33           ` Peter Zijlstra
2018-09-03 11:30             ` Viktor Jägersküpper [this message]
2018-09-03 12:34             ` Kevin Shanahan
2018-09-03 21:34             ` Siegfried Metz
2018-09-04 13:44             ` Niklas Cassel
2018-09-05  8:41               ` [PATCH] clocksource: Revert "Remove kthread" Peter Zijlstra
2018-09-06 10:46                 ` [tip:timers/urgent] " tip-bot for Peter Zijlstra
2018-09-06 21:42                 ` tip-bot for Peter Zijlstra