From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751657AbcGNWM5 (ORCPT <rfc822;w@1wt.eu>);
	Thu, 14 Jul 2016 18:12:57 -0400
Received: from mail-pf0-f172.google.com ([209.85.192.172]:33647 "EHLO
	mail-pf0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750893AbcGNWMy (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 14 Jul 2016 18:12:54 -0400
Date: Thu, 14 Jul 2016 15:12:51 -0700
From: Viresh Kumar <viresh.kumar@linaro.org>
To: Jan Kara <jack@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
        Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, rjw@rjwysocki.net,
        Tejun Heo <tj@kernel.org>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        vlevenetz@mm-sol.com, vaibhav.hiremath@linaro.org,
        alex.elder@linaro.org, johan@kernel.org, akpm@linux-foundation.org,
        rostedt@goodmis.org, linux-pm@vger.kernel.org,
        Petr Mladek <pmladek@suse.com>, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [Query] Preemption (hogging) of the work handler
Message-ID: <20160714221251.GE3057@ubuntu>
References: <20160701165959.GR12473@ubuntu>
 <20160701172232.GD28719@htj.duckdns.org>
 <20160706182842.GS2671@ubuntu>
 <20160711102603.GI12410@quack2.suse.cz>
 <20160711154438.GA528@swordfish>
 <20160711223501.GI4695@ubuntu>
 <20160712231903.GR4695@ubuntu>
 <20160713054507.GA563@swordfish>
 <20160714141216.GC13151@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160714141216.GC13151@quack2.suse.cz>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 14-07-16, 16:12, Jan Kara wrote:
> Exactly. Calling printk() from certain parts of the kernel (like scheduler
> code or timer code) has always been unsafe because printk itself uses these
> parts and so it can lead to deadlocks. That's why printk_deferred() has
> been introduced as you mention below.
> 
> And with sync printk the above deadlock doesn't trigger only by chance - if
> there happened to be a waiter on console_sem while we suspend, the same
> deadlock would trigger because up(&console_sem) will try to wake him up and
> the warning in timekeeping code will cause recursive printk.
> 
> So I think your patch doesn't really address the real issue - it only
> works around the particular WARN_ON(timekeeping_suspended) warning but if
> there was a different warning in timekeeping code which would trigger, it
> has a potential for causing recursive printk deadlock (and indeed we had
> such issues previously - see e.g. 504d58745c9c "timer: Fix lock inversion
> between hrtimer_bases.lock and scheduler locks").
> 
> So there are IMHO two issues here worth looking at:
> 
> 1) I didn't find how a wakeup would lead to a call to ktime_get() in
> the current upstream kernel or even current RT kernel. Maybe this is a
> problem specific to the 3.10 kernel you are using? If yes, we don't have to
> do anything for current upstream AFAIU.

I hadn't checked that earlier, but I see the path in both 3.10 and mainline.

vprintk_emit
 -> wake_up_process
  -> try_to_wake_up
   -> ttwu_queue
    -> ttwu_do_activate
     -> ttwu_activate
      -> activate_task
       -> enqueue_task (sched/core.c)
        -> enqueue_task_rt (rt.c)
         -> enqueue_rt_entity
          -> __enqueue_rt_entity
           -> inc_rt_tasks
            -> inc_rt_group
             -> start_rt_bandwidth
              -> start_bandwidth_timer
               -> __hrtimer_start_range_ns
                -> ktime_get()
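
The chain above can be modelled in user space to see why a warning inside
ktime_get() feeds back into printk. This is only a toy sketch in Python, not
kernel code: the ToyKernel class, the wakes_rt_task flag and the MAX_DEPTH
guard are assumptions made for illustration, and the intermediate scheduler
calls are collapsed into a single step.

```python
class ToyKernel:
    """Toy model of the printk -> wakeup -> ktime_get() -> WARN loop."""

    MAX_DEPTH = 3  # hypothetical guard so the model terminates

    def __init__(self, timekeeping_suspended, wakes_rt_task):
        self.timekeeping_suspended = timekeeping_suspended
        self.wakes_rt_task = wakes_rt_task  # assumption: an RT task is woken
        self.trace = []
        self.depth = 0

    def printk(self, msg):
        self.depth += 1
        self.trace.append(f"printk[{self.depth}]: {msg}")
        if self.depth >= self.MAX_DEPTH:
            # The model stops here; a real kernel has no such escape hatch.
            self.trace.append("recursion guard hit")
        elif self.wakes_rt_task:
            self.wake_up_process()
        self.depth -= 1

    def wake_up_process(self):
        # Stands in for try_to_wake_up ... start_rt_bandwidth ... hrtimer start.
        self.trace.append("wake_up_process -> ... -> start_rt_bandwidth")
        self.ktime_get()

    def ktime_get(self):
        self.trace.append("ktime_get")
        if self.timekeeping_suspended:
            # Models WARN_ON(timekeeping_suspended): the warning itself prints.
            self.printk("WARNING: timekeeping_suspended")
        return 0


if __name__ == "__main__":
    k = ToyKernel(timekeeping_suspended=True, wakes_rt_task=True)
    k.printk("suspending")
    print("\n".join(k.trace))
```

With timekeeping_suspended set to False the trace stops after the first
ktime_get entry. With it set to True every warning re-enters printk until the
artificial guard cuts the recursion off.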

> If I just missed how wakeup can call into ktime_get() in current upstream,
> there is another question:
> 
> 2) Is it OK that printk calls wakeup so late during suspend?

To clarify again for everybody, we are talking about the point where all non-boot
CPUs are already hot-unplugged and the last running one has disabled interrupts.

I believe that we can't do migration at all now, right? What would we gain by
calling wake_up_process() at this point anyway?
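
One way to picture the question: if the woken task cannot run until resume
anyway, the wakeup could in principle be recorded and replayed later, similar
in spirit to how printk_deferred() postpones its console work. The sketch
below is a toy Python model under that assumption, not a proposed kernel
change. The DeferredWakeups class and its late_suspend flag are hypothetical
names.

```python
class DeferredWakeups:
    """Toy model: queue wakeups during late suspend, replay them on resume."""

    def __init__(self):
        self.late_suspend = False  # hypothetical "IRQs off, one CPU left" flag
        self.pending = []          # wakeups recorded but not yet performed
        self.woken = []            # wakeups actually performed, in order

    def wake_up(self, task):
        """Return True if the task was woken now, False if it was deferred."""
        if self.late_suspend:
            self.pending.append(task)
            return False
        self.woken.append(task)
        return True

    def resume(self):
        """Leave late suspend and perform every deferred wakeup."""
        self.late_suspend = False
        flushed, self.pending = self.pending, []
        self.woken.extend(flushed)
        return flushed


if __name__ == "__main__":
    w = DeferredWakeups()
    w.late_suspend = True
    w.wake_up("printk_kthread")  # deferred, so no scheduler or timer code runs
    print(w.resume())
```

In this model nothing on the wakeup path executes while late_suspend is set,
so no timekeeping code is reached until resume() runs.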

> I believe it
> is but I'm neither a scheduler nor a suspend expert. If it is OK, and wakeup
> can lead to ktime_get() in current upstream, then this contradicts the
> check WARN_ON(timekeeping_suspended) in ktime_get() and something is wrong.
> 
> Adding Thomas to CC as timer / RT expert...

Thanks.

-- 
viresh