From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752582AbcHLJqK (ORCPT <rfc822;w@1wt.eu>);
	Fri, 12 Aug 2016 05:46:10 -0400
Received: from mx2.suse.de ([195.135.220.15]:42705 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752203AbcHLJoz (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Fri, 12 Aug 2016 05:44:55 -0400
Date: Fri, 12 Aug 2016 11:44:47 +0200
From: Petr Mladek <pmladek@suse.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Jan Kara <jack@suse.cz>, Andrew Morton <akpm@linux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
        Jan Kara <jack@suse.com>, Tejun Heo <tj@kernel.org>,
        Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Byungchul Park <byungchul.park@lge.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
        vlevenetz@mm-sol.com, Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH v10 1/2] printk: Make printk() completely async
Message-ID: <20160812094447.GD7339@pathway.suse.cz>
References: <1459789048-1337-1-git-send-email-sergey.senozhatsky@gmail.com>
 <1459789048-1337-2-git-send-email-sergey.senozhatsky@gmail.com>
 <20160404155149.a3e3307def2d1315e2099c63@linux-foundation.org>
 <20160406082758.GA3554@quack.suse.cz>
 <CAOh2x==kfRXeN_OGL822izfDEZ66uvTT=uxBaiwUdC3ca6O-dQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAOh2x==kfRXeN_OGL822izfDEZ66uvTT=uxBaiwUdC3ca6O-dQ@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 2016-08-10 14:17:55, Viresh Kumar wrote:
> +Vladi/Greg,
> 
> On Wed, Apr 6, 2016 at 1:27 AM, Jan Kara <jack@suse.cz> wrote:
> > On Mon 04-04-16 15:51:49, Andrew Morton wrote:
> 
> >> > +static int __init init_printk_kthread(void)
> >> > +{
> >> > +   struct task_struct *thread;
> >> > +
> >> > +   if (printk_sync)
> >> > +           return 0;
> >> > +
> >> > +   thread = kthread_run(printk_kthread_func, NULL, "printk");
> >>
> >> This gets normal scheduling policy, so a spinning userspace SCHED_FIFO
> >> task will block printk for ever.  This seems bad.
> >
> > I have to research this a bit but won't the SCHED_FIFO task that has
> > potentially unbounded amount of work lockup the CPU even though it does
> > occasional cond_resched()?
> 
> We are facing complete hogs because of the printk thread being a SCHED_FIFO
> task and have this patch to fix it up for now.
> 
> Author: Vladislav Levenetz <vblagoev@mm-sol.com>
> Date:   Wed Aug 10 13:58:00 2016 -0700
> 
>     SW-7786: printk: Lower the priority of printk thread
> 
>     Flooding the console (with a test module) in a tight loop indefinitely
>     makes android user interface very sluggish. Opening YouTube app and the
>     device hangs and becomes even more unresponsive to the point it
>     completely hangs.
> 
>     The asynchronous printk thread is a SCHED FIFO thread with priority
>     MAX_RT_PRIO - 1. If we create it as a simple thread (i.e. no SCHED FIFO)
>     instead, we observe much better performance using the same printk flood
>     test. We don't even notice any kind of sluggishness during device usage.
>     We can play a YouTube clip smoothly and use the device normally in
>     general.  The kernel log looks fine as well, as the flood of messages
>     continue normally.
> 
>     Signed-off-by: Vladislav Levenetz <vblagoev@mm-sol.com>
>     Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  kernel/printk/printk.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index c32872872cb6..ad5b30e5e6d9 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2856,9 +2856,6 @@ static int printk_kthread_func(void *data)
>  static int __init_printk_kthread(void)
>  {
>         struct task_struct *thread;
> -       struct sched_param param = {
> -               .sched_priority = MAX_RT_PRIO - 1,
> -       };
> 
>         if (!printk_kthread_can_run || printk_sync || printk_kthread)
>                 return 0;
> @@ -2870,7 +2867,6 @@ static int __init_printk_kthread(void)
>                 return PTR_ERR(thread);
>         }
> 
> -       sched_setscheduler(thread, SCHED_FIFO, &param);
>         printk_kthread = thread;
>         return 0;
>  }

IMHO, this is fine. We force the synchronous mode in critical
situations anyway.

But I was curious if we could hit a printk from the wake_up_process().
The change above causes using the fair scheduler and there is
the following call chain [*]

  vprintk_emit()
  -> wake_up_process()
   -> try_to_wake_up()
    -> ttwu_queue()
     -> ttwu_do_activate()
      -> ttwu_activate()
       -> activate_task()
	-> enqueue_task()
	 -> enqueue_task_fair()	    via p->sched_class->enqueue_task
	  -> cfs_rq_of()
	   -> task_of()
	    -> WARN_ON_ONCE(!entity_is_task(se))

We should never trigger this because printk_kthread is a task.
But what if the date gets inconsistent?

Then there is the following chain:

  vprintk_emit()
  -> wake_up_process()
   -> try_to_wake_up()
    -> ttwu_queue()
     -> ttwu_do_activate()
      -> ttwu_activate()
       -> activate_task()
	-> enqueue_task()
	 -> enqueue_task_fair()	    via p->sched_class->enqueue_task
	  ->hrtick_update()
	   -> hrtick_start_fair()
	    -> WARN_ON(task_rq(p) != rq)

This looks like another paranoid consistency check that might be
triggered when the scheduler gets messed.

I see few possible solutions:

1. Replace the WARN_ONs by printk_deferred().

   This is the usual solution but it would make debugging less convenient.


2. Force synchronous printk inside WARN()/BUG() macros.

   This would make sense even from other reasons. These are printed
   when the system is in a strange state. There is no guarantee that
   the printk_kthread will get scheduled.


3. Force printk_deferred() inside WARN()/BUG() macros via the per-CPU
   printk_func.

   It might be elegant. But we do not want this outside the scheduler
   code. Therefore we would need special variants of  WARN_*_SCHED()
   BUG_*_SCHED() macros.


I personally prefer the 2nd solution. What do you think about it,
please?


Best Regards,
Petr