From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933376AbcKXQ61 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 24 Nov 2016 11:58:27 -0500
Received: from mx2.suse.de ([195.135.220.15]:43036 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1756053AbcKXQ60 (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 24 Nov 2016 11:58:26 -0500
Date: Thu, 24 Nov 2016 17:58:21 +0100
From: Petr Mladek <pmladek@suse.com>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, Jan Kara <jack@suse.cz>,
        Tejun Heo <tj@kernel.org>, Calvin Owens <calvinowens@fb.com>,
        Thomas Gleixner <tglx@linutronix.de>,
        Mel Gorman <mgorman@techsingularity.net>,
        Steven Rostedt <rostedt@goodmis.org>, Ingo Molnar <mingo@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Laura Abbott <labbott@redhat.com>, Andy Lutomirski <luto@kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Kees Cook <keescook@chromium.org>, linux-kernel@vger.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq
 buffer
Message-ID: <20161124165821.GG24103@pathway.suse.cz>
References: <20161027154933.1211-1-sergey.senozhatsky@gmail.com>
 <20161027154933.1211-4-sergey.senozhatsky@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161027154933.1211-4-sergey.senozhatsky@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 2016-10-28 00:49:30, Sergey Senozhatsky wrote:
> This patch extends the idea of NMI per-cpu buffers to regions
> that may cause recursive printk() calls and possible deadlocks.
> Namely, printk() can't handle printk calls from schedule code
> or printk() calls from lock debugging code (spin_dump() for instance);
> because those may be called with `sem->lock' already taken or any
> other `critical' locks (p->pi_lock, etc.). An example of deadlock
> can be
> 
>  vprintk_emit()
>   console_unlock()
>    up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
>     wake_up_process()
>      try_to_wake_up()
>       ttwu_queue()
>        ttwu_activate()
>         activate_task()
>          enqueue_task()
>           enqueue_task_fair()
>            cfs_rq_of()
>             task_of()
>              WARN_ON_ONCE(!entity_is_task(se))
>               vprintk_emit()
>                console_trylock()
>                 down_trylock()
>                  raw_spin_lock_irqsave(&sem->lock, flags)
>                  ^^^^ deadlock
> 
> and some other cases.
> 
> Just like in NMI implementation, the solution uses a per-cpu
> `printk_func' pointer to 'redirect' printk() calls to a 'safe'
> callback, that stores messages in a per-cpu buffer and flushes
> them back to logbuf buffer later.
> 
> Usage example:
> 
>  printk()
>   printk_safe_enter(flags)
>   //
>   //  any printk() call from here will end up in vprintk_safe(),
>   //  that stores messages in a special per-CPU buffer.
>   //
>   printk_safe_exit(flags)
> 
> The 'redirection' mechanism, though, has been reworked, as suggested
> by Petr Mladek. Instead of using a per-cpu @print_func callback we now
> keep a per-cpu printk-context variable and call either default or nmi
> vprintk function depending on its value. printk_nmi_enter/exit and
> printk_safe_enter/exit, thus, just set/clear corresponding bits in
> printk-context functions.
> 
> The patch only adds printk_safe support, we don't use it yet.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

> diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
> index 7fd2838..87c784b 100644
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
>  #endif /* CONFIG_PRINTK_NMI */
> +
> +#ifdef CONFIG_PRINTK
> +
> +#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
> +#define PRINTK_SAFE_NMI_CONTEXT_MASK	0x80000000

What about the shorter name PRINTK_NMI_CONTEXT_MASK?

> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index 1f66163..af74d9c 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -50,27 +49,26 @@ struct printk_safe_seq_buf {
>  	struct irq_work		work;	/* IRQ work that flushes the buffer */
>  	unsigned char		buffer[SAFE_LOG_BUF_LEN];
>  };
> +
> +static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
> +static DEFINE_PER_CPU(int, printk_safe_context);

I would personally use the shorter name "printk_context". It is a
generic value; zero means normal context. There is also an idea to
add a KDB context that would use its own vprintk_kdb() implementation
and would not use the printk_safe buffer.
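
To illustrate the generic-context idea, here is a rough userspace sketch.
It is untested against the kernel tree. The per-CPU variable is replaced
by a plain global, the safe part is assumed to be a nesting counter in
the low 31 bits, and printk_pick_target() and enum printk_target are
hypothetical names that do not exist in the patch:

```c
/* Masks as in the patch, with the shorter NMI name suggested above. */
#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffffu
#define PRINTK_NMI_CONTEXT_MASK		0x80000000u

/* Stand-in for the per-CPU variable; zero means normal context. */
static unsigned int printk_context;

/* Hypothetical: which vprintk implementation would handle the call. */
enum printk_target { TARGET_DEFAULT, TARGET_SAFE, TARGET_NMI };

static enum printk_target printk_pick_target(void)
{
	/* NMI wins even when it interrupts a printk_safe section. */
	if (printk_context & PRINTK_NMI_CONTEXT_MASK)
		return TARGET_NMI;
	if (printk_context & PRINTK_SAFE_CONTEXT_MASK)
		return TARGET_SAFE;
	return TARGET_DEFAULT;
}

/* Assumption: safe sections may nest, so count them. */
static void printk_safe_enter(void) { printk_context++; }
static void printk_safe_exit(void)  { printk_context--; }

/* NMIs are not nested, so a single bit is enough. */
static void printk_nmi_enter(void) { printk_context |= PRINTK_NMI_CONTEXT_MASK; }
static void printk_nmi_exit(void)  { printk_context &= ~PRINTK_NMI_CONTEXT_MASK; }
```

A KDB context would then be just one more bit and one more branch in
the dispatch, without touching the printk_safe buffer.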


> +#ifdef CONFIG_PRINTK_NMI
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
> +atomic_t nmi_message_lost;
> +#endif
>  
> -/*
> - * Safe printk() for NMI context. It uses a per-CPU buffer to
> - * store the message. NMIs are not nested, so there is always only
> - * one writer running. But the buffer might get flushed from another
> - * CPU, so we need to be careful.
> - */
> -static int vprintk_safe_nmi(const char *fmt, va_list args)
> +static int printk_safe_log_store(struct printk_safe_seq_buf *s,
> +		const char *fmt, va_list args)
>  {
> -	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
> -	int add = 0;
> +	int add;
>  	size_t len;
>  
>  again:
>  	len = atomic_read(&s->len);
>  
> -	if (len >= sizeof(s->buffer)) {
> -		atomic_inc(&nmi_message_lost);
> -		return 0;
> -	}
> +	if (len >= sizeof(s->buffer))
> +		return -E2BIG;

E2BIG means "argument list too long" and does not fit well here.
I would suggest using -ENOSPC. It is not ideal either, but it fits
slightly better.
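
A minimal userspace sketch of the full-buffer check with -ENOSPC, only to
show the intended return values. struct seq_buf_sketch and
log_store_sketch() are hypothetical, the buffer is tiny on purpose, and
plain vsnprintf() stands in for the kernel helpers. Because vsnprintf()
always keeps one byte for the trailing NUL, the sketch treats the buffer
as full one byte early, which differs from the check in the patch:

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define SAFE_LOG_BUF_LEN 32	/* hypothetical, much smaller than the real one */

struct seq_buf_sketch {
	size_t	len;				/* bytes already stored */
	char	buffer[SAFE_LOG_BUF_LEN];
};

/* Returns the number of stored characters or a negative errno. */
static int log_store_sketch(struct seq_buf_sketch *s, const char *fmt, ...)
{
	size_t room;
	va_list args;
	int add;

	/* No room left for even one character plus the NUL. */
	if (s->len >= sizeof(s->buffer) - 1)
		return -ENOSPC;

	room = sizeof(s->buffer) - s->len;

	va_start(args, fmt);
	add = vsnprintf(s->buffer + s->len, room, fmt, args);
	va_end(args);

	if (add < 0)
		return -EINVAL;

	/* vsnprintf() reports the untruncated length; count what fit. */
	if ((size_t)add >= room)
		add = (int)(room - 1);

	s->len += (size_t)add;
	return add;
}
```

With this, a truncated message still reports what was stored, and only
a completely full buffer is reported as -ENOSPC.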

> +/*
> + * Lockless printk(), to avoid deadlocks should the printk() recurse
> + * into itself. It uses a per-CPU buffer to store the message, just like
> + * NMI.
> + */
> +static int vprintk_safe(const char *fmt, va_list args)
> +{
> +	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
> +
> +	return printk_safe_log_store(s, fmt, args);

We should return zero if printk_safe_log_store() returns an error.
I know this gets fixed in the next patch, but we should do at least
a minimal sanity check here so that the series stays bisectable.
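
Something along these lines is what I mean, shown as a standalone sketch
rather than a kernel patch. fake_log_store() is a hypothetical stand-in
for printk_safe_log_store() and vprintk_safe_sketch() for vprintk_safe();
the integer parameters replace the real buffer and format arguments:

```c
#include <errno.h>

/*
 * Hypothetical stand-in for printk_safe_log_store(): returns the number
 * of stored characters, or a negative errno when the buffer is full.
 */
static int fake_log_store(int free_space, int msg_len)
{
	if (free_space <= 0)
		return -ENOSPC;

	return msg_len < free_space ? msg_len : free_space;
}

/* Sketch of the wrapper: never leak a negative errno to the caller. */
static int vprintk_safe_sketch(int free_space, int msg_len)
{
	int ret = fake_log_store(free_space, msg_len);

	/* On error, report that nothing was printed. */
	return ret < 0 ? 0 : ret;
}
```

This keeps the return value sane at every commit of the series, even
before the next patch reworks the error handling.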


Best Regards,
Petr