From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754697AbbE1UJy (ORCPT <rfc822;w@1wt.eu>);
	Thu, 28 May 2015 16:09:54 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:57927 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754006AbbE1UJp (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 28 May 2015 16:09:45 -0400
Date: Thu, 28 May 2015 13:09:44 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: Petr Mladek <pmladek@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Dave Anderson <anderson@redhat.com>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Kay Sievers <kay@vrfy.org>, Jiri Kosina <jkosina@suse.cz>,
        Michal Hocko <mhocko@suse.cz>, Jan Kara <jack@suse.cz>,
        linux-kernel@vger.kernel.org, Wang Long <long.wanglong@huawei.com>,
        peifeiyue@huawei.com, dzickus@redhat.com, morgan.wang@huawei.com,
        sasha.levin@oracle.com
Subject: Re: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI
Message-Id: <20150528130944.9dde0f591a18d656f2a7c519@linux-foundation.org>
In-Reply-To: <20150528135054.GF3135@pathway.suse.cz>
References: <1432557993-20458-1-git-send-email-pmladek@suse.cz>
	<1432557993-20458-3-git-send-email-pmladek@suse.cz>
	<20150527161409.c1c37d25af1c09c340bd04a3@linux-foundation.org>
	<20150528135054.GF3135@pathway.suse.cz>
X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 28 May 2015 15:50:54 +0200 Petr Mladek <pmladek@suse.cz> wrote:

> > > +{
> > > +	u64 start_time, current_time;
> > > +	int this_cpu = smp_processor_id();
> > > +
> > > +	/* no way if we are already locked on this CPU */
> > > +	if (logbuf_cpu == this_cpu)
> > > +		return 0;
> > > +
> > > +	/* try hard to get the lock but do not wait forever */
> > > +	start_time = cpu_clock(this_cpu);
> > > +	current_time = start_time;
> > > +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> > > +		if (raw_spin_trylock(&logbuf_lock))
> > > +			return 1;
> > > +		cpu_relax();
> > > +		current_time = cpu_clock(this_cpu);
> > > +	}
> > 
> > (Looks at the read_seqcount_retry() in
> > kernel/time/sched_clock.c:sched_clock())
> >
> > Running cpu_clock() in NMI context seems a generally bad idea.
> 
> I am sorry but this is too cryptic for me :-)
> read_seqcount_retry() looks safe to me under NMI.

hmpf.  If you guys say so...

Note that it's not just a matter of "safe to call from NMI context".
The above loop also assumes that cpu_clock() is *being updated* within
the context of a single NMI.  Is that true/safe now and in the future?
Probably.  I didn't check all architectures but ARM looks OK at present.
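
To make the concern concrete, here is a minimal userspace sketch (not
kernel code, not tested against any tree) of a timeout loop which also
caps the number of iterations, so it still terminates if the clock never
advances while we spin.  try_lock_bounded(), the function-pointer types
and the demo helpers are made-up names for illustration, and the
iteration cap is an extra that the patch does not have.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types: stand-ins for cpu_clock() and raw_spin_trylock(). */
typedef uint64_t (*clock_fn)(void);
typedef bool (*trylock_fn)(void);

/*
 * Try to take a lock until max_delay_ns has elapsed according to now(),
 * but never loop more than max_iters times.  The second bound is what
 * saves us when now() returns the same value on every call.
 */
static bool try_lock_bounded(trylock_fn trylock, clock_fn now,
			     uint64_t max_delay_ns, unsigned long max_iters)
{
	uint64_t start = now();
	unsigned long i;

	for (i = 0; i < max_iters; i++) {
		if (trylock())
			return true;
		if (now() - start >= max_delay_ns)
			return false;	/* timed out by the clock */
	}
	return false;			/* timed out by the iteration cap */
}

/* Demo helpers (illustrative only). */
static uint64_t fake_ns;

static uint64_t stalled_clock(void)	/* a clock that never advances */
{
	return 42;
}

static uint64_t ticking_clock(void)	/* advances 1000ns per call */
{
	fake_ns += 1000;
	return fake_ns;
}

static bool never_lock(void)		/* lock is held by someone else */
{
	return false;
}

static bool always_lock(void)		/* lock is free */
{
	return true;
}
```

With stalled_clock() the time check can never fire, so only the
iteration cap ends the loop.  Picking a sensible cap is its own problem,
which is why I'd rather know that the clock really does advance.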

We should at least update Documentation/timers/timekeeping.txt: "a sane
value" becomes "the correct value", no alternatives.

> > There are many sites in kernel/printk/printk.c which take logbuf_lock,
> > but this patch only sets logbuf_cpu in one of those cases:
> > vprintk_emit().  I suggest adding helper functions to take/release
> > logbuf_lock.  And rename logbuf_lock to something else to ensure that
> > nobody accidentally takes the lock directly.
> 
> IMHO, vprintk_emit() is special. It is the only location where the
> lock is taken in NMI context. The other functions are used to dump
> @logbuf and are called in normal context.
> 
> try_logbuf_lock_in_nmi() could fail and we need to handle the error
> path. We do not need to do this in the other locations.
> 
> Note that we do not want to get the console in NMI because
> there are even more locks that might cause a deadlock.

Consider the case where a CPU has taken logbuf_lock within
devkmsg_read() and then receives an NMI, from which it calls
try_logbuf_lock_in_nmi():

> +/* We must be careful in NMI when we managed to preempt a running printk */
> +static int try_logbuf_lock_in_nmi(void)
> +{
> +	u64 start_time, current_time;
> +	int this_cpu = smp_processor_id();
> +
> +	/* no way if we are already locked on this CPU */
> +	if (logbuf_cpu == this_cpu)
> +		return 0;
> +
> +	/* try hard to get the lock but do not wait forever */
> +	start_time = cpu_clock(this_cpu);
> +	current_time = start_time;
> +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> +		if (raw_spin_trylock(&logbuf_lock))
> +			return 1;
> +		cpu_relax();
> +		current_time = cpu_clock(this_cpu);
> +	}
> +
> +	return 0;
> +}

devkmsg_read() never set logbuf_cpu, so the "already locked on this CPU"
check doesn't fire.  That CPU is now going to spin around for 100us and
then time out.
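
For illustration, here is a rough userspace model of the take/release
helpers suggested earlier, using C11 atomics in place of the raw
spinlock.  All of the names (logbuf_lock_acquire() and friends) are
hypothetical, and a spin count stands in for the cpu_clock() timeout.
The point is only that if every lock site goes through the helper, the
owner is always recorded and the NMI path on the owning CPU can bail out
at once.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER	(-1)

/* Userspace stand-ins for logbuf_lock and logbuf_cpu (illustrative). */
static atomic_flag logbuf_lock_model = ATOMIC_FLAG_INIT;
static atomic_int logbuf_owner_cpu = NO_OWNER;

/* Every normal-context lock site would call this instead of the raw lock. */
static void logbuf_lock_acquire(int cpu)
{
	while (atomic_flag_test_and_set_explicit(&logbuf_lock_model,
						 memory_order_acquire))
		;	/* spin until the lock is free */
	atomic_store(&logbuf_owner_cpu, cpu);
}

static void logbuf_lock_release(void)
{
	/* Clear the owner before dropping the lock. */
	atomic_store(&logbuf_owner_cpu, NO_OWNER);
	atomic_flag_clear_explicit(&logbuf_lock_model, memory_order_release);
}

/*
 * NMI-style attempt: fail at once if this CPU already owns the lock,
 * otherwise retry up to max_spins times.
 */
static bool logbuf_trylock_nmi(int cpu, unsigned int max_spins)
{
	unsigned int i;

	if (atomic_load(&logbuf_owner_cpu) == cpu)
		return false;	/* we interrupted our own critical section */

	for (i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set_explicit(&logbuf_lock_model,
						       memory_order_acquire)) {
			atomic_store(&logbuf_owner_cpu, cpu);
			return true;
		}
	}
	return false;
}
```

There is still a small window between taking the lock and recording the
owner.  An NMI which lands in that window falls back to the timeout, so
this narrows the problem rather than removing it.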