From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753869AbdDKVbo (ORCPT ); Tue, 11 Apr 2017 17:31:44 -0400
Received: from mail.kernel.org ([198.145.29.136]:43562 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753673AbdDKVbj (ORCPT ); Tue, 11 Apr 2017 17:31:39 -0400
Date: Tue, 11 Apr 2017 17:31:33 -0400
From: Steven Rostedt
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org
Subject: Re: There is a Tasks RCU stall warning
Message-ID: <20170411173133.52b28cfe@gandalf.local.home>
In-Reply-To: <20170411211802.GA19165@linux.vnet.ibm.com>
References: <20170411211802.GA19165@linux.vnet.ibm.com>
X-Mailer: Claws Mail 3.14.0 (GTK+ 2.24.31; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 11 Apr 2017 14:18:02 -0700
"Paul E. McKenney" wrote:

> Hello, Steve,
>
> Network connectivity issues... :-/
>
> There is already a Tasks RCU stall warning, but it waits for
> ten -minutes- before complaining. You can change this with the
> rcupdate.rcu_task_stall_timeout kernel boot parameter, if you wish.
>
> Or did you wait longer than ten minutes?
>

Remember when I said I had a bunch of debugging turned on in this
kernel. Well, I also included my ftrace trace_event benchmark code.
Which appears to be what is triggering this.

[ 205.159860] INFO: rcu_tasks detected stalls on tasks:
[ 205.165042] ffff8800c40450c0: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
[ 205.171627] event_benchmark R running task 30224 1113 2 0x10000000
[ 205.178829] Call Trace:
[ 205.181379] __schedule+0x574/0x1210
[ 205.185061] ? __schedule+0x574/0x1210
[ 205.188917] ? mark_held_locks+0x23/0xc0
[ 205.192944] ? mark_held_locks+0x23/0xc0
[ 205.196964] ? retint_kernel+0x2d/0x2d
[ 205.200831] ? retint_kernel+0x2d/0x2d
[ 205.204679] ? trace_hardirqs_on_caller+0x182/0x280
[ 205.209660] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 205.214380] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 205.219097] irq_exit+0x91/0x100
[ 205.222418] ? retint_kernel+0x2d/0x2d
[ 205.226263] ? retint_kernel+0x2d/0x2d
[ 205.230113] ? kthread_should_stop+0x3d/0x60
[ 205.234486] ? __asan_load8+0x11/0x70
[ 205.238245] ? ring_buffer_record_is_on+0x11/0x20
[ 205.243052] ? tracing_is_on+0x15/0x30
[ 205.246895] ? benchmark_event_kthread+0x4f/0x3c0
[ 205.251708] ? kthread+0x178/0x1d0
[ 205.255200] ? trace_benchmark_reg+0x80/0x80
[ 205.259563] ? kthread_create_on_node+0xa0/0xa0
[ 205.264205] ? ret_from_fork+0x2e/0x40

The thread gets created when I enable the benchmark tracepoint. It just
so happens that my test enables *all* tracepoints, which would of
course include this one as well.

I'll have to look at this code to see why it is getting missed.

-- Steve
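
For anyone wanting to shorten the ten-minute wait mentioned in the quoted text, here is a minimal sketch of the arithmetic for rcupdate.rcu_task_stall_timeout. It assumes the parameter is given in jiffies, which is my understanding of the kernel-parameters documentation for kernels of this era; please check the documentation for your own kernel. The HZ values are assumptions (use your kernel's CONFIG_HZ), and the helper names are hypothetical, not kernel APIs.

```python
def stall_timeout_jiffies(seconds: float, hz: int) -> int:
    """Convert a timeout in seconds to jiffies for a kernel built with CONFIG_HZ=hz.

    Assumption: rcupdate.rcu_task_stall_timeout is expressed in jiffies.
    """
    if hz <= 0:
        raise ValueError("hz must be positive")
    return int(seconds * hz)


def boot_param(seconds: float, hz: int) -> str:
    """Build the kernel command-line fragment for the requested timeout."""
    return f"rcupdate.rcu_task_stall_timeout={stall_timeout_jiffies(seconds, hz)}"


if __name__ == "__main__":
    # Ten minutes at an assumed HZ=1000, the wait described in the quoted text.
    print(stall_timeout_jiffies(600, 1000))
    # A 30-second timeout, again at an assumed HZ=1000.
    print(boot_param(30, 1000))
```

With HZ=1000 the second call yields rcupdate.rcu_task_stall_timeout=30000, which would go on the kernel command line.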