From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933745AbcIOGhx (ORCPT <rfc822;w@1wt.eu>);
        Thu, 15 Sep 2016 02:37:53 -0400
Received: from mail-wm0-f66.google.com ([74.125.82.66]:35158 "EHLO
        mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1764099AbcIOGhw (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 15 Sep 2016 02:37:52 -0400
Date: Thu, 15 Sep 2016 08:37:47 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>
Cc: x86@kernel.org, Borislav Petkov <bp@alien8.de>,
        linux-kernel@vger.kernel.org, Brian Gerst <brgerst@gmail.com>,
        Jann Horn <jann@thejh.net>, Thomas Gleixner <tglx@linutronix.de>,
        "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 08/12] x86/dumpstack: Pin the target stack in
 save_stack_trace_tsk()
Message-ID: <20160915063747.GC13992@gmail.com>
References: <cover.1473801993.git.luto@kernel.org>
 <7cd5e328dc75d8ccd912e5783665de34503f7c63.1473801993.git.luto@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <7cd5e328dc75d8ccd912e5783665de34503f7c63.1473801993.git.luto@kernel.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Andy Lutomirski <luto@kernel.org> wrote:

> This will prevent a crash if the target task dies before or while
> dumping its stack once we start freeing task stacks early.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/kernel/stacktrace.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 4738f5e0f2ab..b3f32fbe3ba4 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -79,9 +79,14 @@ void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
>  
>  void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
>  {
> +	if (!try_get_task_stack(tsk))
> +		return;
> +
>  	dump_trace(tsk, NULL, NULL, 0, &save_stack_ops_nosched, trace);
>  	if (trace->nr_entries < trace->max_entries)
>  		trace->entries[trace->nr_entries++] = ULONG_MAX;
> +
> +	put_task_stack(tsk);
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);

So I very much like the first half of the series, the thread_info merge into 
task_struct (yay!) and I am in the process of applying those bits to -tip.

So the above not (yet) fully correct patch is in preparation to a later change you 
are doing in this series:

    [PATCH 11/12] sched: Free the stack early if CONFIG_THREAD_INFO_IN_TASK

But I am having serious second thoughts about the fact that we now have to sprinke 
non-obvious try_get_task_stack()/put_task_stack() pairs into debugging code that 
never really needed anything like that, which pairs are easy to get wrong, easy to 
miss - and generally if we miss them will it will result in a slightly buggy, very 
unpleasant, racy, crashy behavior of the kernel.

Can we just ... not do the delayed freeing?

I mean, the cache hotness arguments don't look overly convincing to me: if we just 
start not using a piece of formerly cache hot memory then those cache lines will 
eventually decay naturally in whatever LRU scheme the CPU is using and will be 
reused without much harm done. It's just "another day in CPU land" that happens 
all the time. Yes, we could reduce the latency of the freeing and thus slightly 
improve the locality of other bits ... but is it really measurable or noticeable 
in any fashion. It's like task/thread fork/clone/exit is _that_ much of a fast 
path.

And then there's also the whole CONFIG_THREAD_INFO_IN_TASK #ifdeffery spreading.

I.e I just don't see the justification for this kind of object life time assymetry 
and hard to debug fragility.

Am I missing something?

Thanks,

	Ingo