From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757779AbcDGVhX (ORCPT <rfc822;w@1wt.eu>);
	Thu, 7 Apr 2016 17:37:23 -0400
Received: from mx2.suse.de ([195.135.220.15]:34482 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751673AbcDGVhV (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 7 Apr 2016 17:37:21 -0400
Date: Thu, 7 Apr 2016 23:37:19 +0200 (CEST)
From: Jiri Kosina <jikos@kernel.org>
X-X-Sender: jkosina@pobox.suse.cz
To: Jessica Yu <jeyu@redhat.com>
cc: Josh Poimboeuf <jpoimboe@redhat.com>, Miroslav Benes <mbenes@suse.cz>,
        linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
        Vojtech Pavlik <vojtech@suse.com>
Subject: Re: sched: horrible way to detect whether a task has been
 preempted
In-Reply-To: <20160407211525.GB25804@packer-debian-8-amd64.digitalocean.com>
Message-ID: <alpine.LNX.2.00.1604072330270.27368@cbobk.fhfr.pm>
References: <cover.1458933243.git.jpoimboe@redhat.com> <24db5a6ae5b63dfcd2096a12d18e1399a351348e.1458933243.git.jpoimboe@redhat.com> <20160407211525.GB25804@packer-debian-8-amd64.digitalocean.com>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 7 Apr 2016, Jessica Yu wrote:

> Been sort of rattling my head over the scheduler code :-) Just following 
> the calls in and out of __schedule() it doesn't look like there is a 
> current flag/mechanism to tell whether or not a task has been 
> preempted..

Performing a complete stack unwind just to determine whether a task has 
been preempted involuntarily is slight overkill indeed :/

> Is there any reason why you didn't just create a new task flag, 
> something like TIF_PREEMPTED_IRQ, which would be set once 
> preempt_schedule_irq() is entered and unset after __schedule() returns 
> (for that task)? This would roughly correspond to setting the task flag 
> when the frame for preempt_schedule_irq() is pushed and unsetting it 
> just before the frame preempt_schedule_irq() is popped for that task. 
> This seems simpler than walking through all the frames just to see if 
> in_preempt_schedule_irq() had been called. Would that work?

Alternatively, without eating up a TIF_ bit, it'd be possible to push a 
magic value on top of the stack in preempt_schedule_irq() (and pop it 
once we are returning from there), and if such a magic value is detected, 
we just don't bother and claim unreliability.

That has the advantages of both approaches combined, i.e. it's relatively 
low-cost in terms of performance penalty, and it's reliable (in the sense 
that you don't have false positives).

The small disadvantage is that you can (very rarely, depending on the 
chosen magic) have false negatives. That probably doesn't hurt too much, 
given the high improbability and non-lethal consequences.
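
To make the idea concrete, here is a rough userspace sketch in C. It is 
not kernel code: the "stack" is a plain array, and PREEMPT_IRQ_MAGIC, 
struct sim_task and all sim_*() helpers are names made up for this 
illustration. A real implementation would have to find the marker at a 
known position relative to the sleeping task's saved stack pointer; this 
toy model simply looks at the top word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker value; not an existing kernel constant. */
#define PREEMPT_IRQ_MAGIC 0x5052454d50544951ULL

#define SIM_STACK_WORDS 64

/* Toy stand-in for a task's kernel stack (grows upward for simplicity). */
struct sim_task {
	uint64_t stack[SIM_STACK_WORDS];
	size_t sp;	/* number of words currently in use */
};

static void sim_push(struct sim_task *t, uint64_t v)
{
	assert(t->sp < SIM_STACK_WORDS);
	t->stack[t->sp++] = v;
}

static uint64_t sim_pop(struct sim_task *t)
{
	assert(t->sp > 0);
	return t->stack[--t->sp];
}

/* Models entry into preempt_schedule_irq(): leave the marker behind. */
static void sim_preempt_irq_enter(struct sim_task *t)
{
	sim_push(t, PREEMPT_IRQ_MAGIC);
}

/* Models the return path: remove the marker again. */
static void sim_preempt_irq_exit(struct sim_task *t)
{
	uint64_t v = sim_pop(t);

	assert(v == PREEMPT_IRQ_MAGIC);
	(void)v;	/* silence unused warning when built with NDEBUG */
}

/*
 * Checker: if the top word equals the marker, treat the stack as
 * unreliable. An ordinary data word that happens to equal the marker
 * is indistinguishable from a real one.
 */
static bool sim_stack_looks_preempted(const struct sim_task *t)
{
	return t->sp > 0 && t->stack[t->sp - 1] == PREEMPT_IRQ_MAGIC;
}
```

The checker is a single compare, which is where the low cost comes from. 
The rare mis-detection shows up if you sim_push() an ordinary word equal 
to PREEMPT_IRQ_MAGIC: the checker then reports a preemption that never 
happened, and the only consequence is that the stack is declared 
unreliable.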

How does that sound?

-- 
Jiri Kosina
SUSE Labs