From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756311AbcDGWfO (ORCPT ); Thu, 7 Apr 2016 18:35:14 -0400
Received: from mx1.redhat.com ([209.132.183.28]:53611 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751706AbcDGWfN (ORCPT ); Thu, 7 Apr 2016 18:35:13 -0400
Date: Thu, 7 Apr 2016 17:35:11 -0500
From: Josh Poimboeuf
To: Jiri Kosina
Cc: Jessica Yu , Miroslav Benes , linux-kernel@vger.kernel.org,
	live-patching@vger.kernel.org, Vojtech Pavlik
Subject: Re: sched: horrible way to detect whether a task has been preempted
Message-ID: <20160407223511.k2shbxreuxhfvot6@treble.redhat.com>
References: <24db5a6ae5b63dfcd2096a12d18e1399a351348e.1458933243.git.jpoimboe@redhat.com>
	<20160407211525.GB25804@packer-debian-8-amd64.digitalocean.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 07, 2016 at 11:37:19PM +0200, Jiri Kosina wrote:
> On Thu, 7 Apr 2016, Jessica Yu wrote:
>
> > Been sort of rattling my head over the scheduler code :-) Just following
> > the calls in and out of __schedule() it doesn't look like there is a
> > current flag/mechanism to tell whether or not a task has been
> > preempted..
>
> Performing the complete stack unwind just to determine whether a task has
> been preempted non-voluntarily is a slight overkill indeed :/
>
> > Is there any reason why you didn't just create a new task flag,
> > something like TIF_PREEMPTED_IRQ, which would be set once
> > preempt_schedule_irq() is entered and unset after __schedule() returns
> > (for that task)? This would roughly correspond to setting the task flag
> > when the frame for preempt_schedule_irq() is pushed and unsetting it
> > just before the frame for preempt_schedule_irq() is popped for that task.
> > This seems simpler than walking through all the frames just to see if
> > in_preempt_schedule_irq() had been called. Would that work?
>
> Alternatively, without eating up a TIF_ space, it'd be possible to push a
> magic value on top of the stack in preempt_schedule_irq() (and pop it
> once we are returning from there), and if such a magic value is detected, we
> just don't bother and claim unreliability.
>
> That has the advantages of both approaches combined, i.e. it's relatively
> low-cost in terms of performance penalty, and it's reliable (in the sense
> that you don't have false positives).
>
> The small disadvantage is that you can (very rarely, depending on the
> chosen magic) have false negatives. That probably doesn't hurt too much,
> given the high improbability and non-lethal consequences.
>
> How does that sound?

To do that from C code, I guess we'd still need some arch-specific code
in an asm() statement to do the actual push?

I think I'd prefer just updating some field in the task_struct. That
way it would be simple and arch-independent. And the stack walker
wouldn't have to scan for some special value on the stack.

--
Josh
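
For illustration only, the task_struct idea above could be modelled in
userspace roughly as follows. This is a sketch, not a kernel patch: the
field name preempted_irq and every model_* function are hypothetical
stand-ins, and the observer callback merely simulates another context
inspecting the task while it is switched out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's task_struct. Only the
 * hypothetical per-task field discussed in this thread is modelled.
 */
struct task {
	bool preempted_irq;	/* hypothetical: set while in the IRQ preemption path */
};

/* Callback simulating another context looking at the preempted task. */
typedef void (*observer_fn)(const struct task *t, void *ctx);

/* Stand-in for __schedule(): the task is "switched out" while this runs. */
static void model_schedule(const struct task *prev, observer_fn observe,
			   void *ctx)
{
	if (observe)
		observe(prev, ctx);
}

/*
 * Stand-in for preempt_schedule_irq(): mark the task on entry and clear
 * the mark once the (modelled) scheduler call has returned.
 */
static void model_preempt_schedule_irq(struct task *t, observer_fn observe,
				       void *ctx)
{
	assert(t != NULL);
	t->preempted_irq = true;
	model_schedule(t, observe, ctx);
	t->preempted_irq = false;
}

/*
 * What a stack walker would do under this scheme: read one field
 * instead of unwinding or scanning the stack for a magic value.
 */
static bool model_stack_is_reliable(const struct task *t)
{
	return !t->preempted_irq;
}

/* Observer that records the walker's verdict taken mid-preemption. */
static void record_reliability(const struct task *t, void *ctx)
{
	*(bool *)ctx = model_stack_is_reliable(t);
}
```

Calling model_preempt_schedule_irq() with record_reliability as the
observer shows the walker reporting "unreliable" only while the task is
inside the modelled preemption path. No arch-specific push is involved,
which is the property argued for above.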