From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932287AbbELKiZ (ORCPT ); Tue, 12 May 2015 06:38:25 -0400
Received: from casper.infradead.org ([85.118.1.10]:35251 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752910AbbELKiV (ORCPT );
	Tue, 12 May 2015 06:38:21 -0400
Date: Tue, 12 May 2015 12:38:05 +0200
From: Peter Zijlstra
To: Ingo Molnar
Cc: Chris Metcalf, Gilad Ben Yossef, Steven Rostedt, Ingo Molnar,
	Andrew Morton, Rik van Riel, Tejun Heo, Thomas Gleixner,
	Frederic Weisbecker, "Paul E. McKenney", Christoph Lameter,
	"Srivatsa S. Bhat", linux-doc@vger.kernel.org,
	linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/6] nohz: support PR_DATAPLANE_QUIESCE
Message-ID: <20150512103805.GJ21418@twins.programming.kicks-ass.net>
References: <1431107927-13998-1-git-send-email-cmetcalf@ezchip.com>
	<1431107927-13998-5-git-send-email-cmetcalf@ezchip.com>
	<20150512093349.GH21418@twins.programming.kicks-ass.net>
	<20150512095030.GD11477@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150512095030.GD11477@gmail.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 12, 2015 at 11:50:30AM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra wrote:
>
> > On Fri, May 08, 2015 at 01:58:45PM -0400, Chris Metcalf wrote:
> > > This prctl() flag for PR_SET_DATAPLANE sets a mode that requires the
> > > kernel to quiesce any pending timer interrupts prior to returning
> > > to userspace. When running with this mode set, sys calls (and page
> > > faults, etc.) can be inordinately slow. However, user applications
> > > that want to guarantee that no unexpected interrupts will occur
> > > (even if they call into the kernel) can set this flag to guarantee
> > > that semantics.
> >
> > Currently people hot-unplug and hot-plug the CPU to do this.
> > Obviously that's a wee bit horrible :-)
> >
> > Not sure if a prctl like this is any better though. This is a CPU
> > property not a process one.
>
> So if then a prctl() (or other system call) could be a shortcut to:
>
>  - move the task to an isolated CPU
>  - make sure there _is_ such an isolated domain available
>
> I.e. have some programmatic, kernel provided way for an application to
> be sure it's running in the right environment. Relying on random
> administration flags here and there won't cut it.

No, we already have sched_setaffinity() and we should not duplicate its
ability to move tasks about.

What this is about is 'clearing' CPU state, it's nothing to do with
tasks.

Ideally we'd never have to clear the state because it should be
impossible to get into this predicament in the first place.

The typical example here is a periodic timer that found its way onto the
CPU and stays there. We're actually working on allowing such self-arming
timers to migrate, so once we have that sorted this could be fixed
proper I think.

Not sure if there's more pollution that people worry about.

The hotplug hack worked because unplug force migrates the timers away.
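
For reference, a minimal userspace sketch of the "move the task" half
that sched_setaffinity() already covers. It uses Python's
os.sched_setaffinity(), a thin wrapper around the same system call;
pin_to_cpu() is a made-up helper name, and refusing CPUs outside the
current mask is a conservative choice of this sketch, not something the
system call requires. Note that this only moves the task: it does nothing
about timers already queued on the target CPU, which is the 'clearing'
problem discussed above.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the caller) to a single CPU.

    Hypothetical helper for illustration. It only changes where the task
    may run; it does not migrate timers or other per-CPU state.
    """
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        # Sketch-only policy: stay within the mask we already have.
        raise ValueError(f"CPU {cpu} not in current affinity mask {sorted(allowed)}")
    os.sched_setaffinity(pid, {cpu})
    # Read the mask back so the caller can confirm what the kernel applied.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Illustration only: pick the highest-numbered CPU we may run on.
    target = max(os.sched_getaffinity(0))
    print(pin_to_cpu(target))
```

Whether the chosen CPU is actually isolated (isolcpus, nohz_full, cpusets)
remains an administrative matter that this sketch does not check.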