From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752077AbcGRWMG (ORCPT ); Mon, 18 Jul 2016 18:12:06 -0400
Received: from mail-vk0-f48.google.com ([209.85.213.48]:32833 "EHLO mail-vk0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751427AbcGRWMC (ORCPT ); Mon, 18 Jul 2016 18:12:02 -0400
MIME-Version: 1.0
In-Reply-To: <2a88da0c-b1a9-aec3-f5a6-d524e69e4731@mellanox.com>
References: <1468529299-27929-1-git-send-email-cmetcalf@mellanox.com> <2a88da0c-b1a9-aec3-f5a6-d524e69e4731@mellanox.com>
From: Andy Lutomirski
Date: Mon, 18 Jul 2016 15:11:42 -0700
Message-ID:
Subject: Re: [PATCH v13 00/12] support "task_isolation" mode
To: Chris Metcalf
Cc: Gilad Ben Yossef , Steven Rostedt , Ingo Molnar , Peter Zijlstra ,
 Andrew Morton , Rik van Riel , Tejun Heo , Frederic Weisbecker ,
 Thomas Gleixner , "Paul E. McKenney" , Christoph Lameter , Viresh Kumar ,
 Catalin Marinas , Will Deacon , Daniel Lezcano ,
 "linux-doc@vger.kernel.org" , Linux API , "linux-kernel@vger.kernel.org"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 14, 2016 at 2:22 PM, Chris Metcalf wrote:
> On 7/14/2016 5:03 PM, Andy Lutomirski wrote:
>>
>> On Thu, Jul 14, 2016 at 1:48 PM, Chris Metcalf
>> wrote:
>>>
>>> Here is a respin of the task-isolation patch set. This primarily
>>> reflects feedback from Frederic and Peter Z.
>>
>> I still think this is the wrong approach, at least at this point. The
>> first step should be to instrument things if necessary and fix the
>> obvious cases where the kernel gets entered asynchronously.
>
>
> Note, however, that the task_isolation_debug mode is a very convenient
> way of discovering what is going on when things do go wrong for task
> isolation.
>
>> Only once
>> there's a credible reason to believe it can work well should any form
>> of strictness be applied.
>
>
> I'm not sure what criteria you need for this, though. Certainly we've been
> shipping our version of task isolation to customers since 2008, and there
> are quite a few customer applications in production that are working well.
> I'd argue that's a credible reason.
>
>> As an example, enough vmalloc/vfree activity will eventually cause
>> flush_tlb_kernel_range to be called and *boom*, there goes your shiny
>> production dataplane application.
>
>
> Well, that's actually a refinement that I did not inflict on this patch
> series.

Submit it separately, perhaps?

The "kill the process if it goofs" thing, while there are known goofs
in the kernel, apparently with patches written but unsent, seems
questionable.