Message-ID: <55A4271B.9040506@ezchip.com>
Date: Mon, 13 Jul 2015 17:01:15 -0400
From: Chris Metcalf
To: Andy Lutomirski
CC: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
 Andrew Morton, "Rik van Riel", Tejun Heo, Frederic Weisbecker,
 Thomas Gleixner, "Paul E. McKenney", Christoph Lameter, Viresh Kumar,
 "linux-doc@vger.kernel.org", Linux API, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH v4 1/5] nohz_full: add support for "cpu_isolated" mode
References: <1436817481-8732-1-git-send-email-cmetcalf@ezchip.com>
 <1436817481-8732-2-git-send-email-cmetcalf@ezchip.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/13/2015 04:40 PM, Andy Lutomirski wrote:
> On Mon, Jul 13, 2015 at 12:57 PM, Chris Metcalf wrote:
>> The existing nohz_full mode makes tradeoffs to minimize userspace
>> interruptions while still attempting to avoid overheads in the
>> kernel entry/exit path, to provide 100% kernel semantics, etc.
>>
>> However, some applications require a stronger commitment from the
>> kernel to avoid interruptions, in particular userspace device
>> driver style applications, such as high-speed networking code.
>>
>> This change introduces a framework to allow applications to elect
>> to have the stronger semantics as needed, specifying
>> prctl(PR_SET_CPU_ISOLATED, PR_CPU_ISOLATED_ENABLE) to do so.
>> Subsequent commits will add additional flags and additional
>> semantics.
> I thought the general consensus was that this should be the default
> behavior and that any associated bugs should be fixed.

I think it comes down to dividing the set of use cases in two:

- "Regular" nohz_full, as used to improve performance and limit
  interruptions, possibly for power benefits, etc.  But, stray
  interrupts are not particularly bad, and you don't want to take
  extreme measures to avoid them.

- What I'm calling "cpu_isolated" mode where when you return to
  userspace, you expect that by God, the kernel doesn't interrupt you
  again, and if it does, it's a flat-out bug.

There are a few things that cpu_isolated mode currently does to
accomplish its goals that are pretty heavy-weight:

Processes are held in kernel space until ticks are quiesced; this is
not necessarily what every nohz_full task wants.  If a task makes a
kernel call, there may well be arbitrary timer fallout, and having a
way to select whether or not you are willing to take a timer tick after
return to userspace is pretty important.

Likewise, there are things that you may want to do on return to
userspace that are designed to prevent further interruptions in
cpu_isolated mode, even at a possible future performance cost if and
when you return to the kernel, such as flushing the per-cpu free page
list so that you won't be interrupted by an IPI to flush it later.

If you're arguing that the cpu_isolated semantic is really the only
one that makes sense for nohz_full, my sense is that it might be
surprising to many of the folks who do nohz_full work.  But, I'm happy
to be wrong on this point, and maybe all the nohz_full community is
interested in making the same tradeoffs for nohz_full generally that
I've proposed in this patch series just for cpu_isolated?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com