From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v7 07/11] arch/x86: enable task isolation functionality
To: Andy Lutomirski
References: <1443453446-7827-1-git-send-email-cmetcalf@ezchip.com>
 <1443453446-7827-8-git-send-email-cmetcalf@ezchip.com>
 <5609B7C0.3010807@ezchip.com>
CC: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
 Andrew Morton, Rik van Riel, Tejun Heo, Frederic Weisbecker,
 Thomas Gleixner, "Paul E. McKenney", Christoph Lameter, Viresh Kumar,
 Catalin Marinas, Will Deacon, "linux-kernel@vger.kernel.org",
 "H. Peter Anvin", X86 ML
From: Chris Metcalf
Message-ID: <560ACD6F.7060102@ezchip.com>
Date: Tue, 29 Sep 2015 13:42:07 -0400
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/28/2015 06:43 PM, Andy Lutomirski wrote:
> Why are we treating alarms as something that should defer entry to
> userspace?  I think it would be entirely reasonable to set an alarm
> for ten minutes, ask for isolation, and then think hard for ten
> minutes.
>
> A bigger issue would be if there's an RT task that asks for isolation
> and a bunch of other stuff (most notably KVM hosts) running with
> unconstrained affinity at full load.  If task_isolation_enter always
> sleeps, then your KVM host will get scheduled, and it'll ask for a
> user return notifier on the way out, and you might just loop forever.
> Can this happen?

task_isolation_enter() doesn't sleep - it spins.  This is intentional,
because the point is that there should be nothing else that could be
scheduled on that cpu.  We're just waiting for any pending kernel
management timer interrupts to fire.

In any case, you normally wouldn't have a KVM host running on an
isolcpus, nohz_full cpu, unless it was the only thing running there,
I imagine (just as would be true for any other host process).

> ISTM something's suboptimal with the inner workings of all this if
> task_isolation_enter needs to sleep to wait for an event that isn't
> scheduled for the immediate future (e.g. already queued up as an
> interrupt).

Scheduling a timer for 10 minutes away is typically done by scheduling
timers for the max timer granularity (which could be just a few
seconds) and then waking up a couple of hundred times between now and
now+10 minutes.  Doing this breaks the task isolation guarantee, so we
can't return to userspace while something like that is pending.  You'd
have to do it by polling in userspace to avoid the unexpected
interrupts.

I suppose if your hardware supported it, you could imagine a mode
where userspace can request an alarm a specific amount of time in the
future, and the task isolation code would then ignore an alarm that
was going off at that specific time.  But I'm not sure what hardware
does support that (I know tile uses the "few seconds and re-arm"
model), and it seems like a pretty corner use-case.  We could
certainly investigate adding such support later, but I don't see it as
part of the core value proposition for task isolation.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
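
For illustration only, here is a minimal userspace sketch of the two
points made in the reply above: how many re-arms a long alarm implies
under a "few seconds and re-arm" model, and what "polling in userspace"
for a deadline might look like.  None of this comes from the patch
series.  rearm_count(), now_ns() and poll_until() are hypothetical
helper names, the 3-second granularity used in the note below is an
assumed value, and whether clock_gettime() avoids entering the kernel
depends on the platform (on many Linux configurations it is serviced by
the vDSO, but that should be verified on the target).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: number of timer expirations needed to cover
 * total_s seconds when each timer can span at most max_granularity_s
 * seconds (rounded up). */
static unsigned rearm_count(unsigned total_s, unsigned max_granularity_s)
{
    assert(max_granularity_s > 0);
    return (total_s + max_granularity_s - 1) / max_granularity_s;
}

/* Read CLOCK_MONOTONIC as a single nanosecond count. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: busy-poll until delay_ns has elapsed instead of
 * arming a kernel timer.  work() (may be NULL) is called once per loop
 * iteration.  Returns the number of iterations completed. */
static uint64_t poll_until(uint64_t delay_ns, void (*work)(void))
{
    uint64_t deadline = now_ns() + delay_ns;
    uint64_t iterations = 0;

    while (now_ns() < deadline) {
        if (work)
            work();
        iterations++;
    }
    return iterations;
}
```

With an assumed 3-second maximum granularity, rearm_count(600, 3)
gives 200 for a ten-minute alarm, which is the "couple of hundred"
wakeups described above.  poll_until(600ull * 1000000000ull, work)
would be the polling counterpart: the task keeps the cpu for the whole
interval and checks the clock itself rather than asking the kernel for
a timer interrupt.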