Subject: Re: [PATCHv4 1/2] x86/vdso: add mremap hook to vm_special_mapping
To: Andy Lutomirski
References: <1460388169-13340-1-git-send-email-dsafonov@virtuozzo.com> <1460729545-5666-1-git-send-email-dsafonov@virtuozzo.com>
CC: "linux-kernel@vger.kernel.org", Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", X86 ML, Andrew Morton, "linux-mm@kvack.org", Dmitry Safonov <0x7f454c46@gmail.com>
From: Dmitry Safonov
Message-ID: <5714C299.5060307@virtuozzo.com>
Date: Mon, 18 Apr 2016 14:18:49 +0300

On 04/15/2016 07:58 PM, Andy Lutomirski wrote:
> A couple minor things:
>
> - You're looking at both new_vma->vm_mm and current->mm. Is there a
> reason for that? If they're different, I'd be quite surprised, but
> maybe it would make sense to check.

Ok, will add a check.

> - On second thought, the is_ia32_task() check is a little weird given
> that you're planning on making the vdso image type.
> It might make sense to change that to
> in_ia32_syscall() && image == &vdso_image_32.

Yes, we might get there while remapping vdso_image_64 through int80,
in which case we shouldn't change the landing. Thanks, will add a check.

> Other than that, looks good to me.
>
> You could add a really simple test case to selftests/x86:
>
> mremap(the vdso, somewhere else);
> asm volatile ("int $0x80" : : "a" (__NR_exit), "b" (0));
>
> That'll segfault if this fails and it'll work and return 0 if it works.

Will add. For now I have tested this with much the same program.

> FWIW, there's one respect in which this code could be problematic down
> the road: if syscalls ever start needing the vvar page, then this gets
> awkward because you can't remap both at once. Also, this is
> fundamentally racy if multiple threads try to use it (but there's
> nothing whatsoever the kernel could do about that). In general, once
> the call to change and relocate the vdso gets written, CRIU should
> probably prefer to use it over mremap.

Yes, but from my point of view, for other reasons:
- at the restore stage of CRIU, the restorer maps the VMAs that were
  dumped at the dump stage.
- this is done in one thread, as other threads may need those VMAs
  to function.
- one of the VMAs being restored is the vDSO (which was also dumped),
  so there is an image for this blob.

So, ideally, I would not even need such an API to remap blobs from
64 to 32 (or backwards). Ideally, this is for other applications
that switch their mode. For CRIU, *ideally*, I do not need it, as
I have this VMA's image dumped beforehand: I only need the remap to
fix the context.vdso pointer for proper landing, and I'm doing it in
one thread.

But in practice, one may migrate an application from one kernel
to another. And for different kernel versions, there may be different
vDSO entries. For now (before compatible C/R) we check whether the
vDSOs differ and, if so, examine the symbols of the different vDSO and
add jump trampolines, at the places where the entries were in the
previous vDSO, to the new one.
So, this is also true for the 32-bit vDSO blob. That's why I need this
API for CRIU.

--
Regards,
Dmitry Safonov
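
As a rough illustration of the trampoline approach described in the last paragraph of the reply (for vDSOs that differ between kernels), the symbol-matching step could look like the sketch below. The `struct vdso_sym` and `struct trampoline` types and the `plan_trampolines()` helper are hypothetical names made up for this example and are not taken from CRIU; parsing the vDSO ELF image and writing the actual jump instructions are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration only; not CRIU's structures. */
struct vdso_sym {
	const char *name;   /* e.g. "__vdso_gettimeofday" */
	uintptr_t   offset; /* offset of the entry from the vDSO base */
};

struct trampoline {
	uintptr_t from; /* entry address inside the dumped (old) vDSO */
	uintptr_t to;   /* matching entry in the running kernel's vDSO */
};

/*
 * For every symbol of the old vDSO that also exists in the new one,
 * record a jump from the old entry address to the new entry address.
 * Symbols missing from the new vDSO are skipped. Returns the number
 * of records written, which never exceeds max_out.
 */
size_t plan_trampolines(uintptr_t old_base,
			const struct vdso_sym *old_syms, size_t n_old,
			uintptr_t new_base,
			const struct vdso_sym *new_syms, size_t n_new,
			struct trampoline *out, size_t max_out)
{
	size_t count = 0;

	assert(out != NULL || max_out == 0);

	for (size_t i = 0; i < n_old && count < max_out; i++) {
		for (size_t j = 0; j < n_new; j++) {
			if (strcmp(old_syms[i].name, new_syms[j].name) != 0)
				continue;
			out[count].from = old_base + old_syms[i].offset;
			out[count].to = new_base + new_syms[j].offset;
			count++;
			break;
		}
	}
	return count;
}
```

A caller would fill the two symbol arrays from the dumped vDSO image and from the vDSO of the kernel it is restoring on, then patch a jump at each `from` address. How the jump is encoded depends on the architecture and on whether the blob is 32-bit or 64-bit, which is why it is not shown here.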