From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Safonov <dsafonov@virtuozzo.com>
To: linux-kernel@vger.kernel.org
Cc: 0x7f454c46@gmail.com, luto@kernel.org, oleg@redhat.com,
    tglx@linutronix.de, hpa@zytor.com, mingo@redhat.com,
    linux-mm@kvack.org, x86@kernel.org, gorcunov@openvz.org,
    xemul@virtuozzo.com, Dmitry Safonov <dsafonov@virtuozzo.com>
Subject: [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64
Date: Wed, 31 Aug 2016 16:59:30 +0300
Message-ID: <20160831135936.2281-1-dsafonov@virtuozzo.com>
X-Mailer: git-send-email 2.9.0
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Changes from v3:
- added proper ifdefs around vdso_image_32
- added the missed Reviewed-by tag

Changes from v2:
- reworked the map_vdso() part following Andy's suggestions
- int arch_prctl(ARCH_MAP_VDSO_*, addr) now returns the size of the
  mapped vDSO blob on success, which is handy for parsing the blob in
  userspace afterwards
- disallowed mapping two vDSO blobs: as Andy noted,
  __insert_special_mapping may not get all the accounting right, which
  could let userspace abuse this API.
  Return -EEXIST if the process already has a vDSO blob mapped; this
  ensures that the caller knows what it is doing.

Changes from v1:
- killed the PR_REG_SIZE macro, as Oleg suggested
- cleared SA_IA32_ABI|SA_X32_ABI from oact->sa.sa_flags in
  do_sigaction(), as noticed by Oleg
- moved SA_IA32_ABI|SA_X32_ABI out of the uapi header, as those flags
  shouldn't be exposed to user-space

I have also reworked CRIU's patches to work with this patch set, rather
than with the first RFC, which swapped TIF_IA32 via arch_prctl. So far
it still fails ~10% of the 32-bit tests in ZDTM, CRIU's test suite.
The CRIU branch for this can be viewed at [6], and the v3 patches that
add this functionality have been sent to the mailing list [7].

This patch set is based on [3]; as long as [3] is not applied, it may
make the kbuild test robot unhappy.

Description from v1 [5]:

This patch set is an attempt to add checkpoint/restore for 32-bit tasks
in compatibility mode on x86_64 hosts.

Restore in CRIU starts from one root restoring process, which reads the
info for all threads being restored from the image files. This
information is then used to find out which processes share resources.
Later, shared resources are restored by only one process and all the
others inherit them. After that it calls clone(), and the new threads
restore their properties in parallel. Those threads inherit all of the
parent's mappings and fetch their properties from those mappings (and
call clone() themselves if they have children/subthreads). [1]

Then the restorer blob comes into play: it is a PIE binary which unmaps
all VMAs that are not needed for restoring, maps the new VMAs and
finalizes the restore with the sigreturn syscall. [2]

To restore a 32-bit task, we need to do four things in the running
x86_64 restorer blob:

a) set the code selector to __USER32_CS (to run 32-bit code);
b) remap the vDSO blob from 64-bit to 32-bit.
   This is primarily needed because the restore may happen on a
   different kernel, which has a different vDSO image than the one we
   had at dump time;
c) if the 32-bit vDSO differs from the dumped image, move it to a free
   place and add jump trampolines to that place;
d) switch the TIF_IA32 flag, so that the kernel knows it is dealing
   with a compatible 32-bit application.

From all this:

a) setting CS may be done from userspace, no patches needed;
b) patches 1-3 add the ability to map different vDSO blobs on an x86
   kernel;
c) patches for remapping/moving the 32-bit vDSO blob have been sent
   earlier and seem to be accepted [3];
d) as for swapping the TIF_IA32 flag, the discussion with Andy ended
   in the conclusion that it's better to remove this flag completely.
   Patches 4-6 remove the usage of TIF_IA32 from the ptrace, signal
   and coredump code.

This is a rework/resend of RFC [4].

[1] https://criu.org/Checkpoint/Restore#Restore
[2] https://criu.org/Restorer_context
[3] https://lkml.org/lkml/2016/6/28/489
[4] https://lkml.org/lkml/2016/4/25/650
[5] https://lkml.org/lkml/2016/6/1/425
[6] https://github.com/0x7f454c46/criu/tree/compat-4
[7] https://lists.openvz.org/pipermail/criu/2016-June/029788.html

Dmitry Safonov (6):
  x86/vdso: unmap vdso blob on vvar mapping failure
  x86/vdso: replace calculate_addr in map_vdso() with addr
  x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  x86/coredump: use pr_reg size, rather that TIF_IA32 flag
  x86/ptrace: down with test_thread_flag(TIF_IA32)
  x86/signal: add SA_{X32,IA32}_ABI sa_flags

 arch/x86/entry/vdso/vma.c         | 81 +++++++++++++++++++++++++++------------
 arch/x86/ia32/ia32_signal.c       |  2 +-
 arch/x86/include/asm/compat.h     |  8 ++--
 arch/x86/include/asm/fpu/signal.h |  6 +++
 arch/x86/include/asm/signal.h     |  4 ++
 arch/x86/include/asm/vdso.h       |  2 +
 arch/x86/include/uapi/asm/prctl.h |  6 +++
 arch/x86/kernel/process_64.c      | 25 ++++++++++++
 arch/x86/kernel/ptrace.c          |  2 +-
 arch/x86/kernel/signal.c          | 20 +++++-----
 arch/x86/kernel/signal_compat.c   | 34 ++++++++++++++--
 fs/binfmt_elf.c                   | 23 ++++-------
 kernel/signal.c                   |  7 ++++
 13 files changed, 162 insertions(+), 58 deletions(-)

--
2.9.0
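
For illustration only, here is a rough userspace sketch of how the
arch_prctl(ARCH_MAP_VDSO_*, addr) interface described in the changelog
might be called, and how its result could be interpreted (blob size on
success, -EEXIST when a vDSO blob is already mapped). The helper names
classify_map_vdso() and try_map_vdso32() and the enum are hypothetical
and not part of the series. The ARCH_MAP_VDSO_* values are the ones in
mainline's uapi prctl.h and are only assumed to match this version of
the patches. Treating EINVAL/ENOSYS as "not supported" is likewise an
assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values (as in mainline's arch/x86/include/uapi/asm/prctl.h). */
#ifndef ARCH_MAP_VDSO_32
#define ARCH_MAP_VDSO_X32 0x2001
#define ARCH_MAP_VDSO_32  0x2002
#define ARCH_MAP_VDSO_64  0x2003
#endif

/* Hypothetical classification of the call's outcome. */
enum vdso_map_status {
	VDSO_MAPPED,         /* success: return value is the blob size */
	VDSO_ALREADY_MAPPED, /* -EEXIST: a vDSO blob is mapped already */
	VDSO_UNSUPPORTED,    /* assumption: kernel lacks the interface */
	VDSO_FAILED,         /* any other error */
};

/* Interpret the syscall() return value together with errno. */
enum vdso_map_status classify_map_vdso(long ret, int err)
{
	if (ret >= 0)
		return VDSO_MAPPED;
	if (err == EEXIST)
		return VDSO_ALREADY_MAPPED;
	if (err == EINVAL || err == ENOSYS)
		return VDSO_UNSUPPORTED;
	return VDSO_FAILED;
}

/*
 * Ask the kernel to map the 32-bit vDSO blob at addr.
 * Returns the raw syscall() result; on success that is the blob size.
 */
long try_map_vdso32(unsigned long addr, enum vdso_map_status *status)
{
	long ret;

	assert(status != NULL);
	errno = 0;
#ifdef SYS_arch_prctl
	ret = syscall(SYS_arch_prctl, ARCH_MAP_VDSO_32, addr);
#else
	/* Not an x86 build: there is no arch_prctl syscall number. */
	ret = -1;
	errno = ENOSYS;
#endif
	*status = classify_map_vdso(ret, errno);
	return ret;
}
```

A restorer would presumably unmap the 64-bit vDSO first, since a second
mapping is refused with -EEXIST, and would then use the returned size
to bound its parsing of the freshly mapped blob.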