From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
To: "Eric W. Biederman"
CC: Oleg Nesterov
From: Kirill Tkhai
Date: Wed, 3 May 2017 13:20:58 +0300
References: <149245014695.17600.12640895883798122726.stgit@localhost.localdomain>
 <149245057248.17600.1341652606136269734.stgit@localhost.localdomain>
 <20170426155352.GA12131@redhat.com>
 <785e1986-da03-72aa-06c0-234ed2dbc0fd@virtuozzo.com>
 <20170427161255.GA19350@redhat.com>
 <20170427162254.GB19579@redhat.com>
 <43249645-f621-511e-dfa8-7bd78c547d2c@virtuozzo.com>
 <20170502163324.GA25036@redhat.com>
 <8737cngdxi.fsf@xmission.com>
In-Reply-To: <8737cngdxi.fsf@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03.05.2017 00:13, Eric W. Biederman wrote:
> Kirill Tkhai writes:
>
>> On 02.05.2017 19:33, Oleg Nesterov wrote:
>>> sorry for delay, vacation...
>>>
>>> On 04/28, Kirill Tkhai wrote:
>>>>
>>>> On 27.04.2017 19:22, Oleg Nesterov wrote:
>>>>>
>>>>> Ah, OK, I didn't notice the ns->child_reaper check in pidns_for_children_get().
>>>>>
>>>>> But note that it doesn't need tasklist_lock too.
>>>>
>>>> Hm, are there possible strange situations with memory ordering, when we see
>>>> ns->child_reaper of already died ns, which was placed in the same memory?
>>>> Do we have to use some memory barriers here?
>>>
>>> Could you spell please? I don't understand your concerns...
>>>
>>> I don't see how, say,
>>>
>>> 	static struct ns_common *pidns_for_children_get(struct task_struct *task)
>>> 	{
>>> 		struct ns_common *ns = NULL;
>>> 		struct pid_namespace *pid_ns;
>>>
>>> 		task_lock(task);
>>> 		if (task->nsproxy) {
>>> 			pid_ns = task->nsproxy->pid_ns_for_children;
>>> 			if (pid_ns->child_reaper) {
>                           ^^^^^^^^^^^^^^^^^^^^
>                           Oleg my apologies I missed this line earlier.
>                           This does look like a valid way to skip read_lock(&tasklist_lock);
>>> 				ns = &pid_ns->ns;
>>> 				get_pid_ns(ns);
>                               ^^^^^^^^^^^^^ This needs to be:
>                               get_pid_ns(pid_ns);
>
>>> 			}
>>> 		}
>>> 		task_unlock(task);
>>>
>>> 		return ns;
>>> 	}
>>>
>>> can be wrong. It also looks more clean to me.
>>>
>>> ->child_reaper is not stable without tasklist, it can be dead/etc, but
>>> we do not care?
>>
>> I mean the following. We had a pid_ns1 with a child_reaper set. Then
>> it became dead, and a new pid_ns2 were allocated in the same memory.
>
> task->nsproxy->pid_ns_for_children is always changed with
> task_lock(task) held. See switch_task_namespaces (used by unshare and
> setns). This also gives us the guarantee that the pid_ns reference
> won't be freed/reused in any for until task_lock(task) is dropped.

Now I've checked kmem_cache_zalloc(), and it looks like it zeroes the cache
memory content synchronously on allocation (it seems there is no pre-zeroed
memory for the __GFP_ZERO case).
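So, the zeroing happens before switch_task_namespaces() (and its task_unlock()),
which means a reader that takes task_lock() in pidns_for_children_get() cannot
see a stale ->child_reaper left over from a dead namespace in the same memory.
We're really safe there.

Ok, I'll send a new version of the patchset.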
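
To make the argument above concrete, here is a small userspace model of the locking pattern. It is only a sketch, not kernel code: the struct layouts, the pthread mutex standing in for task_lock()/task_unlock(), the plain int reference count, and all model_* names are assumptions made for illustration. It shows the two points from the thread: the namespace is zeroed at allocation, before it is published under the lock, and the reference is taken on pid_ns rather than on ns.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (hypothetical layouts). */
struct ns_common { int inum; };

struct pid_namespace {
	int refcount;            /* the kernel uses an atomic count; int keeps the model short */
	void *child_reaper;      /* NULL until the namespace has an init task */
	struct ns_common ns;
};

struct nsproxy { struct pid_namespace *pid_ns_for_children; };

struct task {
	pthread_mutex_t alloc_lock;  /* stand-in for task_lock()/task_unlock() */
	struct nsproxy *nsproxy;
};

/* Models kmem_cache_zalloc(): the memory is zeroed at allocation time,
 * so child_reaper is NULL before the object is visible to any task. */
static struct pid_namespace *model_create_pid_ns(void)
{
	struct pid_namespace *p = calloc(1, sizeof(*p));

	if (p)
		p->refcount = 1;
	return p;
}

/* Models switch_task_namespaces(): the pointer only changes under the lock. */
static void model_switch_pid_ns(struct task *t, struct pid_namespace *new_ns)
{
	assert(t->nsproxy != NULL);
	pthread_mutex_lock(&t->alloc_lock);
	t->nsproxy->pid_ns_for_children = new_ns;
	pthread_mutex_unlock(&t->alloc_lock);
}

/* Models the proposed pidns_for_children_get() with Eric's correction. */
static struct ns_common *model_pidns_for_children_get(struct task *t)
{
	struct ns_common *ns = NULL;
	struct pid_namespace *pid_ns;

	pthread_mutex_lock(&t->alloc_lock);
	if (t->nsproxy) {
		pid_ns = t->nsproxy->pid_ns_for_children;
		if (pid_ns->child_reaper) {
			ns = &pid_ns->ns;
			pid_ns->refcount++;  /* get_pid_ns(pid_ns), not get_pid_ns(ns) */
		}
	}
	pthread_mutex_unlock(&t->alloc_lock);

	return ns;
}
```

In this model a freshly created namespace yields NULL from model_pidns_for_children_get() until child_reaper is set, and only then is a reference taken.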