From: Alexander Popov <alex.popov@linux.com>
Reply-To: alex.popov@linux.com
Subject: Re: [PATCH 2/2] arm64: Clear the stack
To: Mark Rutland
Cc: Andy Lutomirski, Laura Abbott, Kees Cook, Ard Biesheuvel,
    kernel-hardening@lists.openwall.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Date: Mon, 14 May 2018 12:35:25 +0300
Message-ID: <6c311899-7131-d21b-10f6-d2ba7380a392@linux.com>
In-Reply-To: <20180514051555.bpbydgr56hyffjch@salmiak>
References: <20180502203326.9491-1-labbott@redhat.com>
    <20180502203326.9491-3-labbott@redhat.com>
    <20180503071917.xm2xvgagvzkworay@salmiak>
    <20180504110907.c2dw33kjmyybso6t@lakrids.cambridge.arm.com>
    <4badae50-be9b-2c6d-854b-57ab48664800@linux.com>
    <71199506-b46b-5f91-e489-e6450b6d1067@linux.com>
    <20180511161317.57k6prl54xjmsit3@lakrids.cambridge.arm.com>
    <20180514051555.bpbydgr56hyffjch@salmiak>
X-Mailing-List: linux-kernel@vger.kernel.org

On 14.05.2018 08:15, Mark Rutland wrote:
> On Sun, May 13, 2018 at 11:40:07AM +0300, Alexander Popov wrote:
>> It seems that previously I was very "lucky" to accidentally have those MIN_STACK_LEFT,
>> call trace depth and oops=panic together to experience a hang on stack overflow
>> during BUG().
>>
>>
>> When I run my test in a loop _without_ VMAP_STACK, I manage to corrupt the neighbour
>> processes with BUG() handling overstepping the stack boundary. It's a pity, but
>> I have an idea.
>
> I think that in the absence of VMAP_STACK, there will always be cases where we
> *could* corrupt a neighbouring stack, but I agree that trying to minimize that
> possibility would be good.

Ok!

>> In kernel/sched/core.c we already have:
>>
>> #ifdef CONFIG_SCHED_STACK_END_CHECK
>>         if (task_stack_end_corrupted(prev))
>>                 panic("corrupted stack end detected inside scheduler\n");
>> #endif
>>
>> So what would you think if I do the following in check_alloca():
>>
>>         if (size >= stack_left) {
>> #if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
>>                 panic("alloca over the kernel stack boundary\n");
>> #else
>>                 BUG();
>> #endif
>
> Given this is already out-of-line, how about we always use panic(), regardless
> of VMAP_STACK and SCHED_STACK_END_CHECK? i.e. just
>
>         if (unlikely(size >= stack_left))
>                 panic("alloca over the kernel stack boundary");
>
> If we have VMAP_STACK selected, and overflow during the panic, it's the same as
> if we overflowed during the BUG(). It's likely that panic() will use less stack
> space than BUG(), and the compiler can put the call in a slow path that
> shouldn't affect most calls, so in all cases it's likely preferable.

I'm sure that maintainers and Linus will strongly dislike my patch if I always
use panic() here. panic() kills the whole kernel, and we shouldn't use it when we
can safely continue to work.

Let me describe my logic. Suppose size >= stack_left on a thread stack.

1. If CONFIG_VMAP_STACK is enabled, we can safely use BUG(). Even if BUG()
   handling overflows the thread stack into the guard page,
   handle_stack_overflow() is called and the neighbour memory is not corrupted.
   The kernel can keep running.

2. If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt the neighbour
   kernel memory and cause undefined behaviour of the whole kernel. I see it in
   my lkdtm test. That is a cogent reason for panic().

   2.a. If CONFIG_SCHED_STACK_END_CHECK is enabled, the kernel already calls
        panic() when STACK_END_MAGIC is corrupted. So we will _not_ break the
        safety policy if we call panic() in a similar situation in
        check_alloca().

   2.b. If CONFIG_SCHED_STACK_END_CHECK is disabled, the user has some real
        reasons not to panic() when the kernel stack is corrupted. So we should
        not do it in check_alloca() either; just use BUG() and hope for the
        best.

That logic can be expressed this way:

        if (size >= stack_left) {
#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
                panic("alloca over the kernel stack boundary\n");
#else
                BUG();
#endif
        }

I think I should add a proper comment to describe it.

Thank you.

Best regards,
Alexander
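
For reference, the three cases above (1, 2.a, 2.b) can be restated as a small
userspace C sketch, which makes it easy to check all four config combinations.
This is only a model of the decision, not kernel code: the enum and the
function name alloca_overflow_action() are made up for illustration, and the
two bool parameters stand in for CONFIG_VMAP_STACK and
CONFIG_SCHED_STACK_END_CHECK.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; none of this is kernel API. */
enum overflow_action {
	ACTION_BUG,	/* oops the current task, kernel keeps running */
	ACTION_PANIC	/* stop the whole kernel */
};

/*
 * Model of the proposed choice in check_alloca() once
 * size >= stack_left has been detected.
 */
static enum overflow_action
alloca_overflow_action(bool vmap_stack, bool sched_stack_end_check)
{
	/* Case 1: the guard page protects neighbours, so BUG() is safe. */
	if (vmap_stack)
		return ACTION_BUG;

	/* Case 2.a: the scheduler already panics on a corrupted stack end. */
	if (sched_stack_end_check)
		return ACTION_PANIC;

	/* Case 2.b: the user opted out of panicking, so fall back to BUG(). */
	return ACTION_BUG;
}
```

Only one of the four combinations (VMAP_STACK off, SCHED_STACK_END_CHECK on)
ends in panic(), which is the same condition as the #if in the snippet above.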