From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932163AbeENNxT (ORCPT ); Mon, 14 May 2018 09:53:19 -0400
Received: from mail-lf0-f66.google.com ([209.85.215.66]:35498 "EHLO
	mail-lf0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753485AbeENNxQ (ORCPT ); Mon, 14 May 2018 09:53:16 -0400
X-Google-Smtp-Source: AB8JxZqg/++04yq9c/rJS2GvRiA7HELGHw0Vq8mq+es9P9/f9YDClK4caa/xYFKnXeqDb+O9vrQFAQ==
Reply-To: alex.popov@linux.com
Subject: Re: [PATCH 2/2] arm64: Clear the stack
To: Mark Rutland
Cc: Andy Lutomirski, Laura Abbott, Kees Cook, Ard Biesheuvel,
	kernel-hardening@lists.openwall.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
References: <20180502203326.9491-3-labbott@redhat.com>
	<20180503071917.xm2xvgagvzkworay@salmiak>
	<20180504110907.c2dw33kjmyybso6t@lakrids.cambridge.arm.com>
	<4badae50-be9b-2c6d-854b-57ab48664800@linux.com>
	<71199506-b46b-5f91-e489-e6450b6d1067@linux.com>
	<20180511161317.57k6prl54xjmsit3@lakrids.cambridge.arm.com>
	<20180514051555.bpbydgr56hyffjch@salmiak>
	<6c311899-7131-d21b-10f6-d2ba7380a392@linux.com>
	<20180514100639.v3erlzbuv2e4awfh@lakrids.cambridge.arm.com>
From: Alexander Popov
Openpgp: preference=signencrypt
Date: Mon, 14 May 2018 16:53:12 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <20180514100639.v3erlzbuv2e4awfh@lakrids.cambridge.arm.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 14.05.2018 13:06, Mark Rutland wrote:
> On Mon, May 14, 2018 at 12:35:25PM +0300, Alexander Popov wrote:
>> On 14.05.2018 08:15, Mark Rutland wrote:
>>> On Sun, May 13, 2018 at 11:40:07AM +0300, Alexander Popov wrote:
>>>> So what would you think if I do the following in check_alloca():
>>>>
>>>> 	if (size >= stack_left) {
>>>> #if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
>>>> 		panic("alloca over the kernel stack boundary\n");
>>>> #else
>>>> 		BUG();
>>>> #endif
>>>
>>> Given this is already out-of-line, how about we always use panic(), regardless
>>> of VMAP_STACK and SCHED_STACK_END_CHECK? i.e. just
>>>
>>> 	if (unlikely(size >= stack_left))
>>> 		panic("alloca over the kernel stack boundary");
>>>
>>> If we have VMAP_STACK selected, and overflow during the panic, it's the same as
>>> if we overflowed during the BUG(). It's likely that panic() will use less stack
>>> space than BUG(), and the compiler can put the call in a slow path that
>>> shouldn't affect most calls, so in all cases it's likely preferable.
>>
>> I'm sure that maintainers and Linus will strongly dislike my patch if I always
>> use panic() here. panic() kills the whole kernel and we shouldn't use it when we
>> can safely continue to work.
>>
>> Let me describe my logic. So let's have size >= stack_left on a thread stack.
>>
>> 1. If CONFIG_VMAP_STACK is enabled, we can safely use BUG(). Even if BUG()
>> handling overflows the thread stack into the guard page, handle_stack_overflow()
>> is called and the neighbour memory is not corrupted. The kernel can proceed to live.
>
> On arm64 with CONFIG_VMAP_STACK, a stack overflow will result in a
> panic(). My understanding was that the same is true on x86.

No, x86 CONFIG_VMAP_STACK only kills the offending process. I see it on my deep
recursion test, the kernel continues to live. handle_stack_overflow() in
arch/x86/kernel/traps.c calls die().

>> 2. If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt the neighbour
>> kernel memory and cause the undefined behaviour of the whole kernel. I see it on
>> my lkdtm test. That is a cogent reason for panic().
>
> In this case, panic() can also corrupt the neighbour stack, and could
> also fail.
>
> When CONFIG_VMAP_STACK is not selected, a stack overflow simply cannot
> be handled reliably -- while panic() may be more likely to succeed, it
> is not guaranteed to.
>
>> 2.a. If CONFIG_SCHED_STACK_END_CHECK is enabled, the kernel already does panic()
>> when STACK_END_MAGIC is corrupted. So we will _not_ break the safety policy if
>> we do panic() in a similar situation in check_alloca().
>
> Sure, I'm certainly happy with panic() here.

Ok!

>> 2.b. If CONFIG_SCHED_STACK_END_CHECK is disabled, the user has some real reasons
>> not to do panic() when the kernel stack is corrupted.
>
> I believe that CONFIG_SCHED_STACK_END_CHECK is seen as a debug feature,
> and hence people don't select it.

I see CONFIG_SCHED_STACK_END_CHECK enabled by default in the Ubuntu config...

> I strongly doubt that people have
> reasons to disable it other than not wanting the overhead associated
> with debug features.

I think it's not a question of performance here. There are cases when a system
must live as long as possible (even partially corrupted) and must not die
entirely. An oops is ok for those systems, but a panic (full DoS) is not.

> I think it is reasonable to panic() here even with CONFIG_VMAP_STACK
> selected.

That's too harsh for CONFIG_VMAP_STACK on x86 - the system can keep running.
Anyway, the check_alloca() code will not be shared between x86 and arm64; I've
described the reasons in this thread. So I can have BUG() for CONFIG_VMAP_STACK
on x86 and Laura can consistently use panic() on arm64.

>> So we should not do it in check_alloca() as well, just use BUG() and
>> hope for the best.
>
> Regardless of whether we BUG() or panic(), we're hoping for the best.
>
> Consistently using panic() here will keep things simpler, so any failure
> reported will be easier to reason about, and easier to debug.

Let me keep BUG() for !CONFIG_SCHED_STACK_END_CHECK. I'm wary of using panic()
by default; let the distro/user decide this.
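
The x86 policy argued for above can be summarised as a small user-space model.
This is only a sketch under assumptions, not the patch itself: the enum, the
struct and check_alloca_policy() are hypothetical names made up for
illustration, and the two config options are modelled as runtime flags, whereas
the real check_alloca() would select BUG() or panic() at build time with #if.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: a user-space model of the policy, not kernel code. */
enum overflow_action {
	ACTION_NONE,	/* the alloca fits, nothing to do */
	ACTION_BUG,	/* stands in for BUG() */
	ACTION_PANIC,	/* stands in for panic() */
};

struct stack_config {
	bool vmap_stack;		/* models CONFIG_VMAP_STACK */
	bool sched_stack_end_check;	/* models CONFIG_SCHED_STACK_END_CHECK */
};

enum overflow_action check_alloca_policy(const struct stack_config *cfg,
					 size_t size, size_t stack_left)
{
	assert(cfg != NULL);

	if (size < stack_left)
		return ACTION_NONE;

	/*
	 * No guard page, and the kernel already panics when STACK_END_MAGIC
	 * is corrupted, so panic() does not change the safety policy.
	 */
	if (!cfg->vmap_stack && cfg->sched_stack_end_check)
		return ACTION_PANIC;

	/* Guard page present, or the user has not opted in to panic(). */
	return ACTION_BUG;
}
```

In this model only the combination !vmap_stack && sched_stack_end_check yields
ACTION_PANIC; every other overflowing case falls back to ACTION_BUG.
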
I remember very well how I was shouted at when this one was merged:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce6fa91b93630396ca220c33dd38ffc62686d499

Mark, I'm really grateful to you for such a nice code review!

Alexander