From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755346AbcFAQbe (ORCPT ); Wed, 1 Jun 2016 12:31:34 -0400
Received: from mail-lf0-f49.google.com ([209.85.215.49]:36368 "EHLO
	mail-lf0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750822AbcFAQbd convert rfc822-to-8bit (ORCPT );
	Wed, 1 Jun 2016 12:31:33 -0400
MIME-Version: 1.0
In-Reply-To: <574EFE0F.2000404@virtuozzo.com>
References: <1464691466-59010-1-git-send-email-glider@google.com>
	<574D7B11.8090709@virtuozzo.com> <574EFE0F.2000404@virtuozzo.com>
From: Alexander Potapenko
Date: Wed, 1 Jun 2016 18:31:30 +0200
Message-ID:
Subject: Re: [PATCH] mm, kasan: introduce a special shadow value for allocator metadata
To: Andrey Ryabinin
Cc: Andrey Konovalov , Christoph Lameter , Dmitriy Vyukov ,
	Andrew Morton , Steven Rostedt , Joonsoo Kim , Joonsoo Kim ,
	Kostya Serebryany , kasan-dev , Linux Memory Management List , LKML
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 1, 2016 at 5:23 PM, Andrey Ryabinin wrote:
> On 05/31/2016 08:49 PM, Alexander Potapenko wrote:
>> On Tue, May 31, 2016 at 1:52 PM, Andrey Ryabinin
>> wrote:
>>>
>>>
>>> On 05/31/2016 01:44 PM, Alexander Potapenko wrote:
>>>> Add a special shadow value to distinguish accesses to KASAN-specific
>>>> allocator metadata.
>>>>
>>>> Unlike AddressSanitizer in the userspace, KASAN lets the kernel proceed
>>>> after a memory error. However a write to the kmalloc metadata may cause
>>>> memory corruptions that will make the tool itself unreliable and induce
>>>> crashes later on. Warning about such corruptions will ease the
>>>> debugging.
>>>
>>> It will not. Whether out-of-bounds hits metadata or not is absolutely irrelevant
>>> to the bug itself. This information doesn't help to understand, analyze or fix the bug.
>>>
>> Here's the example that made me think the opposite.
>>
>> I've been reworking KASAN hooks for mempool and added a test that did
>> a write-after-free to an object allocated from a mempool.
>> This resulted in flaky kernel crashes somewhere in quarantine
>> shrinking after several attempts to `insmod test_kasan.ko`.
>> Because there already were numerous KASAN errors in the test, it
>> wasn't evident that the crashes were related to the new test, so I
>> thought the problem was in the buggy quarantine implementation.
>> However the problem was indeed in the new test, which corrupted the
>> quarantine pointer in the object and caused a crash while traversing
>> the quarantine list.
>>
>> My previous experience with userspace ASan shows that crashes in the
>> tool code itself puzzle the developers.
>> As a result, the users think that the tool is broken and don't believe
>> its reports.
>>
>> I first thought about hardening the quarantine list by checksumming
>> the pointers and validating them on each traversal.
>> This prevents the crashes, but doesn't give the users any idea about
>> what went wrong.
>> On the other hand, reporting the pointer corruption right when it happens does.
>> Distinguishing between a regular UAF and a quarantine corruption
>> (which is what the patch in question is about) helps to prioritize the
>> KASAN reports and give the developers better understanding of the
>> consequences.
>>
>
> After the first report we have memory in a corrupted state, so we are done here.

This is theoretically true, that's why we crash after the first report
in the userspace ASan.
But since the kernel proceeds after the first KASAN report, it's
possible that we see several different reports, and they are sometimes
worth looking at.

> Anything that happens after the first report can't be trusted since it can be an after-effect,
> just like in your case. Such crashes are not worthy to look at.
> Out-of-bounds that doesn't hit metadata as any other memory corruption also can lead to after-effects crashes,
> thus distinguishing such bugs doesn't make a lot of sense.

Unlike the crashes in the kernel itself, crashes with KASAN functions
in the stack trace may make the developer think the tool is broken.

>
> test_kasan module is just a quick hack, made only to make sure that KASAN works.
> It does some crappy thing, and may lead to crash as well. So I would recommend an immediate
> reboot even after single attempt to load it.

Agreed. However a plain write into the first byte of the freed object
will cause similar problems.

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg