From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753163AbcFBMDB (ORCPT ); Thu, 2 Jun 2016 08:03:01 -0400
Received: from mail-lf0-f42.google.com ([209.85.215.42]:36569 "EHLO
        mail-lf0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750811AbcFBMC7 convert rfc822-to-8bit (ORCPT );
        Thu, 2 Jun 2016 08:02:59 -0400
MIME-Version: 1.0
In-Reply-To:
References: <1464691466-59010-1-git-send-email-glider@google.com>
        <574D7B11.8090709@virtuozzo.com> <574EFE0F.2000404@virtuozzo.com>
From: Alexander Potapenko
Date: Thu, 2 Jun 2016 14:02:55 +0200
Message-ID:
Subject: Re: [PATCH] mm, kasan: introduce a special shadow value for allocator metadata
To: Andrey Ryabinin
Cc: Andrey Konovalov, Christoph Lameter, Dmitriy Vyukov, Andrew Morton,
        Steven Rostedt, Joonsoo Kim, Joonsoo Kim, Kostya Serebryany,
        kasan-dev, Linux Memory Management List, LKML
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 1, 2016 at 6:31 PM, Alexander Potapenko wrote:
> On Wed, Jun 1, 2016 at 5:23 PM, Andrey Ryabinin wrote:
>> On 05/31/2016 08:49 PM, Alexander Potapenko wrote:
>>> On Tue, May 31, 2016 at 1:52 PM, Andrey Ryabinin
>>> wrote:
>>>>
>>>>
>>>> On 05/31/2016 01:44 PM, Alexander Potapenko wrote:
>>>>> Add a special shadow value to distinguish accesses to KASAN-specific
>>>>> allocator metadata.
>>>>>
>>>>> Unlike AddressSanitizer in the userspace, KASAN lets the kernel proceed
>>>>> after a memory error. However a write to the kmalloc metadata may cause
>>>>> memory corruptions that will make the tool itself unreliable and induce
>>>>> crashes later on. Warning about such corruptions will ease the
>>>>> debugging.
>>>>
>>>> It will not. Whether out-of-bounds hits metadata or not is absolutely irrelevant
>>>> to the bug itself.
>>>> This information doesn't help to understand, analyze or fix the bug.
>>>>
>>> Here's the example that made me think the opposite.
>>>
>>> I've been reworking KASAN hooks for mempool and added a test that did
>>> a write-after-free to an object allocated from a mempool.
>>> This resulted in flaky kernel crashes somewhere in quarantine
>>> shrinking after several attempts to `insmod test_kasan.ko`.
>>> Because there already were numerous KASAN errors in the test, it
>>> wasn't evident that the crashes were related to the new test, so I
>>> thought the problem was in the buggy quarantine implementation.
>>> However the problem was indeed in the new test, which corrupted the
>>> quarantine pointer in the object and caused a crash while traversing
>>> the quarantine list.
>>>
>>> My previous experience with userspace ASan shows that crashes in the
>>> tool code itself puzzle the developers.
>>> As a result, the users think that the tool is broken and don't believe
>>> its reports.
>>>
>>> I first thought about hardening the quarantine list by checksumming
>>> the pointers and validating them on each traversal.
>>> This prevents the crashes, but doesn't give the users any idea about
>>> what went wrong.
>>> On the other hand, reporting the pointer corruption right when it happens does.
>>> Distinguishing between a regular UAF and a quarantine corruption
>>> (which is what the patch in question is about) helps to prioritize the
>>> KASAN reports and give the developers better understanding of the
>>> consequences.
>>>
>>
>> After the first report we have memory in a corrupted state, so we are done here.
> This is theoretically true, that's why we crash after the first report
> in the userspace ASan.
> But since the kernel proceeds after the first KASAN report, it's
> possible that we see several different reports, and they are sometimes
> worth looking at.
>
>> Anything that happens after the first report can't be trusted since it can be an after-effect,
>> just like in your case. Such crashes are not worthy to look at.
>> Out-of-bounds that doesn't hit metadata as any other memory corruption also can lead to after-effects crashes,
>> thus distinguishing such bugs doesn't make a lot of sense.
> Unlike the crashes in the kernel itself, crashes with KASAN functions
> in the stack trace may make the developer think the tool is broken.
>>
>> test_kasan module is just a quick hack, made only to make sure that KASAN works.
>> It does some crappy thing, and may lead to crash as well. So I would recommend an immediate
>> reboot even after single attempt to load it.
> Agreed. However a plain write into the first byte of the freed object
> will cause similar problems.

On second thought, we could do without the additional shadow byte
value by just comparing the address to the metadata offset.

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
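
For illustration only, the address comparison mentioned above (used in
place of a dedicated shadow value) might look roughly like the userspace
sketch below. Nothing here is taken from the patch or from the kernel's
KASAN code: slot_layout, classify_access() and the enum names are all
hypothetical. The sketch assumes every slab slot has a known metadata
offset and size, and that for a freed object the metadata may overlap the
start of the payload, which is why the metadata range is checked first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one slab slot (not a kernel structure). */
struct slot_layout {
	size_t object_size;  /* bytes usable by the caller */
	size_t meta_offset;  /* offset of tool metadata from the slot start */
	size_t meta_size;    /* size of the tool metadata */
};

enum access_kind {
	ACCESS_OBJECT,    /* address falls into the payload */
	ACCESS_METADATA,  /* address falls into the tool metadata */
	ACCESS_OTHER,     /* padding, redzone, or outside the slot */
};

/*
 * Classify a faulting address by comparing its offset within the slot
 * to the metadata offset. Metadata is checked before the payload so that
 * metadata stored inside a freed object is still reported as metadata.
 */
static enum access_kind classify_access(uintptr_t slot_start,
					const struct slot_layout *layout,
					uintptr_t addr)
{
	size_t off;

	if (addr < slot_start)
		return ACCESS_OTHER;
	off = (size_t)(addr - slot_start);

	if (off >= layout->meta_offset &&
	    off - layout->meta_offset < layout->meta_size)
		return ACCESS_METADATA;
	if (off < layout->object_size)
		return ACCESS_OBJECT;
	return ACCESS_OTHER;
}

/* Small self-check covering a live and a freed slot layout. */
void classify_access_demo(void)
{
	/* Live object: metadata placed right after a 64-byte payload. */
	struct slot_layout live = { 64, 64, 16 };
	/* Freed object: metadata reuses the first 16 payload bytes. */
	struct slot_layout freed = { 64, 0, 16 };

	assert(classify_access(0x1000, &live, 0x1000) == ACCESS_OBJECT);
	assert(classify_access(0x1000, &live, 0x1040) == ACCESS_METADATA);
	assert(classify_access(0x1000, &live, 0x1050) == ACCESS_OTHER);
	assert(classify_access(0x1000, &freed, 0x1000) == ACCESS_METADATA);
	assert(classify_access(0x1000, &freed, 0x1010) == ACCESS_OBJECT);
}
```

With a check of this kind a report could say "write to allocator
metadata" without reserving an extra shadow byte value; whether the real
report path has the slot start and layout at hand is an assumption of the
sketch.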