From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <830e8c64-0118-9a2d-5dcf-5cad55425dc2@arm.com>
Date: Wed, 5 Oct 2022 13:38:55 +0100
Subject: Re: [syzbot] KASAN: invalid-access Read in copy_page
From: James Morse
To: Andrey Konovalov, Catalin Marinas
Cc: Linux ARM, LKML, syzkaller-bugs, tongtiangen@huawei.com,
 Vincenzo Frascino, Kefeng Wang, Will Deacon, syzbot,
 Evgenii Stepanov, Peter Collingbourne, Dmitry Vyukov
References: <0000000000004387dc05e5888ae5@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi guys,

On 27/09/2022 17:55, Andrey Konovalov wrote:
> On Tue, Sep 6, 2022 at 6:23 PM Catalin Marinas wrote:
>>
>> On Tue, Sep 06, 2022 at 04:39:57PM +0200, Andrey Konovalov wrote:
>>> On Tue, Sep 6, 2022 at 4:29 PM Catalin Marinas wrote:
>>>>>> Does it take long to reproduce this kasan warning?
>>>>>
>>>>> syzbot finds several such cases every day (200 crashes for the past 35 days):
>>>>> https://syzkaller.appspot.com/bug?extid=c2c79c6d6eddc5262b77
>>>>> So once it reaches the tested tree, we should have an answer within a day.
>>>
>>> To be specific, this syzkaller instance fuzzes the mainline, so the
>>> patch with the WARN_ON needs to end up there.
>>>
>>> If this is unacceptable, perhaps, we could switch the MTE syzkaller
>>> instance to the arm64 testing tree.
>>
>> It needs some more digging first. My first guess was that a PROT_MTE
>> page was mapped into the user address space and the task repainted it
>> but I don't think that's the case.
> syzkaller still keeps hitting this issue and I was wondering if you
> have any ideas of what could be wrong here?
>
>> Since I can't find the kernel boot log for these runs, is there any kind
>> of swap enabled? I'm trying to narrow down where the problem may be.
>
> I don't think there is.

I've reproduced this with the latest QEMU and a v6.0 kernel using Ubuntu 15.04
user-space. The reproducer is just to log in once it's booted. The VM has swap,
and I've turned the memory down low enough to force it to swap.

The round trip time is about 15 minutes. I've not managed to reproduce it
without swap, or with more memory (but it may be a timing thing).

Below is one example of tag corruption that affected page-cache memory that
wouldn't be swapped:
-------------------%<-------------------
[49488.484420] BUG: KASAN: invalid-access in __arch_copy_to_user+0x180/0x240
[49488.487122] Read at addr f1ff00000ad48000 by task apt-config/5041
[49488.488614] Pointer tag: [f1], memory tag: [fe]

[49488.490921] CPU: 1 PID: 5041 Comm: apt-config Not tainted 6.0.0 #14546
[49488.492364] Hardware name: linux,dummy-virt (DT)
[49488.493790] Call trace:
[49488.494640]  dump_backtrace.part.0+0xd0/0xe0
[49488.495811]  show_stack+0x18/0x50
[49488.496785]  dump_stack_lvl+0x68/0x84
[49488.497781]  print_report+0x104/0x604
[49488.498790]  kasan_report+0x8c/0xb0
[49488.499758]  __do_kernel_fault+0x11c/0x1bc
[49488.500801]  do_tag_check_fault+0x78/0x90
[49488.501830]  do_mem_abort+0x44/0x9c
[49488.502813]  el1_abort+0x40/0x60
[49488.503839]  el1h_64_sync_handler+0xb0/0xd0
[49488.504880]  el1h_64_sync+0x64/0x68
[49488.505847]  __arch_copy_to_user+0x180/0x240
[49488.506917]  _copy_to_iter+0x68/0x5c0
[49488.507918]  copy_page_to_iter+0xac/0x33c
[49488.508943]  filemap_read+0x1b4/0x3b0
[49488.509936]  generic_file_read_iter+0x108/0x1a0
[49488.511033]  ext4_file_read_iter+0x58/0x1f0
[49488.512078]  vfs_read+0x1f8/0x2a0
[49488.513031]  ksys_read+0x68/0xf4
[49488.513978]  __arm64_sys_read+0x1c/0x2c
[49488.514998]  invoke_syscall+0x48/0x114
[49488.516046]  el0_svc_common.constprop.0+0x44/0xec
[49488.517153]  do_el0_svc+0x2c/0xc0
[49488.518120]  el0_svc+0x2c/0xb4
[49488.519041]  el0t_64_sync_handler+0xb8/0xc0
[49488.520080]  el0t_64_sync+0x198/0x19c

[49488.522268] The buggy address belongs to the physical page:
[49488.523778] page:00000000db6e19d9 refcount:20 mapcount:18 mapping:0000000052573be9 index:0x0 pfn:0x4ad48
[49488.524938] memcg:faff000002c70000
[49488.525430] aops:ext4_da_aops ino:8061 dentry name:"libc-2.21.so"
[49488.526289] flags: 0x1ffc38002020876(referenced|uptodate|lru|active|workingset|arch_1|mappedtodisk|arch_2|node=0|zone=0|lastcpupid=0x7ff|kasantag=0xe) CMA
[49488.527947] raw: 01ffc38002020876 fffffc00002b5248 fffffc00002b51c8 f8ff00000335c760
[49488.528325] raw: 0000000000000000 0000000000000000 0000001400000011 faff000002c70000
[49488.528669] page dumped because: kasan: bad access detected

[49488.529615] Memory state around the buggy address:
[49488.531027]  ffff00000ad47e00: f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1
[49488.532442]  ffff00000ad47f00: f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1 f1
[49488.533922] >ffff00000ad48000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[49488.535259]                    ^
[49488.536292]  ffff00000ad48100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[49488.537628]  ffff00000ad48200: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[49488.539015] ==================================================================
[49488.603970] Disabling lock debugging due to kernel taint
-------------------%<-------------------


Thanks,

James
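For readers decoding the report above, here is a minimal sketch of how its numbers relate to each other. It is not kernel code. The helper names are made up for illustration, and two things are assumptions rather than facts stated in the mail: that the kasantag= page-flags field holds the page tag XORed with 0xff, and that each byte in a "Memory state" row stands for one 16-byte granule. Only the values copied from the log are taken from the report.

```python
TAG_SHIFT = 56          # arm64 keeps the pointer tag in address bits 63:56
GRANULE_SIZE = 16       # assumption: one "Memory state" byte per 16-byte granule


def pointer_tag(addr: int) -> int:
    """Return the top byte of a tagged kernel address."""
    return (addr >> TAG_SHIFT) & 0xFF


def untagged(addr: int) -> int:
    """Return the address with its tag byte forced to 0xff, as the rows print it."""
    return addr | (0xFF << TAG_SHIFT)


def page_tag_from_flags_field(field: int) -> int:
    """Assumption: the kasantag= field stores the page tag XORed with 0xff."""
    return (field ^ 0xFF) & 0xFF


def granule_index(addr: int, row_base: int) -> int:
    """Return which byte of a 'Memory state' row covers addr."""
    return (untagged(addr) - row_base) // GRANULE_SIZE


# Values copied from the log in the mail.
fault_addr = 0xF1FF00000AD48000
memory_tag = 0xFE
kasantag_field = 0xE
row_base = 0xFFFF00000AD48000

print(f"pointer tag:   {pointer_tag(fault_addr):02x}")                    # f1
print(f"untagged addr: {untagged(fault_addr):016x}")                      # ffff00000ad48000
print(f"page tag:      {page_tag_from_flags_field(kasantag_field):02x}")  # f1
print(f"row position:  {granule_index(fault_addr, row_base)}")            # 0
print("tag check:     " + ("match" if pointer_tag(fault_addr) == memory_tag else "mismatch"))
```

Under those assumptions, the tag recorded in the page flags (f1) agrees with the pointer tag, while the tags in memory for that page read fe. That would fit the description of this as tag corruption rather than a stale pointer, though the sketch alone does not prove it.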