Subject: Re: Kernel crash during ltp(min_free_kbytes) test run (zone_reclaimable_pages)
To: Sachin Sant, linuxppc-dev
CC: open list
From: Liu Shixin
Message-ID: <4533da10-8c69-3dd2-875b-6dff15a4c289@huawei.com>
Date: Fri, 8 Sep 2023 16:42:19 +0800

On 2023/9/7 22:22, Sachin Sant wrote:
> While running LTP tests (specifically min_free_kbytes) on a Power server
> booted with 6.5.0-next-20230906 following crash was encountered.
>
> [ 3952.404936] __vm_enough_memory: pid: 440285, comm: min_free_kbytes, not enough memory for the allocation
> [ 3956.895519] __vm_enough_memory: pid: 440286, comm: min_free_kbytes, not enough memory for the allocation
> [ 3961.296168] __vm_enough_memory: pid: 440287, comm: min_free_kbytes, not enough memory for the allocation
> [ 3982.202651] Kernel attempted to read user page (28) - exploit attempt? (uid: 0)
> [ 3982.202669] BUG: Kernel NULL pointer dereference on read at 0x00000028
> [ 3982.202674] Faulting instruction address: 0xc000000000469660
> [ 3982.202679] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 3982.202682] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
> [ 3982.202688] Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache netfs brd overlay exfat vfat fat btrfs blake2b_generic xor raid6_pq zstd_compress xfs loop sctp ip6_udp_tunnel udp_tunnel dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding rfkill tls ip_set nf_tables libcrc32c nfnetlink sunrpc pseries_rng vmx_crypto ext4 mbcache jbd2 sd_mod t10_pi crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse [last unloaded: init_module(O)]
> [ 3982.202756] CPU: 18 PID: 440288 Comm: min_free_kbytes Tainted: G O 6.5.0-next-20230906 #1
> [ 3982.202762] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 3982.202767] NIP: c000000000469660 LR: c0000000004694a8 CTR: 0000000000000000
> [ 3982.202771] REGS: c00000001d6af410 TRAP: 0300 Tainted: G O (6.5.0-next-20230906)
> [ 3982.202776] MSR: 8000000000009033 CR: 24402444 XER: 00000000
> [ 3982.202787] CFAR: c0000000004694fc DAR: 0000000000000028 DSISR: 40000000 IRQMASK: 0
> [ 3982.202787] GPR00: c0000000004696b8 c00000001d6af6b0 c000000001451100 0000000000000080
> [ 3982.202787] GPR04: 0000000000000080 0000000000000081 0000000000000020 0000000000000000
> [ 3982.202787] GPR08: 0000000000000080 00000000000048d9 0000000000000000 00000000000014de
> [ 3982.202787] GPR12: 0000000000008000 c0000013ffab5300 c000000002f27238 c000000002c9d4d8
> [ 3982.202787] GPR16: 0000000000000000 0000000000000000 c000000006924d40 c000000002d174f8
> [ 3982.202787] GPR20: c000000002d17500 0000000000000002 60000000000000e0 00000000000008c0
> [ 3982.202787] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000002c9a7e8
> [ 3982.202787] GPR28: c000000002c9be10 c0000013ff1d1500 0000000000000488 0000000000000950
> [ 3982.202839] NIP [c000000000469660] zone_reclaimable_pages+0x2a0/0x2c0
> [ 3982.202847] LR [c0000000004694a8] zone_reclaimable_pages+0xe8/0x2c0
> [ 3982.202852] Call Trace:
> [ 3982.202854] [c00000001d6af6b0] [5deadbeef0000122] 0x5deadbeef0000122 (unreliable)
> [ 3982.202861] [c00000001d6af710] [c0000000004696b8] allow_direct_reclaim.part.72+0x38/0x190
> [ 3982.202867] [c00000001d6af760] [c000000000469990] throttle_direct_reclaim+0x180/0x400
> [ 3982.202873] [c00000001d6af7e0] [c00000000046de88] try_to_free_pages+0xd8/0x2a0
> [ 3982.202879] [c00000001d6af8a0] [c0000000004e7370] __alloc_pages_slowpath.constprop.92+0x490/0x1000
> [ 3982.202886] [c00000001d6afa50] [c0000000004e822c] __alloc_pages+0x34c/0x3d0
> [ 3982.202893] [c00000001d6afad0] [c0000000004e8ce4] __folio_alloc+0x34/0x90
> [ 3982.202898] [c00000001d6afb00] [c00000000051ba50] vma_alloc_folio+0xe0/0x460
> [ 3982.202905] [c00000001d6afbc0] [c0000000004af108] do_pte_missing+0x2a8/0xca0
> [ 3982.202912] [c00000001d6afc10] [c0000000004b3590] __handle_mm_fault+0x3f0/0x1060
> [ 3982.202917] [c00000001d6afd20] [c0000000004b43c4] handle_mm_fault+0x1c4/0x330
> [ 3982.202923] [c00000001d6afd70] [c000000000092a14] ___do_page_fault+0x2d4/0xaa0
> [ 3982.202930] [c00000001d6afe20] [c0000000000934d0] do_page_fault+0xa0/0x2a0
> [ 3982.202936] [c00000001d6afe50] [c000000000008be0] data_access_common_virt+0x210/0x220
> [ 3982.202943] --- interrupt: 300 at 0x7fffb3cc6360
> [ 3982.202946] NIP: 00007fffb3cc6360 LR: 0000000010005644 CTR: 0000000000001200
> [ 3982.202950] REGS: c00000001d6afe80 TRAP: 0300 Tainted: G O (6.5.0-next-20230906)
> [ 3982.202955] MSR: 800000000200d033 CR: 44002444 XER: 00000000
> [ 3982.202966] CFAR: 00007fffb3cc6384 DAR: 00007fea3bc70000 DSISR: 42000000 IRQMASK: 0
> [ 3982.202966] GPR00: 0000000000002000 00007fffd0497ae0 0000000010057f00 00007fea3bc00000
> [ 3982.202966] GPR04: 0000000000000001 0000000000100000 00007fea3bc70000 0000000000000000
> [ 3982.202966] GPR08: 1000000000000000 00007fea3bc00000 0000000000000000 0000000000000000
> [ 3982.202966] GPR12: 00007fffb3cc62a0 00007fffb410b080 0000000000000000 0000000000000000
> [ 3982.202966] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 3982.202966] GPR20: 000000001002c260 000000001002c208 cccccccccccccccd a3d70a3d70a3d70b
> [ 3982.202966] GPR24: 000000001002c2d0 000000001002c238 00007fffb3e01888 000000001002c260
> [ 3982.202966] GPR28: 0000000000000000 000000001002c1f0 000000001002c218 0000000000000000
> [ 3982.203016] NIP [00007fffb3cc6360] 0x7fffb3cc6360
> [ 3982.203020] LR [0000000010005644] 0x10005644
> [ 3982.203023] --- interrupt: 300
> [ 3982.203026] Code: eb21ffc8 eb81ffe0 eba1ffe8 ebc1fff0 7fffd214 eb41ffd0 7c0803a6 7fe3fb78 ebe1fff8 4e800020 60000000 60000000 3900ffff 7909782c b12a0028
> [ 3982.203044] ---[ end trace 0000000000000000 ]---
> [ 3982.299095] pstore: backend (nvram) writing error (-1)
> [ 3982.299105]
> [ 3983.299108] Kernel panic - not syncing: Fatal exception
> [ 3983.564309] Rebooting in 10 seconds..
>
> Git bisect point to the following patch
>
> commit 92039ae85e8d018e82b9ba2597ca22e9851447fe
> mm: vmscan: try to reclaim swapcache pages if no swap space
>
> The system is configured with 60GB of memory and 4GB of swap.

Thanks for the report. I have sent a patch to fix it:
https://lore.kernel.org/linux-mm/20230908093103.2620512-1-liushixin2@huawei.com/

>
> - Sachin
>
> .
>