From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1144CC433EF for ; Fri, 17 Sep 2021 20:57:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A25BE610C8 for ; Fri, 17 Sep 2021 20:57:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A25BE610C8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B7A536B0071; Fri, 17 Sep 2021 16:57:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B29A06B0072; Fri, 17 Sep 2021 16:57:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9CACB900002; Fri, 17 Sep 2021 16:57:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0224.hostedemail.com [216.40.44.224]) by kanga.kvack.org (Postfix) with ESMTP id 8A1166B0071 for ; Fri, 17 Sep 2021 16:57:36 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3DF3C184D4141 for ; Fri, 17 Sep 2021 20:57:36 +0000 (UTC) X-FDA: 78598276512.22.E5086B1 Received: from mail-pf1-f169.google.com (mail-pf1-f169.google.com [209.85.210.169]) by imf17.hostedemail.com (Postfix) with ESMTP id EDE94F000136 for ; Fri, 17 Sep 2021 20:57:35 +0000 (UTC) Received: by mail-pf1-f169.google.com with SMTP id b7so10285770pfo.11 for ; Fri, 17 Sep 2021 13:57:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=TYwIkcglVtBUNs/8XYOkfDTuYuTaGRxf1IE1dNFTBP0=; b=q7bF15N1jlNUh1O71Xb0VCkGx2rFXrm3zhJwFrD6jSVue0z+ZyjawlrLwFbh+9KIP6 XSXFhCFVQWn0L/m4eegIy/+wME57I0X1ghaOgLw9DfnffY86nEG3n88f0zG5ZmUNketl 3nleyctPZ0pgttFP8RPpSCVNHRLI/+TS4FzqbSK4Lmi+P+gZ4SgpcsUT42uQzF/JrAum NwKKdz1a5DuKjFjeIahIWRvOAeAr78nbhsYSMjEE0QaBQ8P+6X/DYkaclmjGlK1piW/k etdIUGkoVNfdsvLCiyag7CohgicUIYK7zkwzBZ3kJRaDHp5oTlp1Pk6hdA/cQmPu/r54 tKWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=TYwIkcglVtBUNs/8XYOkfDTuYuTaGRxf1IE1dNFTBP0=; b=S1n6t3hbrqIZHCAlpNyFAIPHDmqbwmnq9VTjY+WnFiO2Q8eBh3aolfoSihHlTFUIvg PgY6kQP6vTjoMnGDajL/GQPaugrigNmWnENXw/j6pF87Ie82F5hCF+vMajylBNPT5UUQ vUqDCLGic29jYR9VnODzeYvN8RadGgNM2uZQ7b+coCUvikAZkjbTvMova9A9+mzbfsVv O/D/GgvWAAYNgkJ/mjKDCdrvvJllsZcQmGCLCnqDGGbddoT8UYJdzJQLnvfBFM/3Wlxm H30d6GBSSzGQftmtZOl86RJy+m6FPSGABJHTtS2VJ1ZKsuwsF7Ab1UlE6DgugWpx1Z7Q VN8A== X-Gm-Message-State: AOAM530mkddGOTts/gZXY/WULwbMDtcJnrDYz8mZOrSaMGVEG0GBEeHJ 3MMztnWKe+k+hhqK+Urcnd4= X-Google-Smtp-Source: ABdhPJwyMIF98TmO+53NAuE0Z6IeARMfNJBTMYgffoPm6vpRzgjO2GhfTaPMP5NVNK21K86yKtxoYA== X-Received: by 2002:a63:741b:: with SMTP id p27mr11638047pgc.140.1631912254733; Fri, 17 Sep 2021 13:57:34 -0700 (PDT) Received: from localhost.localdomain (c-73-93-239-127.hsd1.ca.comcast.net. [73.93.239.127]) by smtp.gmail.com with ESMTPSA id p15sm6781505pff.194.2021.09.17.13.57.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Sep 2021 13:57:33 -0700 (PDT) From: Yang Shi To: willy@infradead.org, hughd@google.com, cfijalkovich@google.com, song@kernel.org, akpm@linux-foundation.org Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] fs: buffer: check huge page size instead of single page for invalidatepage Date: Fri, 17 Sep 2021 13:57:31 -0700 Message-Id: <20210917205731.262693-1-shy828301@gmail.com> X-Mailer: git-send-email 2.26.2 MIME-Version: 1.0 X-Stat-Signature: whs7gr3qk4d175amiwoxhfagp35kbeyt Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=q7bF15N1; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf17.hostedemail.com: domain of shy828301@gmail.com designates 209.85.210.169 as permitted sender) smtp.mailfrom=shy828301@gmail.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: EDE94F000136 X-HE-Tag: 1631912255-425004 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hao Sun reported the below BUG on v5.15-rc1 kernel: kernel BUG at fs/buffer.c:1510! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.14.0+ #15 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events delayed_fput RIP: 0010:block_invalidatepage+0x27f/0x2a0 -origin/fs/buffer.c:1510 Code: ff ff e8 b4 07 d7 ff b9 02 00 00 00 be 02 00 00 00 4c 89 ff 48 c7 c2 40 4e 25 84 e8 2b c2 c4 02 e9 c9 fe ff ff e8 91 07 d7 ff <0f> 0b e8 8a 07 d7 ff 0f 0b e8 83 07 d7 ff 48 8d 5d ff e9 57 ff ff RSP: 0018:ffffc9000065bb60 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffea0000670000 RCX: 0000000000000000 RDX: ffff8880097fa240 RSI: ffffffff81608a9f RDI: ffffea0000670000 RBP: ffffea0000670000 R08: 0000000000000001 R09: 0000000000000000 R10: ffffc9000065b9f8 R11: 0000000000000003 R12: ffffffff81608820 R13: ffffc9000065bc68 R14: 0000000000000000 R15: ffffc9000065bbf0 FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000= 000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4aef93fb08 CR3: 0000000108cf2000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: do_invalidatepage -origin/mm/truncate.c:157 [inline] truncate_cleanup_page+0x15c/0x280 -origin/mm/truncate.c:176 truncate_inode_pages_range+0x169/0xc30 -origin/mm/truncate.c:325 kill_bdev.isra.29+0x28/0x30 blkdev_flush_mapping+0x4c/0x130 -origin/block/bdev.c:658 blkdev_put_whole+0x54/0x60 -origin/block/bdev.c:689 blkdev_put+0x6f/0x210 -origin/block/bdev.c:953 blkdev_close+0x25/0x30 -origin/block/fops.c:459 __fput+0xdf/0x380 -origin/fs/file_table.c:280 delayed_fput+0x25/0x40 -origin/fs/file_table.c:308 process_one_work+0x359/0x850 -origin/kernel/workqueue.c:2297 worker_thread+0x41/0x4d0 -origin/kernel/workqueue.c:2444 kthread+0x178/0x1b0 -origin/kernel/kthread.c:319 ret_from_fork+0x1f/0x30 -origin/arch/x86/entry/entry_64.S:295 Modules linked in: Dumping ftrace buffer: (ftrace buffer empty) ---[ end trace 9dbb8f58f2109f10 ]--- RIP: 0010:block_invalidatepage+0x27f/0x2a0 -origin/fs/buffer.c:1510 Code: ff ff e8 b4 07 d7 ff b9 02 00 00 00 be 02 00 00 00 4c 89 ff 48 c7 c2 40 4e 25 84 e8 2b c2 c4 02 e9 c9 fe ff ff e8 91 07 d7 ff <0f> 0b e8 8a 07 d7 ff 0f 0b e8 83 07 d7 ff 48 8d 5d ff e9 57 ff ff RSP: 0018:ffffc9000065bb60 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffea0000670000 RCX: 0000000000000000 RDX: ffff8880097fa240 RSI: ffffffff81608a9f RDI: ffffea0000670000 RBP: ffffea0000670000 R08: 0000000000000001 R09: 0000000000000000 R10: ffffc9000065b9f8 R11: 0000000000000003 R12: ffffffff81608820 R13: ffffc9000065bc68 R14: 0000000000000000 R15: ffffc9000065bbf0 FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000= 000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff98674f000 CR3: 0000000106b2e000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 The debugging showed the page passed to invalidatepage is a huge page and the length is the size of huge page instead of single page due to read only FS THP support. But block_invalidatepage() would throw BUG if the size is greater than single page. However there is actually a bigger problem in invalidatepage(). All the implementations are *NOT* THP aware and hardcoded PAGE_SIZE. Some trigge= rs BUG(), like block_invalidatepage(), some just returns error if length is greater than PAGE_SIZE. Converting PAGE_SIZE to thp_size() actually is not enough since the actua= l invalidation part just assumes single page is passed in. Since other sub= pages may have private too because PG_private is per subpage so there may be multiple subpages have private. This may prevent the THP from splitting and reclaiming since the extra refcount pins from private of subpages. The complete fix seems not trivial and involve how to deal with huge page in page cache. So the scope of this patch is limited to close the BUG at the moment. Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-= backed THPs") Reported-by: Hao Sun Tested-by: Hao Sun Cc: Signed-off-by: Yang Shi --- fs/buffer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index ab7573d72dd7..4bcb54c4d1be 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1507,7 +1507,7 @@ void block_invalidatepage(struct page *page, unsign= ed int offset, /* * Check for overflow */ - BUG_ON(stop > PAGE_SIZE || stop < length); + BUG_ON(stop > thp_size(page) || stop < length); =20 head =3D page_buffers(page); bh =3D head; @@ -1535,7 +1535,7 @@ void block_invalidatepage(struct page *page, unsign= ed int offset, * The get_block cached value has been unconditionally invalidated, * so real IO is not possible anymore. */ - if (length =3D=3D PAGE_SIZE) + if (length >=3D PAGE_SIZE) try_to_release_page(page, 0); out: return; --=20 2.26.2