From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 23 Jun 2023 16:40:11 +0000
Mime-Version: 1.0
X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog
Message-ID: <20230623164015.3431990-1-jiaqiyan@google.com>
Subject: [PATCH v2 0/4] Improve hugetlbfs read on HWPOISON hugepages
From: Jiaqi Yan
To: mike.kravetz@oracle.com, naoya.horiguchi@nec.com
Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com,
 akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com,
 Jiaqi Yan
Content-Type: text/plain; charset="UTF-8"

Today, when hardware memory is corrupted in a hugetlb hugepage, the kernel
leaves the hugepage in the pagecache [1]; otherwise future mmap or read
would be subject to silent data corruption.
This is implemented by returning -EIO from hugetlbfs_read_iter immediately
if the hugepage has the HWPOISON flag set.

Since memory_failure already tracks the raw HWPOISON subpages in a
hugepage, a natural improvement is possible: if userspace only asks for
healthy subpages in the pagecache, the kernel can return that data.

This patchset implements this improvement. The 1st commit fixes an issue
in __folio_free_raw_hwp. The 2nd commit exports the functionality to tell
whether a subpage inside a hugetlb hugepage is a raw HWPOISON page. The 3rd
commit teaches hugetlbfs_read_iter to return as many healthy bytes as
possible. The last commit tests this new feature.

[1] commit 8625147cafaa ("hugetlbfs: don't delete error page from pagecache")

Changelog

v1 => v2
* __folio_free_raw_hwp deletes all entries in raw_hwp_list before it
  traverses and frees raw_hwp_page.
* find_raw_hwp_page => __is_raw_hwp_subpage, and __is_raw_hwp_subpage
  only returns bool instead of a raw_hwp_page entry.
* is_raw_hwp_subpage holds hugetlb_lock while checking
  __is_raw_hwp_subpage.
* No need to do folio_lock in adjust_range_hwpoison.
* v2 is based on commit a6e79df92e4a ("mm/gup: disallow FOLL_LONGTERM
  GUP-fast writing to file-backed mappings")

Jiaqi Yan (4):
  mm/hwpoison: delete all entries before traversal in
    __folio_free_raw_hwp
  mm/hwpoison: check if a subpage of a hugetlb folio is raw HWPOISON
  hugetlbfs: improve read HWPOISON hugepage
  selftests/mm: add tests for HWPOISON hugetlbfs read

 fs/hugetlbfs/inode.c                          |  58 +++-
 include/linux/hugetlb.h                       |  19 ++
 include/linux/mm.h                            |   7 +
 mm/hugetlb.c                                  |  10 +
 mm/memory-failure.c                           |  42 ++-
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb-read-hwpoison.c      | 322 ++++++++++++++++++
 8 files changed, 439 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/mm/hugetlb-read-hwpoison.c

-- 
2.41.0.162.gfafddb0af9-goog
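
For readers who want a concrete picture of "return as many healthy bytes as
possible", here is a rough userspace model. It is not the code in this
series: the function name healthy_bytes, the bool array standing in for the
raw HWPOISON tracking, and the assumption that a read stops at the first
poisoned subpage are all illustrative choices made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only; hypothetical helper, not the kernel implementation.
 *
 * poisoned[i] is true when the i-th subpage of the hugepage is a raw
 * HWPOISON page. Returns how many of the `want` bytes starting at byte
 * `offset` can be read before the first poisoned subpage is reached.
 */
static size_t healthy_bytes(const bool *poisoned, size_t nr_subpages,
                            size_t subpage_size, size_t offset, size_t want)
{
    size_t hugepage_size;
    size_t safe = 0;

    assert(subpage_size > 0);
    hugepage_size = nr_subpages * subpage_size;

    /* Clamp the request to the end of the hugepage. */
    if (offset >= hugepage_size)
        return 0;
    if (want > hugepage_size - offset)
        want = hugepage_size - offset;

    while (safe < want) {
        size_t pos = offset + safe;
        size_t idx = pos / subpage_size;
        size_t chunk;

        /* Stop at the first poisoned subpage; nothing past it is returned. */
        if (poisoned[idx])
            break;

        /* Bytes left in this subpage, capped by what is still wanted. */
        chunk = subpage_size - pos % subpage_size;
        if (chunk > want - safe)
            chunk = want - safe;
        safe += chunk;
    }
    return safe;
}
```

In this model a caller would copy out the returned number of bytes and
report an error such as -EIO only when the result is 0 for a non-empty
request, i.e. when the very first byte asked for sits in a poisoned
subpage. Whether the series behaves exactly this way at every boundary is
for the patches themselves to say.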