Subject: Re: [PATCH] mm/page_owner: fix a crash after memory offline
To: Qian Cai, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, "mhocko@suse.com"
References: <1570732366-16426-1-git-send-email-cai@lca.pw>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <2e36a929-0fc7-d32a-d838-de746ff071fc@redhat.com>
Date: Thu, 10 Oct 2019 20:55:41 +0200
In-Reply-To: <1570732366-16426-1-git-send-email-cai@lca.pw>

On 10.10.19 20:32, Qian Cai wrote:
> The linux-next series "mm/memory_hotplug: Shrink zones before removing
> memory" [1] seems make a crash easier to reproduce while reading
> /proc/pagetypeinfo after offlining a memory section. Fix it by using
> pfn_to_online_page() in the PFN walker.

Can you please rephrase the subject+description to describe the actual
problem and drop the reference to the series?
E.g., similar to my recent patches:

"mm/page_owner: Don't access uninitialized memmaps when reading
/proc/pagetypeinfo

Uninitialized memmaps contain garbage and in the worst case trigger
kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.

For example, when not onlining a memory block that is spanned by a zone
and reading /proc/pagetypeinfo, we can trigger a kernel BUG:

...
"

However, you also have to justify why it is okay to no longer consider
ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to
assert_populated == true and ZONE_DEVICE will never be populated.
Therefore, we will never end up in this code path with ZONE_DEVICE).

>
> [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/
>
> page:ffffea0021200000 is uninitialized and poisoned
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> There is not page extension available.
> ------------[ cut here ]------------
> kernel BUG at include/linux/mm.h:1107!
> RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680
> Call Trace:
>  walk_zones_in_node+0x3a/0xc0
>  pagetypeinfo_show+0x260/0x2c0
>  seq_read+0x27e/0x710
>  proc_reg_read+0x12e/0x190
>  __vfs_read+0x50/0xa0
>  vfs_read+0xcb/0x1e0
>  ksys_read+0xc6/0x160
>  __x64_sys_read+0x43/0x50
>  do_syscall_64+0xcc/0xaec
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Signed-off-by: Qian Cai
> ---
>  mm/page_owner.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index dee931184788..03a6b19b3cdd 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>  		pageblock_mt = get_pageblock_migratetype(page);
>

What about the pfn_valid() in the outermost loop?
You can skip over the whole pageblock if the first page is not online.

>  		for (; pfn < block_end_pfn; pfn++) {
> -			if (!pfn_valid_within(pfn))
> +			page = pfn_to_online_page(pfn);
> +			if (!page)
>  				continue;
>
> -			page = pfn_to_page(pfn);
> -
>  			if (page_zone(page) != zone)
>  				continue;
>
>

-- 
Thanks,

David / dhildenb
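
To illustrate the remark about the outermost loop, here is a minimal
userspace sketch of a PFN walker that checks the first page of each
pageblock and skips the whole block when that page is not online. It is
only a model of the control flow, not the code in mm/page_owner.c: every
model_* name and MODEL_* constant is hypothetical, the sizes are tiny on
purpose, and it assumes a pageblock never straddles a section boundary
(the model guarantees this by making the section size a multiple of the
pageblock size).

```c
/* Userspace model only; none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGEBLOCK_NR_PAGES 4UL	/* hypothetical, tiny for illustration */
#define MODEL_SECTION_NR_PAGES   8UL	/* a multiple of the pageblock size */
#define MODEL_NR_SECTIONS        4UL
#define MODEL_MAX_PFN (MODEL_SECTION_NR_PAGES * MODEL_NR_SECTIONS)

struct model_page {
	unsigned long pfn;
};

struct model_page model_memmap[MODEL_MAX_PFN];
bool model_section_online[MODEL_NR_SECTIONS];
unsigned long model_lookups;	/* counts calls to the lookup helper */

/* Mark one section of the model online or offline. */
void model_set_section_online(unsigned long nr, bool online)
{
	assert(nr < MODEL_NR_SECTIONS);
	model_section_online[nr] = online;
}

/* Stand-in for pfn_to_online_page(): NULL unless the section is online. */
struct model_page *model_pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn / MODEL_SECTION_NR_PAGES;

	model_lookups++;
	if (nr >= MODEL_NR_SECTIONS || !model_section_online[nr])
		return NULL;
	model_memmap[pfn].pfn = pfn;
	return &model_memmap[pfn];
}

/*
 * Walk [start_pfn, end_pfn) one pageblock at a time and return how many
 * online pages were visited.
 */
unsigned long model_count_online_pages(unsigned long start_pfn,
				       unsigned long end_pfn)
{
	unsigned long pfn = start_pfn;
	unsigned long visited = 0;

	while (pfn < end_pfn) {
		/* End of the pageblock containing pfn, clamped to the range. */
		unsigned long block_end =
			(pfn / MODEL_PAGEBLOCK_NR_PAGES + 1) *
			MODEL_PAGEBLOCK_NR_PAGES;

		if (block_end > end_pfn)
			block_end = end_pfn;

		/* Outer check: first page offline, so skip the whole block. */
		if (!model_pfn_to_online_page(pfn)) {
			pfn = block_end;
			continue;
		}

		/* Inner per-page check, mirroring the quoted hunk. */
		for (; pfn < block_end; pfn++) {
			if (!model_pfn_to_online_page(pfn))
				continue;
			visited++;
		}
	}

	return visited;
}
```

With sections 0 and 2 online and the rest offline, walking all 32 PFNs
of the model visits 16 pages using 24 lookups: each offline pageblock
costs a single lookup instead of one per page.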