Date: Mon, 22 Feb 2021 23:55:02 +0200
From: Mike Rapoport
To: George Kennedy
Cc: David Hildenbrand, Andrey Konovalov, Andrew Morton, Catalin Marinas,
 Vincenzo Frascino, Dmitry Vyukov, Konrad Rzeszutek Wilk, Will Deacon,
 Andrey Ryabinin, Alexander Potapenko, Marco Elver, Peter Collingbourne,
 Evgenii Stepanov, Branislav Rankov, Kevin Brodsky, Christoph Hellwig,
 kasan-dev, Linux ARM, Linux Memory Management List, LKML, Dhaval Giani
Subject: Re: [PATCH] mm, kasan: don't poison boot memory
Message-ID: <20210222215502.GB1741768@linux.ibm.com>
References: <487751e1ccec8fcd32e25a06ce000617e96d7ae1.1613595269.git.andreyknvl@google.com>
 <797fae72-e3ea-c0b0-036a-9283fa7f2317@oracle.com>
 <1ac78f02-d0af-c3ff-cc5e-72d6b074fc43@redhat.com>
 <56c97056-6d8b-db0e-e303-421ee625abe3@redhat.com>

On Mon, Feb 22, 2021 at 01:42:56PM -0500, George Kennedy wrote:
>
> On 2/22/2021 11:13 AM, David Hildenbrand wrote:
> > On 22.02.21 16:13, George Kennedy wrote:
> > >
> > > On 2/22/2021 4:52 AM, David Hildenbrand wrote:
> > > >
> > > > Let me look into the code ... I have little experience with ACPI
> > > > details, so bear with me.
> > > >
> > > > I assume that acpi_map()/acpi_unmap() map some firmware blob that is
> > > > provided via firmware/bios/... to us.
> > > >
> > > > should_use_kmap() tells us whether
> > > > a) we have a "struct page" and should kmap() that one
> > > > b) we don't have a "struct page" and should ioremap.
> > > >
> > > > As it is a blob, the firmware should always reserve that memory region
> > > > via memblock (e.g., memblock_reserve()), such that we either
> > > > 1) don't create a memmap ("struct page") at all (-> case b) )
> > > > 2) if we have to create a memmap, we mark the page PG_reserved and
> > > >     *never* expose it to the buddy (-> case a) )
> > > >
> > > >
> > > > Are you telling me that in this case we might have a memmap for the HW
> > > > blob that is *not* PG_reserved? In that case it most probably got
> > > > exposed to the buddy where it can happily get allocated/freed.
> > > >
> > > > The latent BUG would be that that blob gets exposed to the system like
> > > > ordinary RAM, and not reserved via memblock early during boot.
> > > > Assuming that blob has a low physical address, with my patch it will
> > > > get allocated/used a lot earlier - which would mean we trigger this
> > > > latent BUG now more easily.
> > > >
> > > > There have been similar latent BUGs on ARM boards that my patch
> > > > discovered where special RAM regions did not get marked as reserved
> > > > via the device tree properly.
> > > >
> > > > Now, this is just a wild guess :) Can you dump the page when mapping
> > > > (before PageReserved()) and when unmapping, to see what the state of
> > > > that memmap is?
> > >
> > > Thank you David for the explanation and your help on this,
> > >
> > > dump_page() before PageReserved and before kmap() in the above patch:
> > >
> > > [    1.116480] ACPI: Core revision 20201113
> > > [    1.117628] XXX acpi_map: about to call kmap()...
> > > [    1.118561] page:ffffea0002f914c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbe453
> > > [    1.120381] flags: 0xfffffc0000000()
> > > [    1.121116] raw: 000fffffc0000000 ffffea0002f914c8 ffffea0002f914c8 0000000000000000
> > > [    1.122638] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> > > [    1.124146] page dumped because: acpi_map pre SetPageReserved
> > >
> > > I also added dump_page() before unmapping, but it is not hit.
> > > The following for the same pfn now shows up I believe as a result of setting
> > > PageReserved:
> > >
> > > [   28.098208] BUG: Bad page state in process modprobe  pfn:be453
> > > [   28.098394] page:ffffea0002f914c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0xbe453
> > > [   28.098394] flags: 0xfffffc0001000(reserved)
> > > [   28.098394] raw: 000fffffc0001000 dead000000000100 dead000000000122 0000000000000000
> > > [   28.098394] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> > > [   28.098394] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
> > > [   28.098394] page_owner info is not present (never set?)
> > > [   28.098394] Modules linked in:
> > > [   28.098394] CPU: 2 PID: 204 Comm: modprobe Not tainted 5.11.0-3dbd5e3 #66
> > > [   28.098394] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > > [   28.098394] Call Trace:
> > > [   28.098394]  dump_stack+0xdb/0x120
> > > [   28.098394]  bad_page.cold.108+0xc6/0xcb
> > > [   28.098394]  check_new_page_bad+0x47/0xa0
> > > [   28.098394]  get_page_from_freelist+0x30cd/0x5730
> > > [   28.098394]  ? __isolate_free_page+0x4f0/0x4f0
> > > [   28.098394]  ? init_object+0x7e/0x90
> > > [   28.098394]  __alloc_pages_nodemask+0x2d8/0x650
> > > [   28.098394]  ? write_comp_data+0x2f/0x90
> > > [   28.098394]  ? __alloc_pages_slowpath.constprop.103+0x2110/0x2110
> > > [   28.098394]  ? __sanitizer_cov_trace_pc+0x21/0x50
> > > [   28.098394]  alloc_pages_vma+0xe2/0x560
> > > [   28.098394]  do_fault+0x194/0x12c0
> > > [   28.098394]  ? write_comp_data+0x2f/0x90
> > > [   28.098394]  __handle_mm_fault+0x1650/0x26c0
> > > [   28.098394]  ? copy_page_range+0x1350/0x1350
> > > [   28.098394]  ? write_comp_data+0x2f/0x90
> > > [   28.098394]  ? write_comp_data+0x2f/0x90
> > > [   28.098394]  handle_mm_fault+0x1f9/0x810
> > > [   28.098394]  ? write_comp_data+0x2f/0x90
> > > [   28.098394]  do_user_addr_fault+0x6f7/0xca0
> > > [   28.098394]  exc_page_fault+0xaf/0x1a0
> > > [   28.098394]  asm_exc_page_fault+0x1e/0x30
> > > [   28.098394] RIP: 0010:__clear_user+0x30/0x60
> >
> > I think the PAGE_FLAGS_CHECK_AT_PREP check in this instance means that
> > someone is trying to allocate that page with the PG_reserved bit set.
> > This means that the page actually was exposed to the buddy.
> >
> > However, when you SetPageReserved(), I don't think that PG_buddy is set
> > and the refcount is 0. That could indicate that the page is on the buddy
> > PCP list. Could be that it is getting reused a couple of times.
> >
> > The PFN 0xbe453 looks a little strange, though. Do we expect ACPI tables
> > close to 3 GiB? No idea. Could it be that you are trying to map a wrong
> > table? Just a guess.
> >
> > >
> > > What would be the correct way to reserve the page so that the above
> > > would not be hit?
> >
> > I would have assumed that if this is a binary blob, that someone (which
> > I think would be acpi code) reserved via memblock_reserve() early during
> > boot.
> >
> > E.g., see drivers/acpi/tables.c:acpi_table_upgrade()->memblock_reserve().
>
> acpi_table_upgrade() gets called, but bails out before memblock_reserve() is
> called. Thus, it appears no pages are getting reserved.

acpi_table_upgrade() does not actually reserve memory for existing
tables; it open-codes a memblock allocation with
memblock_find_in_range() + memblock_reserve(), so it does not seem
related anyway.

Do you happen to have a full boot log handy?

>     503 void __init acpi_table_upgrade(void)
>     504 {
...
>     568         if (table_nr == 0)
>     569                 return;                 <-- bails out here
> "drivers/acpi/tables.c"
>
> George
>

-- 
Sincerely yours,
Mike.
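
For anyone following the numbers in the quoted dumps, here is a minimal userspace sketch (not kernel code) of the two bits of arithmetic behind them. It assumes 4 KiB pages, which is not stated in the thread but matches the x86 QEMU guest in the log; the names sketch_pfn_to_phys() and sketch_changed_flags() are made up for illustration. Under that assumption, pfn 0xbe453 starts at physical address 0xbe453000, just under 3 GiB, and the only bit that differs between the two raw flags words is 0x1000, which the second dump decodes as "reserved".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Physical address of the first byte of the page frame `pfn`. */
uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
	/* Guard against shifting significant bits out of the 64-bit word. */
	assert(pfn <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Bits that differ between two raw page->flags words from dump_page(). */
uint64_t sketch_changed_flags(uint64_t before, uint64_t after)
{
	return before ^ after;
}
```

Feeding it the values from the log (pfn 0xbe453, and the raw words 000fffffc0000000 and 000fffffc0001000) reproduces the address and the single changed bit mentioned above.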