From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Elliott, Robert (Persistent Memory)"
To: Andrew Morton, "Luck, Tony"
CC: Borislav Petkov, Dave Hansen, Naoya Horiguchi, "x86@kernel.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH-resend] mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
Date: Thu, 17 Aug 2017 22:29:48 +0000
References: <20170816171803.28342-1-tony.luck@intel.com> <20170817150942.017f87537b6cbb48e9cfc082@linux-foundation.org>
In-Reply-To: <20170817150942.017f87537b6cbb48e9cfc082@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Andrew Morton [mailto:akpm@linux-foundation.org]
> Sent: Thursday, August 17, 2017 5:10 PM
> To: Luck, Tony
> Cc: Borislav Petkov; Dave Hansen; Naoya Horiguchi;
> Elliott, Robert (Persistent Memory); x86@kernel.org;
> linux-mm@kvack.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH-resend] mm/hwpoison: Clear PRESENT bit for kernel 1:1
> mappings of poison pages
>
> On Wed, 16 Aug 2017 10:18:03 -0700 "Luck, Tony" wrote:
>
> > Speculative processor accesses may reference any memory that has a
> > valid page table entry. While a speculative access won't generate
> > a machine check, it will log the error in a machine check bank. That
> > could cause escalation of a subsequent error since the overflow bit
> > will be then set in the machine check bank status register.
> >
> > Code has to be double-plus-tricky to avoid mentioning the 1:1 virtual
> > address of the page we want to map out otherwise we may trigger the
> > very problem we are trying to avoid. We use a non-canonical address
> > that passes through the usual Linux table walking code to get to the
> > same "pte".
> >
> > Thanks to Dave Hansen for reviewing several iterations of this.
>
> It's unclear (to lil ole me) what the end-user-visible effects of this
> are.
>
> Could we please have a description of that? So a) people can
> understand your decision to cc:stable and b) people whose kernels are
> misbehaving can use your description to decide whether your patch might
> fix the issue their users are reporting.

In general, the system is subject to halting due to uncorrectable
memory errors at addresses that software is not even accessing.

The first error doesn't cause the crash, but if a second error happens
before the machine check handler services the first one, it'll find
the Overflow bit set and won't know what errors or how many errors
happened (e.g., it might have been problems in an instruction fetch,
and the instructions the CPU is slated to run are bogus). Halting is
the only safe thing to do.

For persistent memory, the BIOS reports known-bad addresses in the
ACPI ARS (address range scrub) table. They are likely to keep
reappearing every boot since it is persistent memory, so you can't
just reboot and hope they go away. Software is supposed to avoid
reading those addresses until it fixes them (e.g., writes new data
to those locations). Even if it follows this rule, the system can
still crash due to speculative reads (e.g., prefetches) touching
those addresses.

Tony's patch marks those addresses in the page tables so the CPU
won't speculatively try to read them.

---
Robert Elliott, HPE Persistent Memory
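
The overflow escalation described in the message can be illustrated with a toy model of a single machine check bank status register. This is only a sketch under simplifying assumptions: the bit positions (VAL = 63, OVER = 62, UC = 61) follow the architectural IA32_MCi_STATUS layout, but the helper names `log_error`, `service` and `handler_must_halt` are hypothetical, and the hardware's real overwrite rules and the kernel's severity grading are more detailed than this.

```python
# Toy model of one machine check bank status register (IA32_MCi_STATUS).
# Only three flag bits are modelled; everything else is ignored.
VAL = 1 << 63   # the bank holds a valid logged error
OVER = 1 << 62  # another error arrived while VAL was still set
UC = 1 << 61    # the logged error is uncorrected


def log_error(status: int, uncorrected: bool) -> int:
    """Return the status after the hardware logs one more error.

    Simplification (an assumption of this sketch): any error that arrives
    while the bank is still valid sets OVER.
    """
    if status & VAL:
        status |= OVER
    status |= VAL
    if uncorrected:
        status |= UC
    return status


def service(status: int) -> int:
    """Hypothetical handler step: read the bank, then clear it."""
    return 0


def handler_must_halt(status: int) -> bool:
    """Hypothetical policy: an overflowed uncorrected bank cannot be trusted."""
    return bool(status & VAL) and bool(status & OVER) and bool(status & UC)


# A speculative read of a poisoned page logs an error without raising
# a machine check, so nothing services the bank.
bank = log_error(0, uncorrected=True)
print(handler_must_halt(bank))   # one logged error on its own is survivable

# A second error arrives before the first was serviced: OVER is now set.
bank = log_error(bank, uncorrected=True)
print(handler_must_halt(bank))   # the handler can no longer tell what was lost
```

If the poisoned page is not mapped, the speculative read in the first step never happens, the bank is still clear when a later real error arrives, and the handler sees a single, fully described error rather than an overflowed one.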
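
The quoted patch description mentions using a non-canonical address that still walks to the same "pte". The sketch below shows why that can work on x86-64 with 4-level paging: the table walk only uses address bits 47:12, so flipping bit 63 changes whether the address is canonical without changing which page table entry it selects. It is not the patch's code. The `PAGE_OFFSET` value is an assumed, illustrative direct-map base, and all function names are hypothetical.

```python
PAGE_SHIFT = 12
# Assumed base of the kernel 1:1 (direct) mapping; illustrative only,
# the real value depends on kernel version and configuration.
PAGE_OFFSET = 0xFFFF_8800_0000_0000
MASK64 = (1 << 64) - 1
PAGE_PRESENT = 1 << 0  # x86 PTE "present" flag is bit 0


def direct_map_addr(pfn: int) -> int:
    """Kernel 1:1 virtual address of a page frame (under the assumed base)."""
    return (PAGE_OFFSET + (pfn << PAGE_SHIFT)) & MASK64


def decoy_addr(pfn: int) -> int:
    """Same address with bit 63 flipped, which makes it non-canonical."""
    return direct_map_addr(pfn) ^ (1 << 63)


def is_canonical(addr: int) -> bool:
    """With 48-bit virtual addresses, bits 63:47 must all be equal."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def table_indices(addr: int) -> tuple:
    """Indices used at the four paging levels (address bits 47:12)."""
    return tuple((addr >> shift) & 0x1FF for shift in (39, 30, 21, 12))


def clear_present(pte: int) -> int:
    """Return the entry with the present bit cleared."""
    return pte & ~PAGE_PRESENT


pfn = 0x12345  # hypothetical poisoned page frame number
real, decoy = direct_map_addr(pfn), decoy_addr(pfn)
print(hex(real), is_canonical(real))
print(hex(decoy), is_canonical(decoy))
print(table_indices(real) == table_indices(decoy))
```

Because the decoy can never be dereferenced successfully, code can carry it around and hand it to the table-walking helpers without the CPU having a valid translation to speculate through.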