All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "david@redhat.com" <david@redhat.com>,
	"cl@linux.com" <cl@linux.com>,
	"gor@linux.ibm.com" <gor@linux.ibm.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"will@kernel.org" <will@kernel.org>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"rientjes@google.com" <rientjes@google.com>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"paulus@samba.org" <paulus@samba.org>,
	"hca@linux.ibm.com" <hca@linux.ibm.com>,
	"bp@alien8.de" <bp@alien8.de>, "pavel@ucw.cz" <pavel@ucw.cz>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"luto@kernel.org" <luto@kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"Brown, Len" <len.brown@intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>
Subject: Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
Date: Mon, 26 Oct 2020 18:05:30 +0000	[thread overview]
Message-ID: <a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com> (raw)
In-Reply-To: <20201026090526.GA1154158@kernel.org>

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.



WARNING: multiple messages have this Message-ID (diff)
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"david@redhat.com" <david@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"paulus@samba.org" <paulus@samba.org>,
	"pavel@ucw.cz" <pavel@ucw.cz>, "hpa@zytor.com" <hpa@zytor.com>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"cl@linux.com" <cl@linux.com>,
	"will@kernel.org" <will@kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"x86@kernel.org" <x86@kernel.org>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"Brown, Len" <len.brown@intel.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"gor@linux.ibm.com" <gor@linux.ibm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"hca@linux.ibm.com" <hca@linux.ibm.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"luto@kernel.org" <luto@kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
Date: Mon, 26 Oct 2020 18:05:30 +0000	[thread overview]
Message-ID: <a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com> (raw)
In-Reply-To: <20201026090526.GA1154158@kernel.org>

T24gTW9uLCAyMDIwLTEwLTI2IGF0IDExOjA1ICswMjAwLCBNaWtlIFJhcG9wb3J0IHdyb3RlOg0K
PiBPbiBNb24sIE9jdCAyNiwgMjAyMCBhdCAwMToxMzo1MkFNICswMDAwLCBFZGdlY29tYmUsIFJp
Y2sgUCB3cm90ZToNCj4gPiBPbiBTdW4sIDIwMjAtMTAtMjUgYXQgMTI6MTUgKzAyMDAsIE1pa2Ug
UmFwb3BvcnQgd3JvdGU6DQo+ID4gPiBJbmRlZWQsIGZvciBhcmNoaXRlY3R1cmVzIHRoYXQgZGVm
aW5lDQo+ID4gPiBDT05GSUdfQVJDSF9IQVNfU0VUX0RJUkVDVF9NQVANCj4gPiA+IGl0IGlzDQo+
ID4gPiBwb3NzaWJsZSB0aGF0IF9fa2VybmVsX21hcF9wYWdlcygpIHdvdWxkIGZhaWwsIGJ1dCBz
aW5jZSB0aGlzDQo+ID4gPiBmdW5jdGlvbiBpcw0KPiA+ID4gdm9pZCwgdGhlIGZhaWx1cmUgd2ls
bCBnbyB1bm5vdGljZWQuDQo+ID4gDQo+ID4gQ291bGQgeW91IGVsYWJvcmF0ZSBvbiBob3cgdGhp
cyBjb3VsZCBoYXBwZW4/IERvIHlvdSBtZWFuIGR1cmluZw0KPiA+IHJ1bnRpbWUgdG9kYXkgb3Ig
aWYgc29tZXRoaW5nIG5ldyB3YXMgaW50cm9kdWNlZD8NCj4gDQo+IEEgZmFpbHVyZSBpbl9fa2Vy
bmVsX21hcF9wYWdlcygpIG1heSBoYXBwZW4gdG9kYXkuIEZvciBpbnN0YW5jZSwgb24NCj4geDg2
DQo+IGlmIHRoZSBrZXJuZWwgaXMgYnVpbHQgd2l0aCBERUJVR19QQUdFQUxMT0MuDQo+IA0KPiAg
ICAgICAgIF9fa2VybmVsX21hcF9wYWdlcyhwYWdlLCAxLCAwKTsNCj4gDQo+IHdpbGwgbmVlZCB0
byBzcGxpdCwgc2F5LCAyTSBwYWdlIGFuZCBkdXJpbmcgdGhlIHNwbGl0IGFuIGFsbG9jYXRpb24N
Cj4gb2YNCj4gcGFnZSB0YWJsZSBjb3VsZCBmYWlsLg0KDQpPbiB4ODYgYXQgbGVhc3QsIERFQlVH
X1BBR0VBTExPQyBleHBlY3RzIHRvIG5ldmVyIGhhdmUgdG8gYnJlYWsgYSBwYWdlDQpvbiB0aGUg
ZGlyZWN0IG1hcCBhbmQgZXZlbiBkaXNhYmxlcyBsb2NraW5nIGluIGNwYSBiZWNhdXNlIGl0IGFz
c3VtZXMNCnRoaXMuIElmIHRoaXMgaXMgaGFwcGVuaW5nIHNvbWVob3cgYW55d2F5IHRoZW4gd2Ug
c2hvdWxkIHByb2JhYmx5IGZpeA0KdGhhdC4gRXZlbiBpZiBpdCdzIGEgZGVidWcgZmVhdHVyZSwg
aXQgd2lsbCBub3QgYmUgYXMgdXNlZnVsIGlmIGl0IGlzDQpjYXVzaW5nIGl0cyBvd24gY3Jhc2hl
cy4NCg0KSSdtIHN0aWxsIHdvbmRlcmluZyBpZiB0aGVyZSBpcyBzb21ldGhpbmcgSSdtIG1pc3Np
bmcgaGVyZS4gSXQgc2VlbXMNCmxpa2UgeW91IGFyZSBzYXlpbmcgdGhlcmUgaXMgYSBidWcgaW4g
c29tZSBhcmNoJ3MsIHNvIGxldCdzIGFkZCBhIFdBUk4NCmluIGNyb3NzLWFyY2ggY29kZSB0byBs
b2cgaXQgYXMgaXQgY3Jhc2hlcy4gQSB3YXJuIGFuZCBtYWtpbmcgdGhpbmdzDQpjbGVhcmVyIHNl
ZW0gbGlrZSBnb29kIGlkZWFzLCBidXQgaWYgdGhlcmUgaXMgYSBidWcgd2Ugc2hvdWxkIGZpeCBp
dC4NClRoZSBjb2RlIGFyb3VuZCB0aGUgY2FsbGVycyBzdGlsbCBmdW5jdGlvbmFsbHkgYXNzdW1l
IHJlLW1hcHBpbmcgY2FuJ3QNCmZhaWwuDQoNCj4gQ3VycmVudGx5LCB0aGUgb25seSB1c2VyIG9m
IF9fa2VybmVsX21hcF9wYWdlcygpIG91dHNpZGUNCj4gREVCVUdfUEFHRUFMTE9DDQo+IGlzIGhp
YmVybmF0aW9uLCBidXQgSSB0aGluayBpdCB3b3VsZCBiZSBzYWZlciB0byBlbnRpcmVseSBwcmV2
ZW50DQo+IHVzYWdlDQo+IG9mIF9fa2VybmVsX21hcF9wYWdlcygpIHdoZW4gREVCVUdfUEFHRUFM
TE9DPW4uDQoNCkkgdG90YWxseSBhZ3JlZSBpdCdzIGVycm9yIHByb25lIEZXSVcuIE9uIHg4Niwg
bXkgbWVudGFsIG1vZGVsIG9mIGhvdw0KaXQgaXMgc3VwcG9zZWQgdG8gd29yayBpczogSWYgYSBw
YWdlIGlzIDRrIGFuZCBOUCBpdCBjYW5ub3QgZmFpbCB0byBiZQ0KcmVtYXBwZWQuIHNldF9kaXJl
Y3RfbWFwX2ludmFsaWRfbm9mbHVzaCgpIHNob3VsZCByZXN1bHQgaW4gNGsgTlANCnBhZ2VzLCBh
bmQgREVCVUdfUEFHRUFMTE9DIHNob3VsZCByZXN1bHQgaW4gYWxsIDRrIHBhZ2VzIG9uIHRoZSBk
aXJlY3QNCm1hcC4gQXJlIHlvdSBzZWVpbmcgdGhpcyB2aW9sYXRlZCBvciBkbyBJIGhhdmUgd3Jv
bmcgYXNzdW1wdGlvbnM/DQoNCkJleW9uZCB3aGF0ZXZlciB5b3UgYXJlIHNlZWluZywgZm9yIHRo
ZSBsYXR0ZXIgY2FzZSBvZiBuZXcgdGhpbmdzDQpnZXR0aW5nIGludHJvZHVjZWQgdG8gYW4gaW50
ZXJmYWNlIHdpdGggaGlkZGVuIGRlcGVuZGVuY2llcy4uLiBBbm90aGVyDQplZGdlIGNhc2UgY291
bGQgYmUgYSBuZXcgY2FsbGVyIHRvIHNldF9tZW1vcnlfbnAoKSBjb3VsZCByZXN1bHQgaW4NCmxh
cmdlIE5QIHBhZ2VzLiBOb25lIG9mIHRoZSBjYWxsZXJzIHRvZGF5IHNob3VsZCBjYXVzZSB0aGlz
IEFGQUlDVCwgYnV0DQppdCdzIG5vdCBncmVhdCB0byByZWx5IG9uIHRoZSBjYWxsZXJzIHRvIGtu
b3cgdGhlc2UgZGV0YWlscy4NCg0KDQo

WARNING: multiple messages have this Message-ID (diff)
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"david@redhat.com" <david@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"paulus@samba.org" <paulus@samba.org>,
	"pavel@ucw.cz" <pavel@ucw.cz>, "hpa@zytor.com" <hpa@zytor.com>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"cl@linux.com" <cl@linux.com>,
	"will@kernel.org" <will@kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"x86@kernel.org" <x86@kernel.org>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"Brown, Len" <len.brown@intel.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"gor@linux.ibm.com" <gor@linux.ibm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"hca@linux.ibm.com" <hca@linux.ibm.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"luto@kernel.org" <luto@kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
Date: Mon, 26 Oct 2020 18:05:30 +0000	[thread overview]
Message-ID: <a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com> (raw)
In-Reply-To: <20201026090526.GA1154158@kernel.org>

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "david@redhat.com" <david@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"paulus@samba.org" <paulus@samba.org>,
	"pavel@ucw.cz" <pavel@ucw.cz>, "hpa@zytor.com" <hpa@zytor.com>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"cl@linux.com" <cl@linux.com>,
	"will@kernel.org" <will@kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"Brown, Len" <len.brown@intel.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"gor@linux.ibm.com" <gor@linux.ibm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"hca@linux.ibm.com" <hca@linux.ibm.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"luto@kernel.org" <luto@kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
Date: Mon, 26 Oct 2020 18:05:30 +0000	[thread overview]
Message-ID: <a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com> (raw)
In-Reply-To: <20201026090526.GA1154158@kernel.org>

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.



WARNING: multiple messages have this Message-ID (diff)
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"david@redhat.com" <david@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"paulus@samba.org" <paulus@samba.org>,
	"pavel@ucw.cz" <pavel@ucw.cz>, "hpa@zytor.com" <hpa@zytor.com>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"cl@linux.com" <cl@linux.com>,
	"will@kernel.org" <will@kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"x86@kernel.org" <x86@kernel.org>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"Brown, Len" <len.brown@intel.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"gor@linux.ibm.com" <gor@linux.ibm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"hca@linux.ibm.com" <hca@linux.ibm.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"luto@kernel.org" <luto@kernel.org>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"penberg@kernel.org" <penberg@kernel.org>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
Date: Mon, 26 Oct 2020 18:05:30 +0000	[thread overview]
Message-ID: <a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com> (raw)
In-Reply-To: <20201026090526.GA1154158@kernel.org>

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-10-26 18:05 UTC|newest]

Thread overview: 219+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-25 10:15 [PATCH 0/4] arch, mm: improve robustness of direct map manipulation Mike Rapoport
2020-10-25 10:15 ` Mike Rapoport
2020-10-25 10:15 ` Mike Rapoport
2020-10-25 10:15 ` Mike Rapoport
2020-10-25 10:15 ` Mike Rapoport
2020-10-25 10:15 ` [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-26 11:05   ` David Hildenbrand
2020-10-26 11:05     ` David Hildenbrand
2020-10-26 11:05     ` David Hildenbrand
2020-10-26 11:05     ` David Hildenbrand
2020-10-26 11:05     ` David Hildenbrand
2020-10-26 11:54     ` Mike Rapoport
2020-10-26 11:54       ` Mike Rapoport
2020-10-26 11:54       ` Mike Rapoport
2020-10-26 11:54       ` Mike Rapoport
2020-10-26 11:54       ` Mike Rapoport
2020-10-26 11:55       ` David Hildenbrand
2020-10-26 11:55         ` David Hildenbrand
2020-10-26 11:55         ` David Hildenbrand
2020-10-26 11:55         ` David Hildenbrand
2020-10-26 11:55         ` David Hildenbrand
2020-10-25 10:15 ` [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-26  0:38   ` Edgecombe, Rick P
2020-10-26  0:38     ` Edgecombe, Rick P
2020-10-26  0:38     ` Edgecombe, Rick P
2020-10-26  0:38     ` Edgecombe, Rick P
2020-10-26  0:38     ` Edgecombe, Rick P
2020-10-26  0:38     ` Edgecombe, Rick P
2020-10-26  9:15     ` Mike Rapoport
2020-10-26  9:15       ` Mike Rapoport
2020-10-26  9:15       ` Mike Rapoport
2020-10-26  9:15       ` Mike Rapoport
2020-10-26  9:15       ` Mike Rapoport
2020-10-26  9:15       ` Mike Rapoport
2020-10-26 18:57       ` Edgecombe, Rick P
2020-10-26 18:57         ` Edgecombe, Rick P
2020-10-26 18:57         ` Edgecombe, Rick P
2020-10-26 18:57         ` Edgecombe, Rick P
2020-10-26 18:57         ` Edgecombe, Rick P
2020-10-26 18:57         ` Edgecombe, Rick P
2020-10-27  8:49         ` Mike Rapoport
2020-10-27  8:49           ` Mike Rapoport
2020-10-27  8:49           ` Mike Rapoport
2020-10-27  8:49           ` Mike Rapoport
2020-10-27  8:49           ` Mike Rapoport
2020-10-27  8:49           ` Mike Rapoport
2020-10-27 22:44           ` Edgecombe, Rick P
2020-10-27 22:44             ` Edgecombe, Rick P
2020-10-27 22:44             ` Edgecombe, Rick P
2020-10-27 22:44             ` Edgecombe, Rick P
2020-10-27 22:44             ` Edgecombe, Rick P
2020-10-27 22:44             ` Edgecombe, Rick P
2020-10-28  9:41             ` Mike Rapoport
2020-10-28  9:41               ` Mike Rapoport
2020-10-28  9:41               ` Mike Rapoport
2020-10-28  9:41               ` Mike Rapoport
2020-10-28  9:41               ` Mike Rapoport
2020-10-28  9:41               ` Mike Rapoport
2020-10-27  1:10       ` Edgecombe, Rick P
2020-10-27  1:10         ` Edgecombe, Rick P
2020-10-27  1:10         ` Edgecombe, Rick P
2020-10-27  1:10         ` Edgecombe, Rick P
2020-10-27  1:10         ` Edgecombe, Rick P
2020-10-27  1:10         ` Edgecombe, Rick P
2020-10-28 21:15   ` Edgecombe, Rick P
2020-10-28 21:15     ` Edgecombe, Rick P
2020-10-28 21:15     ` Edgecombe, Rick P
2020-10-28 21:15     ` Edgecombe, Rick P
2020-10-28 21:15     ` Edgecombe, Rick P
2020-10-28 21:15     ` Edgecombe, Rick P
2020-10-29  7:54     ` Mike Rapoport
2020-10-29  7:54       ` Mike Rapoport
2020-10-29  7:54       ` Mike Rapoport
2020-10-29  7:54       ` Mike Rapoport
2020-10-29  7:54       ` Mike Rapoport
2020-10-29  7:54       ` Mike Rapoport
2020-10-29 23:19       ` Edgecombe, Rick P
2020-10-29 23:19         ` Edgecombe, Rick P
2020-10-29 23:19         ` Edgecombe, Rick P
2020-10-29 23:19         ` Edgecombe, Rick P
2020-10-29 23:19         ` Edgecombe, Rick P
2020-10-29 23:19         ` Edgecombe, Rick P
2020-11-01 17:02         ` Mike Rapoport
2020-11-01 17:02           ` Mike Rapoport
2020-11-01 17:02           ` Mike Rapoport
2020-11-01 17:02           ` Mike Rapoport
2020-11-01 17:02           ` Mike Rapoport
2020-11-01 17:02           ` Mike Rapoport
2020-10-25 10:15 ` [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15 ` [PATCH 4/4] arch, mm: make kernel_page_present() always available Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-25 10:15   ` Mike Rapoport
2020-10-26  0:54   ` Edgecombe, Rick P
2020-10-26  0:54     ` Edgecombe, Rick P
2020-10-26  0:54     ` Edgecombe, Rick P
2020-10-26  0:54     ` Edgecombe, Rick P
2020-10-26  0:54     ` Edgecombe, Rick P
2020-10-26  0:54     ` Edgecombe, Rick P
2020-10-26  9:31     ` Mike Rapoport
2020-10-26  9:31       ` Mike Rapoport
2020-10-26  9:31       ` Mike Rapoport
2020-10-26  9:31       ` Mike Rapoport
2020-10-26  9:31       ` Mike Rapoport
2020-10-26  9:31       ` Mike Rapoport
2020-10-26  1:13 ` [PATCH 0/4] arch, mm: improve robustness of direct map manipulation Edgecombe, Rick P
2020-10-26  1:13   ` Edgecombe, Rick P
2020-10-26  1:13   ` Edgecombe, Rick P
2020-10-26  1:13   ` Edgecombe, Rick P
2020-10-26  1:13   ` Edgecombe, Rick P
2020-10-26  1:13   ` Edgecombe, Rick P
2020-10-26  9:05   ` Mike Rapoport
2020-10-26  9:05     ` Mike Rapoport
2020-10-26  9:05     ` Mike Rapoport
2020-10-26  9:05     ` Mike Rapoport
2020-10-26  9:05     ` Mike Rapoport
2020-10-26  9:05     ` Mike Rapoport
2020-10-26 18:05     ` Edgecombe, Rick P [this message]
2020-10-26 18:05       ` Edgecombe, Rick P
2020-10-26 18:05       ` Edgecombe, Rick P
2020-10-26 18:05       ` Edgecombe, Rick P
2020-10-26 18:05       ` Edgecombe, Rick P
2020-10-26 18:05       ` Edgecombe, Rick P
2020-10-27  8:38       ` Mike Rapoport
2020-10-27  8:38         ` Mike Rapoport
2020-10-27  8:38         ` Mike Rapoport
2020-10-27  8:38         ` Mike Rapoport
2020-10-27  8:38         ` Mike Rapoport
2020-10-27  8:38         ` Mike Rapoport
2020-10-27  8:46         ` David Hildenbrand
2020-10-27  8:46           ` David Hildenbrand
2020-10-27  8:46           ` David Hildenbrand
2020-10-27  8:46           ` David Hildenbrand
2020-10-27  8:46           ` David Hildenbrand
2020-10-27  8:46           ` David Hildenbrand
2020-10-27  9:47           ` Mike Rapoport
2020-10-27  9:47             ` Mike Rapoport
2020-10-27  9:47             ` Mike Rapoport
2020-10-27  9:47             ` Mike Rapoport
2020-10-27  9:47             ` Mike Rapoport
2020-10-27  9:47             ` Mike Rapoport
2020-10-27 10:34             ` David Hildenbrand
2020-10-27 10:34               ` David Hildenbrand
2020-10-27 10:34               ` David Hildenbrand
2020-10-27 10:34               ` David Hildenbrand
2020-10-27 10:34               ` David Hildenbrand
2020-10-27 10:34               ` David Hildenbrand
2020-10-28 11:09           ` Mike Rapoport
2020-10-28 11:09             ` Mike Rapoport
2020-10-28 11:09             ` Mike Rapoport
2020-10-28 11:09             ` Mike Rapoport
2020-10-28 11:09             ` Mike Rapoport
2020-10-28 11:09             ` Mike Rapoport
2020-10-28 11:17             ` David Hildenbrand
2020-10-28 11:17               ` David Hildenbrand
2020-10-28 11:17               ` David Hildenbrand
2020-10-28 11:17               ` David Hildenbrand
2020-10-28 11:17               ` David Hildenbrand
2020-10-28 11:17               ` David Hildenbrand
2020-10-28 12:22               ` Mike Rapoport
2020-10-28 12:22                 ` Mike Rapoport
2020-10-28 12:22                 ` Mike Rapoport
2020-10-28 12:22                 ` Mike Rapoport
2020-10-28 12:22                 ` Mike Rapoport
2020-10-28 12:22                 ` Mike Rapoport
2020-10-28 18:31             ` Edgecombe, Rick P
2020-10-28 18:31               ` Edgecombe, Rick P
2020-10-28 18:31               ` Edgecombe, Rick P
2020-10-28 18:31               ` Edgecombe, Rick P
2020-10-28 18:31               ` Edgecombe, Rick P
2020-10-28 18:31               ` Edgecombe, Rick P
2020-10-28 11:20         ` Will Deacon
2020-10-28 11:20           ` Will Deacon
2020-10-28 11:20           ` Will Deacon
2020-10-28 11:20           ` Will Deacon
2020-10-28 11:20           ` Will Deacon
2020-10-28 11:20           ` Will Deacon
2020-10-28 11:30           ` Mike Rapoport
2020-10-28 11:30             ` Mike Rapoport
2020-10-28 11:30             ` Mike Rapoport
2020-10-28 11:30             ` Mike Rapoport
2020-10-28 11:30             ` Mike Rapoport
2020-10-28 11:30             ` Mike Rapoport
2020-10-28 21:03             ` Edgecombe, Rick P
2020-10-28 21:03               ` Edgecombe, Rick P
2020-10-28 21:03               ` Edgecombe, Rick P
2020-10-28 21:03               ` Edgecombe, Rick P
2020-10-28 21:03               ` Edgecombe, Rick P
2020-10-28 21:03               ` Edgecombe, Rick P
2020-10-29  8:12               ` Mike Rapoport
2020-10-29  8:12                 ` Mike Rapoport
2020-10-29  8:12                 ` Mike Rapoport
2020-10-29  8:12                 ` Mike Rapoport
2020-10-29  8:12                 ` Mike Rapoport
2020-10-29  8:12                 ` Mike Rapoport
2020-10-29 23:19                 ` Edgecombe, Rick P
2020-10-29 23:19                   ` Edgecombe, Rick P
2020-10-29 23:19                   ` Edgecombe, Rick P
2020-10-29 23:19                   ` Edgecombe, Rick P
2020-10-29 23:19                   ` Edgecombe, Rick P
2020-10-29 23:19                   ` Edgecombe, Rick P
2020-10-29  8:15 ` David Hildenbrand
2020-10-29  8:15   ` David Hildenbrand
2020-10-29  8:15   ` David Hildenbrand
2020-10-29  8:15   ` David Hildenbrand
2020-10-29  8:15   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a0212b073b3b2f62c3dbf1bf398f03fa402997be.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=benh@kernel.crashing.org \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hpa@zytor.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=kirill@shutemov.name \
    --cc=len.brown@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=paulus@samba.org \
    --cc=pavel@ucw.cz \
    --cc=penberg@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rjw@rjwysocki.net \
    --cc=rppt@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.