From: David Hildenbrand <david@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Michal Hocko <mhocko@suse.com>,
	Oscar Salvador <osalvador@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Minchan Kim <minchan@kernel.org>, Jann Horn <jannh@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@intel.com>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@surriel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Richard Henderson <rth@twiddle.net>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Matt Turner <mattst88@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	Helge Deller <deller@gmx.de>, Chris Zankel <chris@zankel.net>,
	Max Filippov <jcmvbkbc@gmail.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Peter Xu <peterx@redhat.com>,
	Rolf Eike Beer <eike-kernel@sf-tec.de>,
	linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org, Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH RFCv2] mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault/prealloc memory
Date: Mon, 15 Mar 2021 14:26:08 +0100
Message-ID: <e59d6301-6ba8-1d7f-5c15-60364eec3fe1@redhat.com>
In-Reply-To: <20210315130353.iqnwsnp2c2wpt4y2@box>

On 15.03.21 14:03, Kirill A. Shutemov wrote:
> On Mon, Mar 15, 2021 at 01:25:40PM +0100, David Hildenbrand wrote:
>> On 15.03.21 13:22, Kirill A. Shutemov wrote:
>>> On Mon, Mar 08, 2021 at 05:45:20PM +0100, David Hildenbrand wrote:
>>>> +			case -EHWPOISON: /* Skip over any poisoned pages. */
>>>> +				start += PAGE_SIZE;
>>>> +				continue;
>>>
>>> Why is it a good approach? It's not obvious to me.
>>
>> My main motivation was to simplify return code handling. I don't want to
>> return -EHWPOISON to user space
> 
> Why? Hiding the problem under the rug doesn't help anybody. SIGBUS later
> is not better than an error upfront.

Well, if you think about "prefaulting page tables", the first intuition
is certainly not to check for poisoned pages, right? After all, you are
not actually accessing memory; you are allocating memory if required and
filling page tables. OTOH, mlock() will also choke on poisoned pages.

With the current semantics, you can start and run a VM just fine.
Preallocation/prefaulting succeeded after all. On access you will get a
SIGBUS, from which, e.g., QEMU can recover by injecting an MCE into the
guest - just as if you hit a poisoned page later.
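
To make the user-space side concrete, here is a rough sketch of how a
caller might prefault a fresh anonymous mapping. It is an illustration
under assumptions, not code from the patch: the helper names
(prefault_writable, touch_pages, demo_prefault) are made up, the value
23 for MADV_POPULATE_WRITE is what mainline uapi headers use and may
not match this RFC, and treating EINVAL as "advice not supported" only
holds because the demo passes a page-aligned mmap() result.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: 23 is the mainline uapi value; this RFC may use another. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/* Prefault [addr, addr + len) writable; returns 0 or a negative errno. */
static int prefault_writable(void *addr, size_t len)
{
	return madvise(addr, len, MADV_POPULATE_WRITE) ? -errno : 0;
}

/* Fallback for kernels without the advice: write one byte per page. */
static void touch_pages(void *addr, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile char *p = addr;

	assert(page > 0);
	for (size_t off = 0; off < len; off += (size_t)page)
		p[off] = 0;	/* only safe because the demo mapping is fresh */
}

/* Hypothetical caller: map anonymous memory, prefault it, then use it. */
static int demo_prefault(size_t len)
{
	char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int ret;

	if (mem == MAP_FAILED)
		return -errno;

	ret = prefault_writable(mem, len);
	if (ret == -EINVAL) {	/* assumed: advice unknown to this kernel */
		touch_pages(mem, len);
		ret = 0;
	}

	/*
	 * Under the semantics discussed in this thread, ret == 0 does not
	 * rule out a skipped poisoned page; the first access to such a
	 * page would raise SIGBUS instead.
	 */
	if (ret == 0)
		mem[0] = 1;

	munmap(mem, len);
	return ret;
}
```

A caller that wants an error upfront would have to rely on the kernel
reporting it; with the skip-over behaviour, a SIGBUS handler is the
only place user space learns about the poisoned page.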

The problem we are talking about is most probably very rare, especially 
when using MADV_POPULATE_ for actual preallocation.

I don't have a strong opinion; not bailing out on poisoned pages felt 
like the right thing to do.
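
For comparison, the two behaviours under discussion can be modelled in
plain user-space C. This is only a toy model: it walks one page at a
time, and model_fault_page(), model_populate() and the poison_policy
enum are hypothetical names that do not exist in the patch. Only the
"case -EHWPOISON" arm mirrors the hunk quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* asm-generic value, for non-Linux builds only */
#endif

#define MODEL_PAGE_SIZE 4096UL

enum poison_policy {
	POISON_SKIP,	/* skip over poisoned pages, as in the quoted hunk */
	POISON_FAIL,	/* report -EHWPOISON upfront */
};

/* Hypothetical stand-in for faulting in a single page. */
static int model_fault_page(const bool *poisoned, size_t idx)
{
	return poisoned[idx] ? -EHWPOISON : 0;
}

/*
 * Walk npages pages. Returns 0 or a negative errno; *populated is the
 * number of pages that were actually faulted in.
 */
static int model_populate(const bool *poisoned, size_t npages,
			  enum poison_policy policy, size_t *populated)
{
	unsigned long start = 0;
	unsigned long end = npages * MODEL_PAGE_SIZE;

	assert(populated != NULL);
	*populated = 0;

	while (start < end) {
		int ret = model_fault_page(poisoned, start / MODEL_PAGE_SIZE);

		switch (ret) {
		case 0:
			(*populated)++;
			start += MODEL_PAGE_SIZE;
			continue;
		case -EHWPOISON:
			if (policy == POISON_SKIP) {
				/* Skip over any poisoned pages. */
				start += MODEL_PAGE_SIZE;
				continue;
			}
			return ret;
		default:
			return ret;
		}
	}
	return 0;
}
```

With one poisoned page out of four, POISON_SKIP returns 0 with three
pages populated, while POISON_FAIL stops at the poisoned page and
returns -EHWPOISON with the remaining pages left untouched.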

-- 
Thanks,

David / dhildenb

