Subject: Re: [PATCH RFCv2] mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault/prealloc memory
From: David Hildenbrand
To: "Kirill A. Shutemov"
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
    Arnd Bergmann, Michal Hocko, Oscar Salvador, Matthew Wilcox,
    Andrea Arcangeli, Minchan Kim, Jann Horn, Jason Gunthorpe, Dave Hansen,
    Hugh Dickins, Rik van Riel, "Michael S . Tsirkin", "Kirill A . Shutemov",
    Vlastimil Babka, Richard Henderson, Ivan Kokshaysky, Matt Turner,
    Thomas Bogendoerfer, "James E.J. Bottomley", Helge Deller, Chris Zankel,
    Max Filippov, Mike Kravetz, Peter Xu, Rolf Eike Beer,
    linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
    linux-parisc@vger.kernel.org, linux-xtensa@linux-xtensa.org,
    linux-arch@vger.kernel.org, Linux API
References: <20210308164520.18323-1-david@redhat.com>
    <20210315122213.k52wtlbbhsw42pks@box>
    <7d607d1c-efd5-3888-39bb-9e5f8bc08185@redhat.com>
    <20210315130353.iqnwsnp2c2wpt4y2@box>
Organization: Red Hat GmbH
Date: Mon, 15 Mar 2021 17:28:40 +0100
X-Mailing-List: linux-parisc@vger.kernel.org

On 15.03.21 14:26, David Hildenbrand wrote:
> On 15.03.21 14:03, Kirill A. Shutemov wrote:
>> On Mon, Mar 15, 2021 at 01:25:40PM +0100, David Hildenbrand wrote:
>>> On 15.03.21 13:22, Kirill A. Shutemov wrote:
>>>> On Mon, Mar 08, 2021 at 05:45:20PM +0100, David Hildenbrand wrote:
>>>>> +		case -EHWPOISON: /* Skip over any poisoned pages. */
>>>>> +			start += PAGE_SIZE;
>>>>> +			continue;
>>>>
>>>> Why is it a good approach? It's not obvious to me.
>>>
>>> My main motivation was to simplify return code handling. I don't want to
>>> return -EHWPOISON to user space
>>
>> Why? Hiding the problem under the rug doesn't help anybody. SIGBUS later
>> is not better than an error upfront.
>
> Well, if you think about "prefaulting page tables", the first intuition
> is certainly not to check for poisoned pages, right? After all, you are
> not actually accessing memory, you are allocating memory if required and
> fill page tables. OTOH, mlock() will also choke on poisoned pages.
>
> With the current semantics, you can start and run a VM just fine.
> Preallocation/prefaulting succeeded after all. On access you will get a
> SIGBUS, from which e.g., QEMU can recover by injecting an MCE into the
> guest - just like if you would hit a poisoned page later.
>
> The problem we are talking about is most probably very rare, especially
> when using MADV_POPULATE_ for actual preallocation.
>
> I don't have a strong opinion; not bailing out on poisoned pages felt
> like the right thing to do.

I'll switch to propagating -EHWPOISON. That matches how, e.g., mlock()
behaves: it does not ignore poisoned pages.

Thanks!
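
To illustrate what propagating -EHWPOISON would mean for user space, here
is a minimal sketch of a caller. It is not code from the patch. It assumes
the interface is used as madvise(addr, len, MADV_POPULATE_WRITE) and that a
poisoned page makes the call fail with errno set to EHWPOISON. The fallback
value 23 for MADV_POPULATE_WRITE is an assumption for older libc headers,
and the enum and the helpers classify_populate() and populate_write() are
hypothetical names introduced only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: fallback for libc headers that do not define the flag yet. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/* Hypothetical result type, introduced only for this sketch. */
enum populate_result {
	POPULATE_OK,       /* range was prefaulted writable */
	POPULATE_POISONED, /* a hardware-poisoned page was reported */
	POPULATE_INVALID,  /* EINVAL: unknown advice or bad arguments */
	POPULATE_FAILED    /* any other error */
};

/* Map a madvise() return value and the saved errno to a result. */
static enum populate_result classify_populate(int ret, int err)
{
	if (ret == 0)
		return POPULATE_OK;
#ifdef EHWPOISON
	/* Error upfront instead of a SIGBUS on a later access. */
	if (err == EHWPOISON)
		return POPULATE_POISONED;
#endif
	/* A kernel that does not know the advice also answers EINVAL. */
	if (err == EINVAL)
		return POPULATE_INVALID;
	return POPULATE_FAILED;
}

/* Prefault [addr, addr + len) writable; addr must be page-aligned. */
static enum populate_result populate_write(void *addr, size_t len)
{
	int ret, err;

	assert(addr != NULL && len > 0);

	ret = madvise(addr, len, MADV_POPULATE_WRITE);
	err = (ret == 0) ? 0 : errno; /* errno is only meaningful on failure */

	return classify_populate(ret, err);
}
```

A caller such as a VMM could treat POPULATE_POISONED as a startup error and
remap or skip the affected range, rather than waiting for a SIGBUS on first
access. On POPULATE_INVALID it could fall back to touching the pages by hand.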
-- 
Thanks,

David / dhildenb