Subject: Re: [PATCH v1 5/7] mm: introduce page_offline_(begin|end|freeze|unfreeze) to synchronize setting PageOffline()
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, Andrew Morton, "Michael S. Tsirkin", Jason Wang, Alexey Dobriyan, Mike Rapoport, "Matthew Wilcox (Oracle)", Oscar Salvador, Roman Gushchin, Alex Shi, Steven Price, Mike Kravetz, Aili Yao, Jiri Bohac, "K. Y. Srinivasan", Haiyang Zhang, Stephen Hemminger, Wei Liu, Naoya Horiguchi, linux-hyperv@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
References: <20210429122519.15183-1-david@redhat.com> <20210429122519.15183-6-david@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Message-ID: <8650f764-8652-a82c-c54f-f67401c800e8@redhat.com>
Date: Wed, 5 May 2021 17:10:33 +0200
X-Mailing-List: linux-hyperv@vger.kernel.org

On 05.05.21 15:24, Michal Hocko wrote:
> On Thu 29-04-21 14:25:17, David Hildenbrand wrote:
>> A driver might set a page logically offline -- PageOffline() -- and
>> turn the page inaccessible in the hypervisor; after that, access to page
>> content can be fatal. One example is virtio-mem; while unplugged memory
>> -- marked as PageOffline() can currently be read in the hypervisor, this
>> will no longer be the case in the future; for example, when having
>> a virtio-mem device backed by huge pages in the hypervisor.
>>
>> Some special PFN walkers -- i.e., /proc/kcore -- read content of random
>> pages after checking PageOffline(); however, these PFN walkers can race
>> with drivers that set PageOffline().
>>
>> Let's introduce page_offline_(begin|end|freeze|unfreeze) for
>> synchronizing.
>>
>> page_offline_freeze()/page_offline_unfreeze() allows for a subsystem to
>> synchronize with such drivers, achieving that a page cannot be set
>> PageOffline() while frozen.
>>
>> page_offline_begin()/page_offline_end() is used by drivers that care about
>> such races when setting a page PageOffline().
>>
>> For simplicity, use a rwsem for now; neither drivers nor users are
>> performance sensitive.
>
> Please add a note to the PageOffline documentation as well. While are
> adding the api close enough an explicit note there wouldn't hurt.

Will do.

>
>> Signed-off-by: David Hildenbrand
>
> As to the patch itself, I am slightly worried that other pfn walkers
> might be less tolerant to the locking than the proc ones. On the other
> hand most users shouldn't really care as they do not tend to touch the
> memory content and PageOffline check without any synchronization should
> be sufficient for those. Let's try this out and see where we get...

My thinking. Users that actually read random page content (as discussed
in the cover letter) are

1. Hibernation
2. Dumping (/proc/kcore, /proc/vmcore)
3. Physical memory access bypassing the kernel via /dev/mem
4. Live debug tools (kgdb)

Other PFN walkers really shouldn't (and don't) access random page content.

Thanks!

-- 
Thanks,

David / dhildenb