From: Dan Williams
Date: Mon, 18 Oct 2021 21:26:24 -0700
Subject: Re: can we finally kill off CONFIG_FS_DAX_LIMITED
To: Jason Gunthorpe
Cc: Joao Martins, Gerald Schaefer, Christoph Hellwig, Heiko Carstens,
 Vasily Gorbik, Christian Borntraeger, Linux NVDIMM, linux-s390,
 Matthew Wilcox, Alex Sierra, "Kuehling, Felix", Linux MM,
 Ralph Campbell, Alistair Popple, Vishal Verma, Dave Jiang
In-Reply-To: <20211018233045.GQ2744544@nvidia.com>

On Mon, Oct 18, 2021 at 4:31 PM Jason Gunthorpe wrote:
>
> On Fri, Oct 15, 2021 at 01:22:41AM +0100, Joao Martins wrote:
>
> > dev_pagemap_mapping_shift() does a lookup to figure out
> > which order the page table entry represents. is_zone_device_page()
> > is already used to gate usage of dev_pagemap_mapping_shift(). I think
> > this might be an artifact of the same issue as 3) in which PMDs/PUDs
> > are represented with base pages and hence you can't do what the rest
> > of the world does with:
>
> This code looks broken as written.
>
> vma_address() relies on certain properties that maybe DAX (maybe
> even only FSDAX?) sets on its ZONE_DEVICE pages, and
> dev_pagemap_mapping_shift() does not handle the -EFAULT return. It
> will crash if a memory failure hits any other kind of ZONE_DEVICE
> area.

That case is gated with a TODO in memory_failure_dev_pagemap(). I
never got any response to queries about what to do about memory
failure vs HMM.

>
> I'm not sure the comment is correct anyhow:
>
>                 /*
>                  * Unmap the largest mapping to avoid breaking up
>                  * device-dax mappings which are constant size. The
>                  * actual size of the mapping being torn down is
>                  * communicated in siginfo, see kill_proc()
>                  */
>                 unmap_mapping_range(page->mapping, start, size, 0);
>
> Because for non PageAnon unmap_mapping_range() does either
> zap_huge_pud(), __split_huge_pmd(), or zap_huge_pmd().
>
> Despite its name __split_huge_pmd() does not actually split, it will
> call __split_huge_pmd_locked:
>
>         } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
>                 goto out;
>         __split_huge_pmd_locked(vma, pmd, range.start, freeze);
>
> Which does
>         if (!vma_is_anonymous(vma)) {
>                 old_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
>
> Which is a zap, not split.
>
> So I wonder if there is a reason to use anything other than 4k here
> for DAX?
>
> >         tk->size_shift = page_shift(compound_head(p));
> >
> > ... as page_shift() would just return PAGE_SHIFT (as compound_order() is 0).
>
> And what would be so wrong with memory failure doing this as a 4k
> page?

device-dax does not support misaligned mappings. It makes hard
guarantees for applications that cannot afford the page table
allocation overhead of sub-1GB mappings.