From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Nov 2021 23:09:19 -0400
From: "Theodore Ts'o"
To: "Darrick J. Wong"
Cc: Linux NVDIMM, Mike Snitzer, linux-s390, linux-erofs@lists.ozlabs.org,
 Eric Sandeen, virtualization@lists.linux-foundation.org, linux-xfs,
 device-mapper development, linux-fsdevel, Dan Williams, linux-ext4,
 Ira Weiny, Christoph Hellwig
Subject: Re: futher decouple DAX from block devices
References: <20211018044054.1779424-1-hch@lst.de>
 <21ff4333-e567-2819-3ae0-6a2e83ec7ce6@sandeen.net>
 <20211104081740.GA23111@lst.de>
 <20211104173417.GJ2237511@magnolia>
 <20211104173559.GB31740@lst.de>
 <20211104190443.GK24333@magnolia>
In-Reply-To: <20211104190443.GK24333@magnolia>

On Thu, Nov 04, 2021 at 12:04:43PM -0700, Darrick J. Wong wrote:
> > Note that I've avoided implementing read/write fops for dax devices
> > partly out of concern for not wanting to figure out shared-mmap vs
> > write coherence issues, but also because of a bet with Dave Hansen
> > that device-dax not grow features like what happened to hugetlbfs. So
> > it would seem mkfs would need to switch to mmap I/O, or bite the
> > bullet and implement read/write fops in the driver.
>
> That ... would require a fair amount of userspace changes, though at
> least e2fsprogs has pluggable io drivers, which would make mmapping a
> character device not too awful.
>
> xfsprogs would be another story -- porting the buffer cache might not be
> too bad, but mkfs and repair seem to issue pread/pwrite calls directly.
> Note that xfsprogs explicitly screens out chardevs.

It's not just e2fsprogs and xfsprogs. There's also udev, blkid,
potentially systemd unit generators to kick off fsck runs, etc.
There are probably any number of user scripts which assume that file
systems are mounted on block devices --- for example, by looking at
the output of lsblk, etc.

Also note that block devices have O_EXCL support to provide locking
against attempts to run mkfs on a mounted file system. If you move
dax file systems to be mounted on a character mode device, that would
have to be replicated as well, etc. So I suspect that a large number
of subtle things would break, and I'd strongly recommend against going
down that path.
- Ted
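
For illustration only, here is a minimal Python sketch of the two userspace behaviours mentioned in the message: screening out character devices, and relying on O_EXCL to refuse a block device that is already in use. On Linux, open(2) with O_EXCL and without O_CREAT on a block device fails with EBUSY if the device is in use by the system (for example, mounted). The names device_kind and open_for_mkfs are hypothetical and are not taken from e2fsprogs or xfsprogs; the real tools are written in C and may behave differently.

```python
import errno
import os
import stat


def device_kind(path):
    """Return 'block', 'char', or 'other' for the node at path."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "char"
    return "other"


def open_for_mkfs(path):
    """Open path read/write the way a hypothetical mkfs-like tool might.

    Character devices are rejected outright. For block devices, O_EXCL
    is added so that Linux returns EBUSY when the device is in use.
    Anything else (e.g. an image file) is opened without O_EXCL.
    """
    kind = device_kind(path)
    if kind == "char":
        raise ValueError(f"{path}: character devices are not supported")

    flags = os.O_RDWR
    if kind == "block":
        flags |= os.O_EXCL

    try:
        return os.open(path, flags)
    except OSError as e:
        if e.errno == errno.EBUSY:
            # The kernel reports that the block device is already claimed.
            raise RuntimeError(f"{path}: device is in use") from e
        raise
```

A tool built this way gets the mounted-device check for free on block devices. Nothing equivalent is defined for character devices, so if dax file systems moved to a character device, this locking would have to be replicated some other way, which is the concern raised above.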