From: Dan Williams <dan.j.williams@intel.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Jan Kara <jack@suse.cz>, Linux MM <linux-mm@kvack.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	John Hubbard <jhubbard@nvidia.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Jerome Glisse <jglisse@redhat.com>,
	Christopher Lameter <cl@linux.com>,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA
Date: Wed, 6 Feb 2019 18:48:18 -0800
Message-ID: <CAPcyv4i-sW9gu4nrRvvb24=uAUQms9=+Yx5=EQSj+CxpmoNkSw@mail.gmail.com>
In-Reply-To: <645c5e11b28ff10d354ae17ed3016bc895c9028b.camel@redhat.com>

On Wed, Feb 6, 2019 at 5:57 PM Doug Ledford <dledford@redhat.com> wrote:
[..]
> > > > Dave, you said the FS is responsible to arbitrate access to the
> > > > physical pages..
> > > >
> > > > Is it possible to have a filesystem for DAX that is more suited to
> > > > this environment? Ie designed to not require block reallocation (no
> > > > COW, no reflinks, different approach to ftruncate, etc)
> > >
> > > Can someone give me a real world scenario that someone is *actually*
> > > asking for with this?
> >
> > I'll point to this example. At the 6:35 mark Kodi talks about the
> > Oracle use case for DAX + RDMA.
> >
> > https://youtu.be/ywKPPIE8JfQ?t=395
>
> Thanks for the link, I'll review the panel.
>
> > Currently the only way to get this to work is to use ODP capable
> > hardware, or Device-DAX. Device-DAX is a facility to map persistent
> > memory statically through a device file. It's great for statically
> > allocated use cases, but loses all the nice things (provisioning,
> > permissions, naming) that a filesystem gives you. This debate is about
> > what to do for non-ODP capable hardware and the Filesystem-DAX
> > facility. The current answer is "no RDMA for you".
> >
> > > Are DAX users demanding xfs, or is it just the
> > > filesystem of convenience?
> >
> > xfs is the only Linux filesystem that supports DAX and reflink.
>
> Is it going to be clear from the link above why reflink + DAX + RDMA is
> a good/desirable thing?
>

No, unfortunately it will only clarify the DAX + RDMA use case, but
you don't need to look very far to see that the trend for storage
management is more COW / reflink / thin-provisioning etc in more
places. Users want the flexibility to be able to delay, change, and
consolidate physical storage allocation decisions, otherwise
device-dax would have solved all these problems and we would not be
having this conversation.

> > > Do they need to stick with xfs?
> >
> > Can you clarify the motivation for that question?
>
> I did a little googling and research before I asked that question.
> According to the documentation, other FSes can work with DAX too (namely
> ext2 and ext4).  The question was more or less pondering whether or not
> ext2 or ext4 + RDMA + DAX would solve people's problems without the
> issues that xfs brings.

No, ext4 also supports hole punch, and the ext2 support is a toy. We
went through quite a bit of work to solve this problem for the
O_DIRECT pinned page case.

6b2bb7265f0b sched/wait: Introduce wait_var_event()
d6dc57e251a4 xfs, dax: introduce xfs_break_dax_layouts()
69eb5fa10eb2 xfs: prepare xfs_break_layouts() for another layout type
c63a8eae63d3 xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
5fac7408d828 mm, fs, dax: handle layout changes to pinned dax mappings
b1f382178d15 ext4: close race between direct IO and ext4_break_layouts()
430657b6be89 ext4: handle layout changes to pinned DAX mappings
cdbf8897cb09 dax: dax_layout_busy_page() warn on !exceptional

So the fs is prepared to notify RDMA applications of the need to
evacuate a mapping (layout change), and the timeout to respond to that
notification can be configured by the administrator. The debate is
about what to do when the platform owner needs to get a mapping out of
the way in bounded time.
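
As a rough userspace analogy for that notify-then-timeout model, the
sketch below uses the existing fcntl(F_SETLEASE) file lease interface:
the holder gets SIGIO on a conflicting access, and the kernel waits up
to /proc/sys/fs/lease-break-time seconds before breaking the lease by
force. This is not the DAX layout-break path from the commits above,
and the function names (take_read_lease(), drop_lease(), and so on) are
made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for F_SETLEASE */
#endif
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set by the handler when the kernel asks the holder to give the lease up. */
static volatile sig_atomic_t lease_break_pending;

static void on_lease_break(int signo)
{
	(void)signo;
	lease_break_pending = 1;
}

/*
 * Take a read lease on fd, which must be open read-only. A conflicting
 * open-for-write or truncate by another process triggers SIGIO.
 * Returns 0 on success, -1 with errno set on failure.
 */
int take_read_lease(int fd)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_lease_break;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) < 0)
		return -1;
	return fcntl(fd, F_SETLEASE, F_RDLCK);
}

/* Give the lease back voluntarily, e.g. after tearing down mappings. */
int drop_lease(int fd)
{
	return fcntl(fd, F_SETLEASE, F_UNLCK);
}

/* Non-zero once a lease break has been signalled. */
int lease_break_requested(void)
{
	return lease_break_pending != 0;
}

/* Administrator-configured grace period in seconds, or -1 if unreadable. */
long read_lease_break_time(void)
{
	long secs = -1;
	FILE *f = fopen("/proc/sys/fs/lease-break-time", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &secs) != 1)
		secs = -1;
	fclose(f);
	return secs;
}
```

A holder that sees lease_break_requested() turn true has until the
grace period expires to stop using the file and call drop_lease().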

> >  This problem exists
> > for any filesystem that implements an mmap where the physical
> > page backing the mapping is identical to the physical storage location
> > for the file data. I don't see it as an xfs specific problem. Rather,
> > xfs is taking the lead in this space because it has already deployed
> > and demonstrated that leases work for the pnfs4 block-server case, so
> > it seems logical to attempt to extend that case for non-ODP-RDMA.
> >
> > > Are they
> > > really trying to do COW backed mappings for the RDMA targets?  Or do
> > > they want a COW backed FS but are perfectly happy if the specific RDMA
> > > targets are *not* COW and are statically allocated?
> >
> > I would expect the COW to be broken at registration time. Only ODP
> > could possibly support reflink + RDMA. So I think this devolves the
> > problem back to just the "what to do about truncate/punch-hole"
> > problem in the specific case of non-ODP hardware combined with the
> > Filesystem-DAX facility.
>
> If that's the case, then we are back to EBUSY *could* work (despite the
> objections made so far).

I linked it in my response to Jason [1], but the entire reason ext2,
ext4, and xfs scream "experimental" when DAX is enabled is that DAX
makes typical flows fail that used to work in the page-cache backed
mmap case. The failure of a data space management command like
fallocate(punch_hole) is riskier than just not allowing the memory
registration to happen in the first place. Leases result in a system
that has a chance at making forward progress.
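
To make the EBUSY concern concrete, here is a minimal sketch of the
punch-hole call as a storage management tool would issue it.
fallocate() and the FALLOC_FL_* flags are the real interface; treating
-EBUSY as "range pinned by an RDMA registration" is an assumption that
models the proposal being debated, not current kernel behavior, and
punch_hole() / classify_punch() are hypothetical names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() and FALLOC_FL_* */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Free the blocks backing [offset, offset + len) without changing the
 * file size. Returns 0 on success or a negative errno value.
 */
int punch_hole(int fd, off_t offset, off_t len)
{
	assert(len > 0);        /* a non-positive length is a caller bug */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, len) < 0)
		return -errno;
	return 0;
}

enum punch_result {
	PUNCH_DONE,             /* blocks were freed */
	PUNCH_UNSUPPORTED,      /* filesystem cannot punch holes */
	PUNCH_BLOCKED,          /* assumed: range is pinned, nothing freed */
	PUNCH_FAILED,           /* any other error */
};

/* Map punch_hole()'s return value onto the outcomes a tool must handle. */
enum punch_result classify_punch(int ret)
{
	switch (ret) {
	case 0:
		return PUNCH_DONE;
	case -EOPNOTSUPP:
		return PUNCH_UNSUPPORTED;
	case -EBUSY:
		return PUNCH_BLOCKED;
	default:
		return PUNCH_FAILED;
	}
}
```

A tool that gets PUNCH_BLOCKED has no obvious way forward other than
retrying, which is the failure mode described above.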

The current state of disallowing RDMA for FS-DAX is one of the "if
(dax) goto fail;" conditions that need to be solved before filesystem
developers graduate DAX from experimental status.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2019-February/019884.html
