From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>, Ashok Raj <ashok.raj@intel.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-rdma@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Dave Chinner <david@fromorbit.com>,
	Robin Murphy <robin.murphy@arm.com>,
	linux-xfs@vger.kernel.org, Linux MM <linux-mm@kvack.org>,
	Linux API <linux-api@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	David Woodhouse <dwmw2@infradead.org>,
	Christoph Hellwig <hch@lst.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>
Subject: Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
Date: Thu, 12 Oct 2017 12:27:12 -0600
Message-ID: <20171012182712.GA5772@obsidianresearch.com>
In-Reply-To: <CAPcyv4gCBu5ptmWyof+Z-p7NbuCygEs2rMe2wdL0n3QQbXhrzA@mail.gmail.com>

On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:

> Also keep in mind that what triggers the lease break is another
> application trying to write or punch holes in a file that is mapped
> for RDMA. So, if the hardware can't handle the iommu mapping getting
> invalidated asynchronously and the application can't react in the
> lease break timeout period then the administrator should arrange for
> the file to not be written or truncated while it is mapped.

That makes sense, but why not return ENOSYS or something to the app
trying to alter the file if the RDMA hardware can't support this
instead of having the RDMA app deal with this lease break weirdness?

> It's already the case that get_user_pages() does not lock down file
> associations, so if your application is contending with these types of
> file changes it likely already has a problem keeping transactions in
> sync with the file state even without DAX.

Yes, things go weird in non-ODP RDMA cases like this.

Also, just to be clear, I would expect an app using the SIGIO interface
to basically halt ongoing RDMA, wait for MRs to become unused locally
and remotely, destroy the MRs, then somehow establish new MRs that
cover the same logical map (e.g. what ODP would do transparently) after
the lease breaker has made their changes, then restart their IO.

Does your SIGIO approach have a race-free way to do those last steps?
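
A minimal C sketch of the sequence described above, for illustration
only. The rdma_app structure, its fields, and the function names are
hypothetical and do not come from the patch set; the drain loop and the
mr_valid flag merely stand in for polling completions and for
ibv_dereg_mr() / ibv_reg_mr(). It assumes the lease break is delivered
as SIGIO. It does not resolve the race in question: nothing here tells
the app when the lease breaker has finished, so the point at which
resume_after_lease_break() may safely run is left open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical application state; not part of any real API. */
enum rdma_state { RDMA_RUNNING, RDMA_HALTED, RDMA_MR_DESTROYED };

struct rdma_app {
    enum rdma_state state;
    int inflight;   /* work requests still referencing the MR */
    bool mr_valid;  /* stands in for a registered MR */
};

static volatile sig_atomic_t lease_break_pending;

/* Only record the event; the real work happens outside the handler. */
static void on_sigio(int signo)
{
    (void)signo;
    lease_break_pending = 1;
}

int install_lease_break_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGIO, &sa, NULL);
}

/* Halt, drain, destroy. Returns true if a lease break was handled. */
bool handle_lease_break(struct rdma_app *app)
{
    if (!lease_break_pending)
        return false;
    lease_break_pending = 0;

    app->state = RDMA_HALTED;        /* stop posting new work */
    while (app->inflight > 0)        /* stand-in for waiting on local and remote users */
        app->inflight--;
    app->mr_valid = false;           /* stand-in for ibv_dereg_mr() */
    app->state = RDMA_MR_DESTROYED;
    return true;
}

/* Re-establish the MR and restart IO; when this is safe is the open question. */
void resume_after_lease_break(struct rdma_app *app)
{
    assert(app->state == RDMA_MR_DESTROYED);
    app->mr_valid = true;            /* stand-in for ibv_reg_mr() */
    app->state = RDMA_RUNNING;
}
```

The handler only sets a flag so that the teardown runs from the app's
main loop rather than from signal context.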

> > So, not being able to support DAX on certain RDMA hardware is not
> > an unreasonable situation in our space.
> 
> That makes sense, but it still seems to me that this proposed solution
> allows more than enough ways to avoid that worst case scenario where
> hardware reacts badly to iommu invalidation.

Yes, although I am concerned that returning PCI-E errors is such an
unusual and untested path for some of our RDMA drivers that they may
malfunction badly...

Again, going back to the question of who would ever use this, I would
be very reluctant to deploy a production configuration relying on the iommu
invalidate or SIGIO techniques, when ODP HW is available and works
flawlessly.

> be blacklisted from supporting DAX altogether. In other words this is
> a starting point to incrementally enhance or disable specific drivers,
> but with the assurance that the kernel can always do the safe thing
> when / if the driver is missing a finer grained solution.

Seems reasonable. I think existing HW will have an easier time adding
invalidate, while new hardware really should implement ODP.

Jason
