From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from quartz.orcorp.ca (quartz.orcorp.ca [184.70.90.242])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by ml01.01.org (Postfix) with ESMTPS id 121AB2095B062
	for ; Tue, 10 Oct 2017 10:22:01 -0700 (PDT)
Date: Tue, 10 Oct 2017 11:25:16 -0600
From: Jason Gunthorpe
Subject: Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
Message-ID: <20171010172516.GA29915@obsidianresearch.com>
References: <150732931273.22363.8436792888326501071.stgit@dwillia2-desk3.amr.corp.intel.com>
	<150732935473.22363.1853399637339625023.stgit@dwillia2-desk3.amr.corp.intel.com>
	<20171009185840.GB15336@obsidianresearch.com>
	<20171009191820.GD15336@obsidianresearch.com>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: Dan Williams
Cc: Jan Kara, Ashok Raj, "Darrick J. Wong", linux-rdma@vger.kernel.org,
	Greg Kroah-Hartman, Joerg Roedel, "linux-nvdimm@lists.01.org",
	Dave Chinner, Robin Murphy, linux-xfs@vger.kernel.org, Linux MM,
	Linux API, linux-fsdevel, David Woodhouse, Christoph Hellwig,
	Marek Szyprowski

On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:

> > I don't think this has ever come up in the context of an all-device MR
> > invalidate requirement. Drivers already have code to invalidate
> > specific MRs, but to find all MRs that touch certain pages and then
> > invalidate them would be new code.
> >
> > We also have ODP aware drivers that can retarget a MR to new
> > physical pages. If the block map changes DAX should synchronously
> > retarget the ODP MR, not halt DMA.
>
> Have a look at the patch [1], I don't touch the ODP path.

But, does ODP work OK already? I'm not clear on that..

> > Most likely ODP & DAX would need to be used together to get robust
> > user applications, as having the user QP's go to an error state at
> > random times (due to DMA failures) during operation is never going to
> > be acceptable...
>
> It's not random. The process that set up the mapping and registered
> the memory gets SIGIO when someone else tries to modify the file map.
> That process then gets /proc/sys/fs/lease-break-time seconds to fix
> the problem before the kernel force revokes the DMA access.

Well, the process can't fix the problem in bounded time, so it is
random if it will fail or not.

MR life time is under the control of the remote side, and time to
complete the network exchanges required to release the MRs is hard to
bound. So even if I implement SIGIO properly my app will still likely
have random QP failures under various cases and work loads. :(

This is why ODP should be the focus because this cannot work fully
reliably otherwise..

> > Perhaps you might want to initially only support ODP MR mappings with
> > DAX and then the DMA fencing issue goes away?
>
> I'd rather try to fix the non-ODP DAX case instead of just turning it off.

Well, what about using SIGKILL if the lease-break-time hits? The
kernel will clean up the MRs when the process exits and this will
fence DMA to that memory.

But, still, if you really want to be fine grained, then I think
invalidating the impacted MR's is a better solution for RDMA than
trying to do it with the IOMMU...

Jason
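
The two fencing options raised at the end of the message (SIGKILL when the lease-break-time expires versus invalidating only the impacted MRs) differ mainly in how many MRs lose DMA access. A minimal sketch of that difference follows, as a user-space model in Python rather than kernel or driver code. The `MemoryRegion` class, the page-range representation, and both function names are hypothetical and exist only for illustration; they are not taken from the patch series or from any RDMA API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRegion:
    """Hypothetical stand-in for a registered MR covering file pages [start, end)."""
    name: str
    start_page: int
    end_page: int  # exclusive


def impacted_mrs(mrs, changed_start, changed_end):
    """Return the MRs whose page range overlaps [changed_start, changed_end)."""
    return [
        mr for mr in mrs
        if mr.start_page < changed_end and changed_start < mr.end_page
    ]


def fence_dma(mrs, changed_start, changed_end, fine_grained):
    """Return the names of the MRs that lose DMA access.

    fine_grained=True models invalidating only the impacted MRs.
    fine_grained=False models SIGKILL, where process exit tears down every MR.
    """
    if fine_grained:
        victims = impacted_mrs(mrs, changed_start, changed_end)
    else:
        victims = list(mrs)
    return [mr.name for mr in victims]


if __name__ == "__main__":
    regions = [
        MemoryRegion("mr0", 0, 16),
        MemoryRegion("mr1", 16, 32),
        MemoryRegion("mr2", 24, 64),
    ]
    # Assume the file's block map changes for pages 20..27.
    print(fence_dma(regions, 20, 28, fine_grained=True))
    print(fence_dma(regions, 20, 28, fine_grained=False))
```

In this model the fine-grained path leaves mr0 usable while the SIGKILL path drops all three regions. The linear scan is only for clarity; the message notes that finding all MRs that touch certain pages would be new driver code, and a real lookup would presumably need an indexed structure.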