Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: [PATCH v1 01/18] xprtrdma: Refresh the documenting comment in frwr_ops.c
Date: Tue, 06 Aug 2019 11:53:51 -0400
Message-ID: <20190806155351.9529.96877.stgit@manet.1015granger.net> (raw)
In-Reply-To: <20190806155246.9529.14571.stgit@manet.1015granger.net>

Things have changed since this comment was written. In particular,
the reworking of connection closing, on-demand creation of MRs, and
the removal of fr_state all mean that deferring MR recovery to
frwr_map is no longer needed. The description is obsolete.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/frwr_ops.c |   66 +++++++++++-----------------------------
 1 file changed, 18 insertions(+), 48 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 0b6dad7..a30f2ae 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -7,67 +7,37 @@
 /* Lightweight memory registration using Fast Registration Work
  * Requests (FRWR).
  *
- * FRWR features ordered asynchronous registration and deregistration
- * of arbitrarily sized memory regions. This is the fastest and safest
+ * FRWR features ordered asynchronous registration and invalidation
+ * of arbitrarily-sized memory regions. This is the fastest and safest
  * but most complex memory registration mode.
  */
 
 /* Normal operation
  *
- * A Memory Region is prepared for RDMA READ or WRITE using a FAST_REG
+ * A Memory Region is prepared for RDMA Read or Write using a FAST_REG
  * Work Request (frwr_map). When the RDMA operation is finished, this
  * Memory Region is invalidated using a LOCAL_INV Work Request
- * (frwr_unmap_sync).
+ * (frwr_unmap_async and frwr_unmap_sync).
  *
- * Typically these Work Requests are not signaled, and neither are RDMA
- * SEND Work Requests (with the exception of signaling occasionally to
- * prevent provider work queue overflows). This greatly reduces HCA
+ * Typically FAST_REG Work Requests are not signaled, and neither are
+ * RDMA Send Work Requests (with the exception of signaling occasionally
+ * to prevent provider work queue overflows). This greatly reduces HCA
  * interrupt workload.
- *
- * As an optimization, frwr_unmap marks MRs INVALID before the
- * LOCAL_INV WR is posted. If posting succeeds, the MR is placed on
- * rb_mrs immediately so that no work (like managing a linked list
- * under a spinlock) is needed in the completion upcall.
- *
- * But this means that frwr_map() can occasionally encounter an MR
- * that is INVALID but the LOCAL_INV WR has not completed. Work Queue
- * ordering prevents a subsequent FAST_REG WR from executing against
- * that MR while it is still being invalidated.
  */
 
 /* Transport recovery
  *
- * ->op_map and the transport connect worker cannot run at the same
- * time, but ->op_unmap can fire while the transport connect worker
- * is running. Thus MR recovery is handled in ->op_map, to guarantee
- * that recovered MRs are owned by a sending RPC, and not one where
- * ->op_unmap could fire at the same time transport reconnect is
- * being done.
- *
- * When the underlying transport disconnects, MRs are left in one of
- * four states:
- *
- * INVALID:	The MR was not in use before the QP entered ERROR state.
- *
- * VALID:	The MR was registered before the QP entered ERROR state.
- *
- * FLUSHED_FR:	The MR was being registered when the QP entered ERROR
- *		state, and the pending WR was flushed.
- *
- * FLUSHED_LI:	The MR was being invalidated when the QP entered ERROR
- *		state, and the pending WR was flushed.
- *
- * When frwr_map encounters FLUSHED and VALID MRs, they are recovered
- * with ib_dereg_mr and then are re-initialized. Because MR recovery
- * allocates fresh resources, it is deferred to a workqueue, and the
- * recovered MRs are placed back on the rb_mrs list when recovery is
- * complete. frwr_map allocates another MR for the current RPC while
- * the broken MR is reset.
- *
- * To ensure that frwr_map doesn't encounter an MR that is marked
- * INVALID but that is about to be flushed due to a previous transport
- * disconnect, the transport connect worker attempts to drain all
- * pending send queue WRs before the transport is reconnected.
+ * frwr_map and frwr_unmap_* cannot run at the same time the transport
+ * connect worker is running. The connect worker holds the transport
+ * send lock, just as ->send_request does. This prevents frwr_map and
+ * the connect worker from running concurrently. When a connection is
+ * closed, the Receive completion queue is drained before the allowing
+ * the connect worker to get control. This prevents frwr_unmap and the
+ * connect worker from running concurrently.
+ *
+ * When the underlying transport disconnects, MRs that are in flight
+ * are flushed and are likely unusable. Thus all flushed MRs are
+ * destroyed. New MRs are created on demand.
  */
 
 #include <linux/sunrpc/rpc_rdma.h>


  reply index

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-06 15:53 [PATCH v1 00/18] NFS/RDMA patches Chuck Lever
2019-08-06 15:53 ` Chuck Lever [this message]
2019-08-06 15:53 ` [PATCH v1 02/18] xprtrdma: Fix calculation of ri_max_segs again Chuck Lever
2019-08-06 15:54 ` [PATCH v1 03/18] xprtrdma: Boost maximum transport header size Chuck Lever
2019-08-06 15:54 ` [PATCH v1 04/18] xprtrdma: Boost client's max slot table size to match Linux server Chuck Lever
2019-08-06 15:54 ` [PATCH v1 05/18] xprtrdma: Rename CQE field in Receive trace points Chuck Lever
2019-08-06 15:54 ` [PATCH v1 06/18] xprtrdma: Rename rpcrdma_buffer::rb_all Chuck Lever
2019-08-06 15:54 ` [PATCH v1 07/18] xprtrdma: Toggle XPRT_CONGESTED in xprtrdma's slot methods Chuck Lever
2019-08-06 15:54 ` [PATCH v1 08/18] xprtrdma: Simplify rpcrdma_mr_pop Chuck Lever
2019-08-06 15:54 ` [PATCH v1 09/18] xprtrdma: Combine rpcrdma_mr_put and rpcrdma_mr_unmap_and_put Chuck Lever
2019-08-06 15:54 ` [PATCH v1 10/18] xprtrdma: Move rpcrdma_mr_get out of frwr_map Chuck Lever
2019-08-06 15:54 ` [PATCH v1 11/18] xprtrdma: Ensure creating an MR does not trigger FS writeback Chuck Lever
2019-08-06 15:54 ` [PATCH v1 12/18] xprtrdma: Cache free MRs in each rpcrdma_req Chuck Lever
2019-08-06 15:54 ` [PATCH v1 13/18] xprtrdma: Remove rpcrdma_buffer::rb_mrlock Chuck Lever
2019-08-06 15:55 ` [PATCH v1 14/18] xprtrdma: Use an llist to manage free rpcrdma_reps Chuck Lever
2019-08-06 15:55 ` [PATCH v1 15/18] xprtrdma: Clean up xprt_rdma_set_connect_timeout() Chuck Lever
2019-08-06 15:55 ` [PATCH v1 16/18] xprtdma: Fix bc_max_slots return value Chuck Lever
2019-08-06 15:55 ` [PATCH v1 17/18] xprtrdma: Inline XDR chunk encoder functions Chuck Lever
2019-08-06 15:55 ` [PATCH v1 18/18] xprtrdma: Optimize rpcrdma_post_recvs() Chuck Lever

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190806155351.9529.96877.stgit@manet.1015granger.net \
    --to=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org linux-nfs@archiver.kernel.org
	public-inbox-index linux-nfs


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/ public-inbox