From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 32/32] lustre: ldlm: Prioritize blocking callbacks
Date: Wed,  3 Aug 2022 21:38:17 -0400
Message-ID: <1659577097-19253-33-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>

From: Patrick Farrell <pfarrell@whamcloud.com>

The current code places bl_ast lock callbacks at the end of
the global BL callback queue.  This is bad because urgent
requests from the server then wait behind non-urgent
background work, such as the cleanup that keeps lru_size at
the right level.

This can lead to evictions when the global queue holds a
large number of items, because the callback is not serviced
in a timely manner.

Put bl_ast callbacks on the priority queue so they do not
wait behind the background traffic.

Also add debug messages that report how many locks and work
items (blwis) are queued in the BL pool.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15821
Lustre-commit: 2d59294d52b696125 ("LU-15821 ldlm: Prioritize blocking callbacks")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47215
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/ldlm/ldlm_lockd.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)
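
For readers unfamiliar with the BL pool, the queueing idea can be sketched
in plain userspace C.  This is only an illustration under stated
assumptions, not the Lustre code: struct work_item, struct work_pool,
pool_add() and pool_get() are hypothetical names, the "urgent" flag stands
in for the ldlm_is_bl_ast()/ldlm_is_discard_data() checks, locking is
omitted, and both the "always drain the priority list first" dequeue
policy and the counter decrement on dequeue are assumptions that this
patch does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item; "urgent" stands in for the bl_ast or
 * discard_data check done in the patch.
 */
struct work_item {
	struct work_item *next;
	bool urgent;
	int nlocks;		/* locks covered by this item */
};

/* Minimal singly linked FIFO */
struct fifo {
	struct work_item *head;
	struct work_item *tail;
};

struct work_pool {
	struct fifo prio;	/* urgent, server-driven work */
	struct fifo regular;	/* background work, e.g. LRU trimming */
	int total_locks;	/* debug counters, as in the patch */
	int total_items;
};

static void fifo_push(struct fifo *q, struct work_item *item)
{
	item->next = NULL;
	if (q->tail)
		q->tail->next = item;
	else
		q->head = item;
	q->tail = item;
}

static struct work_item *fifo_pop(struct fifo *q)
{
	struct work_item *item = q->head;

	if (!item)
		return NULL;
	q->head = item->next;
	if (!q->head)
		q->tail = NULL;
	item->next = NULL;
	return item;
}

/* Queue an item; urgent items go on the priority FIFO. */
static void pool_add(struct work_pool *pool, struct work_item *item)
{
	assert(pool && item);
	fifo_push(item->urgent ? &pool->prio : &pool->regular, item);
	pool->total_locks += item->nlocks;
	pool->total_items++;
}

/* Assumed policy: serve the priority FIFO before the regular one.
 * Decrementing the counters here is also an assumption.
 */
static struct work_item *pool_get(struct work_pool *pool)
{
	struct work_item *item = fifo_pop(&pool->prio);

	if (!item)
		item = fifo_pop(&pool->regular);
	if (item) {
		pool->total_locks -= item->nlocks;
		pool->total_items--;
	}
	return item;
}
```

With this sketch, an urgent item added after a large background item is
still returned first by pool_get(), which is the behaviour the commit
message asks for.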

diff --git a/fs/lustre/ldlm/ldlm_lockd.c b/fs/lustre/ldlm/ldlm_lockd.c
index 04fe92e..9f89766 100644
--- a/fs/lustre/ldlm/ldlm_lockd.c
+++ b/fs/lustre/ldlm/ldlm_lockd.c
@@ -94,6 +94,8 @@ struct ldlm_bl_pool {
 	atomic_t		blp_busy_threads;
 	int			blp_min_threads;
 	int			blp_max_threads;
+	int			blp_total_locks;
+	int			blp_total_blwis;
 };
 
 struct ldlm_bl_work_item {
@@ -399,19 +401,39 @@ static int __ldlm_bl_to_thread(struct ldlm_bl_work_item *blwi,
 			       enum ldlm_cancel_flags cancel_flags)
 {
 	struct ldlm_bl_pool *blp = ldlm_state->ldlm_bl_pool;
+	char *prio = "regular";
+	int count;
 
 	spin_lock(&blp->blp_lock);
-	if (blwi->blwi_lock && ldlm_is_discard_data(blwi->blwi_lock)) {
-		/* add LDLM_FL_DISCARD_DATA requests to the priority list */
+	/* cannot access blwi after added to list and lock is dropped */
+	count = blwi->blwi_lock ? 1 : blwi->blwi_count;
+
+	/* if the server is waiting on a lock to be cancelled (bl_ast), this is
+	 * an urgent request and should go in the priority queue so it doesn't
+	 * get stuck behind non-priority work (eg, lru size management)
+	 *
+	 * We also prioritize discard_data, which is for eviction handling
+	 */
+	if (blwi->blwi_lock &&
+	    (ldlm_is_discard_data(blwi->blwi_lock) ||
+	     ldlm_is_bl_ast(blwi->blwi_lock))) {
 		list_add_tail(&blwi->blwi_entry, &blp->blp_prio_list);
+		prio = "priority";
 	} else {
 		/* other blocking callbacks are added to the regular list */
 		list_add_tail(&blwi->blwi_entry, &blp->blp_list);
 	}
+	blp->blp_total_locks += count;
+	blp->blp_total_blwis++;
 	spin_unlock(&blp->blp_lock);
 
 	wake_up(&blp->blp_waitq);
 
+	/* unlocked read of blp values is intentional - OK for debug */
+	CDEBUG(D_DLMTRACE,
+	       "added %d/%d locks to %s blp list, %d blwis in pool\n",
+	       count, blp->blp_total_locks, prio, blp->blp_total_blwis);
+
 	/*
 	 * Can not check blwi->blwi_flags as blwi could be already freed in
 	 * LCF_ASYNC mode
@@ -772,6 +794,17 @@ static int ldlm_bl_get_work(struct ldlm_bl_pool *blp,
 	spin_unlock(&blp->blp_lock);
 	*p_blwi = blwi;
 
+	/* intentional unlocked read of blp values - OK for debug */
+	if (blwi) {
+		CDEBUG(D_DLMTRACE,
+		       "Got %d locks of %d total in blp.  (%d blwis in pool)\n",
+		       blwi->blwi_lock ? 1 : blwi->blwi_count,
+		       blp->blp_total_locks, blp->blp_total_blwis);
+	} else {
+		CDEBUG(D_DLMTRACE,
+		       "No blwi found in queue (no bl locks in queue)\n");
+	}
+
 	return (*p_blwi || *p_exp) ? 1 : 0;
 }
 
@@ -1126,6 +1159,8 @@ static int ldlm_setup(void)
 	init_waitqueue_head(&blp->blp_waitq);
 	atomic_set(&blp->blp_num_threads, 0);
 	atomic_set(&blp->blp_busy_threads, 0);
+	blp->blp_total_locks = 0;
+	blp->blp_total_blwis = 0;
 
 	if (ldlm_num_threads == 0) {
 		blp->blp_min_threads = LDLM_NTHRS_INIT;
-- 
1.8.3.1
