lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 20/25] lustre: osc: osc: Do not flush on lockless cancel
Date: Mon,  2 Aug 2021 15:50:40 -0400	[thread overview]
Message-ID: <1627933851-7603-21-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1627933851-7603-1-git-send-email-jsimmons@infradead.org>

From: Patrick Farrell <pfarrell@whamcloud.com>

The cancellation of a an OSC lock without an LDLM lock
(a 'lockless' OSC lock) should not flush pages.  Only
direct i/o is allowed to use a lockless OSC lock, and
direct i/o does not create flushable pages.

DIO pages are not flushable because:
A) all synced ASAP, and
B) the OSC extents created for them are not added to the
extent tree which is used to track these pages.

Instead, this has the effect of trying to flush pages from
ongoing buffered i/o.  This can lead to crashes like the
following:

osc_cache_writeback_range()) ASSERTION(hp == 0 && discard == 0) failed

This assert essentially says the lock cancellation
(hp == 1) found an active i/o (an extent in the OES_ACTIVE
state).

This is not allowed because the flushing code assumes an
LDLM lock is being cancelled, which will only start once
there is no active i/o.  Because the OSC lock being
cancelled is not associated with an LDLM lock, this is not
true, and nothing prevents active i/o under a different
lock, leading to this assert.

The solution is simply to not flush pages when cancelling a
no-LDLM-lock OSC lock.

Additional note:
New lockless OSC locks cannot be created if they are
blocked by a regular OSC lock, but a new regular lock can
be created if there is a lockless lock present.

Thus, the sequence is something like this:
Direct i/o creates lockless OSC lock
Buffered i/o creates OSC and LDLM lock on the same range
Direct i/o finishes, starts cancelling its OSC lock
Buffered i/o is still ongoing, with extents in OES_ACTIVE

This results in the above crash during the OSC lock
cancellation.

Note it would be possible to resolve this issue by not
allowing lockless OSC locks to match regular OSC locks, but
this is not necessary, since there's no reason for lockless
locks to flush pages on cancellation.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14814
Lustre-commit: 6717c573ed90da91 ("LU-14814 osc: osc: Do not flush on lockless cancel")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44152
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/osc/osc_lock.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/fs/lustre/osc/osc_lock.c b/fs/lustre/osc/osc_lock.c
index f6faed7..eb3cb58 100644
--- a/fs/lustre/osc/osc_lock.c
+++ b/fs/lustre/osc/osc_lock.c
@@ -1134,16 +1134,8 @@ static void osc_lock_lockless_cancel(const struct lu_env *env,
 {
 	struct osc_lock *ols = cl2osc_lock(slice);
 	struct osc_object *osc = cl2osc(slice->cls_obj);
-	struct cl_lock_descr *descr = &slice->cls_lock->cll_descr;
-	int result;
 
 	LASSERT(!ols->ols_dlmlock);
-	result = osc_lock_flush(osc, descr->cld_start, descr->cld_end,
-				descr->cld_mode, false);
-	if (result)
-		CERROR("Pages for lockless lock %p were not purged(%d)\n",
-		       ols, result);
-
 	osc_lock_wake_waiters(env, osc, ols);
 }
 
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2021-08-02 19:52 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-02 19:50 [lustre-devel] [PATCH 00/25] Sync to OpenSFS tree as of Aug 2, 2021 James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 01/25] lustre: llite: avoid stale data reading James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 02/25] lustre: llite: No locked parallel DIO James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 03/25] lnet: discard lnet_current_net_count James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 04/25] lnet: convert kiblnd/ksocknal_thread_start to vararg James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 05/25] lnet: print device status in net show command James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 06/25] lustre: lmv: getattr_name("..") under striped directory James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 07/25] lustre: llite: revert 'simplify callback handling for async getattr' James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 08/25] lnet: Protect lpni deref in lnet_health_check James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 09/25] lustre: uapi: remove MDS_SETATTR_PORTAL and service James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 10/25] lustre: llite: Modify AIO/DIO reference counting James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 11/25] lustre: llite: Remove transient page counting James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 12/25] lustre: lov: Improve DIO submit James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 13/25] lustre: llite: Adjust dio refcounting James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 14/25] lustre: clio: Skip prep for transients James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 15/25] lustre: osc: Improve osc_queue_sync_pages James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 16/25] lustre: llite: avoid project quota overflow James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 17/25] lnet: check memdup_user_nul using IS_ERR James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 18/25] lustre: osc: Remove lockless truncate James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 19/25] lustre: osc: Remove client contention support James Simmons
2021-08-02 19:50 ` James Simmons [this message]
2021-08-02 19:50 ` [lustre-devel] [PATCH 21/26] lustre: pcc: add LCM_FL_PCC_RDONLY layout flag James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 21/25] lustre: update version to 2.14.53 James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 22/25] lustre: mdc: set default LMV on ROOT James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 22/26] lustre: update version to 2.14.53 James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 23/25] lustre: llite: enable filesystem-wide default LMV James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 23/26] lustre: mdc: set default LMV on ROOT James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 24/25] lnet: o2iblnd: clear fatal error on successful failover James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 24/26] lustre: llite: enable filesystem-wide default LMV James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 25/25] lnet: add "stats reset" to lnetctl James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 25/26] lnet: o2iblnd: clear fatal error on successful failover James Simmons
2021-08-02 19:50 ` [lustre-devel] [PATCH 26/26] lnet: add "stats reset" to lnetctl James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1627933851-7603-21-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).