linux-cifs.vger.kernel.org archive mirror
From: Rohith Surabattula <rohiths.msft@gmail.com>
To: Tom Talpey <tom@talpey.com>
Cc: "Shyam Prasad N" <nspmangalore@gmail.com>,
	linux-cifs <linux-cifs@vger.kernel.org>,
	"Steve French" <smfrench@gmail.com>,
	"Pavel Shilovsky" <piastryyy@gmail.com>,
	sribhat.msa@outlook.com,
	"ronnie sahlberg" <ronniesahlberg@gmail.com>,
	"Aurélien Aptel" <aaptel@suse.com>
Subject: Re: cifs: Deferred close for files
Date: Mon, 22 Mar 2021 22:37:34 +0530
Message-ID: <CACdtm0agzeVRiQg1zZjm=jFrf-gSQ-+YGc1Zm1xN1Pd9tJia4Q@mail.gmail.com>
In-Reply-To: <461d79c3-1b32-b0f8-157c-b5d4b25507d7@talpey.com>

[-- Attachment #1: Type: text/plain, Size: 5371 bytes --]

On 3/11/2021 8:47 AM, Shyam Prasad N wrote:
> Hi Rohith,
>
> The changes look good at a high level.
>
> Just a few points worth checking:
> 1. In cifs_open(), should be perform deferred close for certain cases
> like O_DIRECT? AFAIK, O_DIRECT is just a hint to the filesystem to
> perform less data caching. By deferring close, aren't we delaying
> flushing dirty pages? @Steve French ?

With O_DIRECT flag, data is not cached locally and will be sent to
server immediately.

> 2. I see that you're maintaining a new list of files for deferred
> closing. Since there could be a large number of such files for a big
> share with sufficient I/O, maybe we should think of a structure with
> faster lookups (rb trees?).
> I know we already have a bunch of linked lists in cifs.ko, and we need
> to review perf impact for all those lists. But this one sounds like a
> candidate for faster lookups.

Entries are added to this list only once a file is closed, and they
remain there for the acregmax interval.
So, will this affect performance, i.e. during lookups?
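
To make the lookup cost concrete, here is a minimal userspace sketch. It
is not the kernel code: struct fid_key, deferred_find(), deferred_add()
and deferred_del() are hypothetical names, a hand-rolled singly linked
list stands in for the kernel's list_head, and locking is left out. It
only mirrors the idea in the patch of matching an entry on netfid,
persistent_fid and volatile_fid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the three ids the patch compares. */
struct fid_key {
	uint16_t netfid;
	uint64_t persistent_fid;
	uint64_t volatile_fid;
};

struct deferred_entry {
	struct fid_key key;
	struct deferred_entry *next;
};

struct deferred_list {
	struct deferred_entry *head;
	size_t count;
};

static bool fid_equal(const struct fid_key *a, const struct fid_key *b)
{
	return a->netfid == b->netfid &&
	       a->persistent_fid == b->persistent_fid &&
	       a->volatile_fid == b->volatile_fid;
}

/* Linear scan: cost grows with the number of entries on the list. */
static struct deferred_entry *deferred_find(const struct deferred_list *l,
					    const struct fid_key *key)
{
	struct deferred_entry *e;

	assert(l && key);
	for (e = l->head; e; e = e->next)
		if (fid_equal(&e->key, key))
			return e;
	return NULL;
}

/* Add an entry unless one exists. Returns 0, or -1 if malloc fails. */
static int deferred_add(struct deferred_list *l, const struct fid_key *key)
{
	struct deferred_entry *e;

	if (deferred_find(l, key))
		return 0;
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->key = *key;
	e->next = l->head;	/* prepend; order does not affect lookup */
	l->head = e;
	l->count++;
	return 0;
}

/* Unlink and free the matching entry. Returns true if one was found. */
static bool deferred_del(struct deferred_list *l, const struct fid_key *key)
{
	struct deferred_entry **pp;

	assert(l && key);
	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (fid_equal(&(*pp)->key, key)) {
			struct deferred_entry *victim = *pp;

			*pp = victim->next;
			free(victim);
			l->count--;
			return true;
		}
	}
	return false;
}
```

Each find, add and delete walks the list, so the cost is linear in the
number of closes still inside the window. Whether that matters depends
on how many files a workload closes per acregmax interval, which this
sketch does not measure.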

> 3. Minor comment. Maybe change the field name oplock_deferred_close to
> oplock_break_received?

Addressed. Attached the patch.
>
> Regards,
> Shyam

> I wonder why the choice of 5 seconds? It seems to me a more natural
> interval on the order of one or more of
> - attribute cache timeout
> - lease time
> - echo_interval

Yes, this sounds good. I modified the patch to defer the close by the
attribute cache timeout (acregmax) interval.
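
As a rough illustration of the timing, here is a minimal userspace
sketch of the deferral window. It is a model under assumptions, not the
patch itself: struct handle, handle_close(), handle_tick() and
handle_reopen() are hypothetical names, time is an abstract counter
rather than jiffies, and a single bool stands in for "a caching oplock
is held".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum handle_state { HANDLE_OPEN, HANDLE_DEFERRED, HANDLE_CLOSED };

/* Hypothetical model of one server handle on the client side. */
struct handle {
	enum handle_state state;
	uint64_t close_deadline;	/* meaningful only while deferred */
	bool caching_granted;		/* stands in for "oplock is held" */
};

/* Application close: defer if caching is granted, else close at once. */
static void handle_close(struct handle *h, uint64_t now, uint64_t acregmax)
{
	assert(h && h->state == HANDLE_OPEN);
	if (h->caching_granted) {
		h->state = HANDLE_DEFERRED;
		h->close_deadline = now + acregmax;
	} else {
		h->state = HANDLE_CLOSED;
	}
}

/* Timer: the real close happens once the window has elapsed. */
static void handle_tick(struct handle *h, uint64_t now)
{
	assert(h);
	if (h->state == HANDLE_DEFERRED && now >= h->close_deadline)
		h->state = HANDLE_CLOSED;
}

/* Reopen: returns true if the deferred handle was reused. */
static bool handle_reopen(struct handle *h, uint64_t now)
{
	assert(h);
	handle_tick(h, now);
	if (h->state == HANDLE_DEFERRED && h->caching_granted) {
		h->state = HANDLE_OPEN;
		return true;
	}
	return false;	/* caller would need a fresh open */
}
```

With acregmax = 30, a close at time 10 gives a deadline of 40: a reopen
at time 20 reuses the handle, while a reopen at or after the deadline
does not.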

> Also, this wording is rather confusing:
>
> > When file is closed, SMB2 close request is not sent to server
> > immediately and is deferred for 5 seconds interval. When file is
> > reopened by same process for read or write, previous file handle
> > can be used until oplock is held.
>
> It's not a "previous" filehandle if it's open, and "can be used" is
> a rather passive statement. A better wording may be "the filehandle
> is reused".
>
> And, "until oplock is held" is similarly awkward. Do you mean "*if*
> an oplock is held", or "*provided" an oplock is held"?

> > When same file is reopened by another client during 5 second
> > interval, oplock break is sent by server and file is closed immediately
> > if reference count is zero or else oplock is downgraded.
>
> Only the second part of the sentence is relevant to the patch. Also,
> what about lease break?

Modified the patch.
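
The reworded rule (close immediately on a break if the reference count
is zero, otherwise downgrade) can be sketched roughly as follows. This
is a userspace model under assumptions, not the kernel implementation:
struct cached_file, enum cache_level and handle_break() are hypothetical
names, and oplock and lease breaks are treated alike.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical caching levels, ordered from weakest to strongest. */
enum cache_level {
	CACHE_NONE,
	CACHE_READ,
	CACHE_READ_HANDLE,
	CACHE_READ_HANDLE_WRITE
};

struct cached_file {
	unsigned int active_refs;	/* opens still in use by applications */
	bool closed;			/* close has been sent to the server */
	enum cache_level level;
};

enum break_action { BREAK_CLOSED, BREAK_DOWNGRADED };

/* Apply a break that asks the client to drop to new_level. */
static enum break_action handle_break(struct cached_file *f,
				      enum cache_level new_level)
{
	assert(f && !f->closed);
	if (f->active_refs == 0) {
		/* Only the deferred close keeps the handle alive: close now. */
		f->closed = true;
		f->level = CACHE_NONE;
		return BREAK_CLOSED;
	}
	/* Still in use: keep the handle, give up the stronger caching. */
	if (new_level < f->level)
		f->level = new_level;
	return BREAK_DOWNGRADED;
}
```

A file with no active references is closed as soon as the break arrives,
while a file that is still open elsewhere in the client keeps its handle
at the lower caching level.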

> What happens if the handle is durable or persistent, and the connection
> to the server times out? Is the handle recovered, then closed?

Do you mean when the session gets reconnected, the deferred handle
will be recovered and closed?

Regards,
Rohith

On Thu, Mar 11, 2021 at 11:25 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 3/11/2021 8:47 AM, Shyam Prasad N wrote:
> > Hi Rohith,
> >
> > The changes look good at a high level.
> >
> > Just a few points worth checking:
> > 1. In cifs_open(), should be perform deferred close for certain cases
> > like O_DIRECT? AFAIK, O_DIRECT is just a hint to the filesystem to
> > perform less data caching. By deferring close, aren't we delaying
> > flushing dirty pages? @Steve French ?
> > 2. I see that you're maintaining a new list of files for deferred
> > closing. Since there could be a large number of such files for a big
> > share with sufficient I/O, maybe we should think of a structure with
> > faster lookups (rb trees?).
> > I know we already have a bunch of linked lists in cifs.ko, and we need
> > to review perf impact for all those lists. But this one sounds like a
> > candidate for faster lookups.
> > 3. Minor comment. Maybe change the field name oplock_deferred_close to
> > oplock_break_received?
> >
> > Regards,
> > Shyam
>
> I wonder why the choice of 5 seconds? It seems to me a more natural
> interval on the order of one or more of
> - attribute cache timeout
> - lease time
> - echo_interval
>
> Also, this wording is rather confusing:
>
> > When file is closed, SMB2 close request is not sent to server
> > immediately and is deferred for 5 seconds interval. When file is
> > reopened by same process for read or write, previous file handle
> > can be used until oplock is held.
>
> It's not a "previous" filehandle if it's open, and "can be used" is
> a rather passive statement. A better wording may be "the filehandle
> is reused".
>
> And, "until oplock is held" is similarly awkward. Do you mean "*if*
> an oplock is held", or "*provided" an oplock is held"?
>
> > When same file is reopened by another client during 5 second
> > interval, oplock break is sent by server and file is closed immediately
> > if reference count is zero or else oplock is downgraded.
>
> Only the second part of the sentence is relevant to the patch. Also,
> what about lease break?
>
> What happens if the handle is durable or persistent, and the connection
> to the server times out? Is the handle recovered, then closed?
>
> Tom.
>
>
> > On Tue, Mar 9, 2021 at 2:41 PM Rohith Surabattula
> > <rohiths.msft@gmail.com> wrote:
> >>
> >> Hi All,
> >>
> >> Please find the attached patch which will defer the close to server.
> >> So, performance can be improved.
> >>
> >> i.e When file is open, write, close, open, read, close....
> >> As close is deferred and oplock is held, cache will not be invalidated
> >> and same handle can be used for second open.
> >>
> >> Please review the changes and let me know your thoughts.
> >>
> >> Regards,
> >> Rohith
> >
> >
> >

[-- Attachment #2: 0001-cifs-Deferred-close-for-files.patch --]
[-- Type: application/octet-stream, Size: 10855 bytes --]

From ca42cf0388c194e835e414b5811443b3a27de4de Mon Sep 17 00:00:00 2001
From: Rohith Surabattula <rohiths@microsoft.com>
Date: Mon, 8 Mar 2021 16:28:09 +0000
Subject: [PATCH] cifs: Deferred close for files

When a file is closed, the SMB2 close request is not sent to the server
immediately; it is deferred for the interval defined by acregmax. When
the file is reopened by the same process for read or write, the file
handle is reused if an oplock is held.

When the client receives an oplock/lease break, the file is closed
immediately if its reference count is zero; otherwise the oplock is
downgraded.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
---
 fs/cifs/cifsfs.c    | 13 +++++++++-
 fs/cifs/cifsglob.h  | 12 +++++++++
 fs/cifs/cifsproto.h |  8 ++++++
 fs/cifs/connect.c   |  1 +
 fs/cifs/file.c      | 63 +++++++++++++++++++++++++++++++++++++++++++--
 fs/cifs/misc.c      | 47 +++++++++++++++++++++++++++++++++
 6 files changed, 141 insertions(+), 3 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 099ad9f3660b..3a426efba94e 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -133,6 +133,7 @@ struct workqueue_struct	*cifsiod_wq;
 struct workqueue_struct	*decrypt_wq;
 struct workqueue_struct	*fileinfo_put_wq;
 struct workqueue_struct	*cifsoplockd_wq;
+struct workqueue_struct *deferredclose_wq;
 __u32 cifs_lock_secret;
 
 /*
@@ -1605,9 +1606,16 @@ init_cifs(void)
 		goto out_destroy_fileinfo_put_wq;
 	}
 
+	deferredclose_wq = alloc_workqueue("deferredclose",
+					   WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+	if (!deferredclose_wq) {
+		rc = -ENOMEM;
+		goto out_destroy_cifsoplockd_wq;
+	}
+
 	rc = cifs_fscache_register();
 	if (rc)
-		goto out_destroy_cifsoplockd_wq;
+		goto out_destroy_deferredclose_wq;
 
 	rc = cifs_init_inodecache();
 	if (rc)
@@ -1675,6 +1683,8 @@ init_cifs(void)
 	cifs_destroy_inodecache();
 out_unreg_fscache:
 	cifs_fscache_unregister();
+out_destroy_deferredclose_wq:
+	destroy_workqueue(deferredclose_wq);
 out_destroy_cifsoplockd_wq:
 	destroy_workqueue(cifsoplockd_wq);
 out_destroy_fileinfo_put_wq:
@@ -1709,6 +1719,7 @@ exit_cifs(void)
 	cifs_destroy_mids();
 	cifs_destroy_inodecache();
 	cifs_fscache_unregister();
+	destroy_workqueue(deferredclose_wq);
 	destroy_workqueue(cifsoplockd_wq);
 	destroy_workqueue(decrypt_wq);
 	destroy_workqueue(fileinfo_put_wq);
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 31fc8695abd6..48858e31b746 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1097,6 +1097,8 @@ struct cifs_tcon {
 #ifdef CONFIG_CIFS_SWN_UPCALL
 	bool use_witness:1; /* use witness protocol */
 #endif
+	struct list_head deferred_closes; /* list of deferred closes */
+	spinlock_t deferred_lock; /* protection on deferred list */
 };
 
 /*
@@ -1154,6 +1156,14 @@ struct cifs_pending_open {
 	__u32 oplock;
 };
 
+struct cifs_deferred_close {
+	struct list_head dlist;
+	struct tcon_link *tlink;
+	__u16  netfid;
+	__u64  persistent_fid;
+	__u64  volatile_fid;
+};
+
 /*
  * This info hangs off the cifsFileInfo structure, pointed to by llist.
  * This is used to track byte stream locks on the file
@@ -1248,6 +1258,7 @@ struct cifsFileInfo {
 	struct cifs_search_info srch_inf;
 	struct work_struct oplock_break; /* work for oplock breaks */
 	struct work_struct put; /* work for the final part of _put */
+	bool oplock_break_received; /* Flag to indicate oplock break */
 };
 
 struct cifs_io_parms {
@@ -1898,6 +1909,7 @@ extern struct workqueue_struct *cifsiod_wq;
 extern struct workqueue_struct *decrypt_wq;
 extern struct workqueue_struct *fileinfo_put_wq;
 extern struct workqueue_struct *cifsoplockd_wq;
+extern struct workqueue_struct *deferredclose_wq;
 extern __u32 cifs_lock_secret;
 
 extern mempool_t *cifs_mid_poolp;
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 75ce6f742b8d..a35b599d53a5 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -256,6 +256,14 @@ extern void cifs_add_pending_open_locked(struct cifs_fid *fid,
 					 struct tcon_link *tlink,
 					 struct cifs_pending_open *open);
 extern void cifs_del_pending_open(struct cifs_pending_open *open);
+
+extern bool cifs_is_deferred_close(struct cifsFileInfo *cfile,
+				struct cifs_deferred_close **dclose);
+
+extern void cifs_add_deferred_close(struct cifsFileInfo *cfile);
+
+extern void cifs_del_deferred_close(struct cifsFileInfo *cfile);
+
 extern struct TCP_Server_Info *cifs_get_tcp_session(struct smb3_fs_context *ctx);
 extern void cifs_put_tcp_session(struct TCP_Server_Info *server,
 				 int from_reconnect);
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index eec8a2052da2..2804634d4040 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2228,6 +2228,7 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 	tcon->nodelete = ctx->nodelete;
 	tcon->local_lease = ctx->local_lease;
 	INIT_LIST_HEAD(&tcon->pending_opens);
+	INIT_LIST_HEAD(&tcon->deferred_closes);
 
 	spin_lock(&cifs_tcp_ses_lock);
 	list_add(&tcon->tcon_list, &ses->tcon_list);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 26de4329d161..5c140dbe9a6b 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -321,6 +321,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
 	cfile->dentry = dget(dentry);
 	cfile->f_flags = file->f_flags;
 	cfile->invalidHandle = false;
+	cfile->oplock_break_received = false;
 	cfile->tlink = cifs_get_tlink(tlink);
 	INIT_WORK(&cfile->oplock_break, cifs_oplock_break);
 	INIT_WORK(&cfile->put, cifsFileInfo_put_work);
@@ -562,6 +563,17 @@ int cifs_open(struct inode *inode, struct file *file)
 			file->f_op = &cifs_file_direct_ops;
 	}
 
+	spin_lock(&tcon->deferred_lock);
+	/* Get the cached handle as SMB2 close is deferred */
+	rc = cifs_get_readable_path(tcon, full_path, &cfile);
+	if (rc == 0) {
+		file->private_data = cfile;
+		cifs_del_deferred_close(cfile);
+		spin_unlock(&tcon->deferred_lock);
+		goto out;
+	}
+	spin_unlock(&tcon->deferred_lock);
+
 	if (server->oplocks)
 		oplock = REQ_OPLOCK;
 	else
@@ -842,11 +854,45 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
 	return rc;
 }
 
+struct smb2_deferred_work {
+	struct delayed_work deferred;
+	struct cifsFileInfo *cfile;
+};
+
+void smb2_deferred_work_close(struct work_struct *work)
+{
+	struct smb2_deferred_work *dwork = container_of(work,
+			struct smb2_deferred_work, deferred.work);
+
+	spin_lock(&tlink_tcon(dwork->cfile->tlink)->deferred_lock);
+	cifs_del_deferred_close(dwork->cfile);
+	spin_unlock(&tlink_tcon(dwork->cfile->tlink)->deferred_lock);
+	_cifsFileInfo_put(dwork->cfile, true, false);
+}
+
 int cifs_close(struct inode *inode, struct file *file)
 {
+	struct smb2_deferred_work *dwork;
+	struct cifsInodeInfo *cinode = CIFS_I(inode);
+	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+
+	dwork = kmalloc(sizeof(struct smb2_deferred_work), GFP_KERNEL);
+
+	INIT_DELAYED_WORK(&dwork->deferred, smb2_deferred_work_close);
+
 	if (file->private_data != NULL) {
-		_cifsFileInfo_put(file->private_data, true, false);
+		dwork->cfile = file->private_data;
 		file->private_data = NULL;
+		if ((cinode->oplock == CIFS_CACHE_RHW_FLG) ||
+		    (cinode->oplock == CIFS_CACHE_RH_FLG)) {
+			spin_lock(&tlink_tcon(dwork->cfile->tlink)->deferred_lock);
+			cifs_add_deferred_close(dwork->cfile);
+			spin_unlock(&tlink_tcon(dwork->cfile->tlink)->deferred_lock);
+			/* Deferred close for files */
+			queue_delayed_work(deferredclose_wq, &dwork->deferred, cifs_sb->ctx->acregmax);
+		} else {
+			_cifsFileInfo_put(dwork->cfile, true, false);
+		}
 	}
 
 	/* return code from the ->release op is always ignored */
@@ -1943,7 +1989,8 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode,
 		if (fsuid_only && !uid_eq(open_file->uid, current_fsuid()))
 			continue;
 		if (OPEN_FMODE(open_file->f_flags) & FMODE_READ) {
-			if (!open_file->invalidHandle) {
+			if ((!open_file->invalidHandle) &&
+				(!open_file->oplock_break_received)) {
 				/* found a good file */
 				/* lock it so it will not be closed on us */
 				cifsFileInfo_get(open_file);
@@ -4746,6 +4793,8 @@ void cifs_oplock_break(struct work_struct *work)
 	struct TCP_Server_Info *server = tcon->ses->server;
 	int rc = 0;
 	bool purge_cache = false;
+	bool is_deferred = false;
+	struct cifs_deferred_close *dclose;
 
 	wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
 			TASK_UNINTERRUPTIBLE);
@@ -4793,6 +4842,16 @@ void cifs_oplock_break(struct work_struct *work)
 		cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
 	}
 	_cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
+	/*
+	 * When oplock break is received and there are no active
+	 * file handles but cached, then set the flag oplock_break_received.
+	 * So, new open will not use cached handle.
+	 */
+	spin_lock(&tlink_tcon(cfile->tlink)->deferred_lock);
+	is_deferred = cifs_is_deferred_close(cfile, &dclose);
+	if (is_deferred)
+		cfile->oplock_break_received = true;
+	spin_unlock(&tlink_tcon(cfile->tlink)->deferred_lock);
 	cifs_done_oplock_break(cinode);
 }
 
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index 82e176720ca6..298cc8b54857 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -136,6 +136,7 @@ tconInfoAlloc(void)
 	spin_lock_init(&ret_buf->stat_lock);
 	atomic_set(&ret_buf->num_local_opens, 0);
 	atomic_set(&ret_buf->num_remote_opens, 0);
+	spin_lock_init(&ret_buf->deferred_lock);
 
 	return ret_buf;
 }
@@ -672,6 +673,52 @@ cifs_add_pending_open(struct cifs_fid *fid, struct tcon_link *tlink,
 	spin_unlock(&tlink_tcon(open->tlink)->open_file_lock);
 }
 
+bool
+cifs_is_deferred_close(struct cifsFileInfo *cfile, struct cifs_deferred_close **pdclose)
+{
+	struct cifs_deferred_close *dclose;
+
+	list_for_each_entry(dclose, &tlink_tcon(cfile->tlink)->deferred_closes, dlist) {
+		if ((dclose->netfid == cfile->fid.netfid) &&
+			(dclose->persistent_fid == cfile->fid.persistent_fid) &&
+			(dclose->volatile_fid == cfile->fid.volatile_fid)) {
+			*pdclose = dclose;
+			return true;
+		}
+	}
+	return false;
+}
+
+void
+cifs_add_deferred_close(struct cifsFileInfo *cfile)
+{
+	bool is_deferred = false;
+	struct cifs_deferred_close *dclose;
+
+	is_deferred = cifs_is_deferred_close(cfile, &dclose);
+	if (is_deferred)
+		return;
+
+	dclose = kmalloc(sizeof(struct cifs_deferred_close), GFP_KERNEL);
+	dclose->tlink = cfile->tlink;
+	dclose->netfid = cfile->fid.netfid;
+	dclose->persistent_fid = cfile->fid.persistent_fid;
+	dclose->volatile_fid = cfile->fid.volatile_fid;
+	list_add_tail(&dclose->dlist, &tlink_tcon(dclose->tlink)->deferred_closes);
+}
+
+void
+cifs_del_deferred_close(struct cifsFileInfo *cfile)
+{
+	bool is_deferred = false;
+	struct cifs_deferred_close *dclose;
+
+	is_deferred = cifs_is_deferred_close(cfile, &dclose);
+	if (!is_deferred)
+		return;
+	list_del(&dclose->dlist);
+}
+
 /* parses DFS refferal V3 structure
  * caller is responsible for freeing target_nodes
  * returns:
-- 
2.25.1

