All of lore.kernel.org
 help / color / mirror / Atom feed
* Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
@ 2023-09-05 17:09 Brian Pardy
  2023-09-06  1:01 ` Bagas Sanjaya
  0 siblings, 1 reply; 25+ messages in thread
From: Brian Pardy @ 2023-09-05 17:09 UTC (permalink / raw)
  To: linux-cifs

My apologies if I do not have the bug report protocol correct in posting here.

I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
does not exist in 6.4.12 or other previous kernels (I have not tested
6.4.13). Almost immediately after mounting a CIFS share, the reported
load average on my system goes up by 2. At the time this occurs I see
two [cifsd-cfid-laundromat] kernel threads running the "D" state,
where they remain for the entire time the CIFS share is mounted. The
load will remain stable at 2 (otherwise idle) until the share is
unmounted, at which point the [cifsd-cfid-laundromat] threads
disappear and load drops back down to 0. This is easily reproducible
on my system, but I am not sure what to do to retrieve more useful
debugging information. If I mount two shares from this server, I get
four laundromat threads in "D" state and a sustained load average of
4.

The client is running Gentoo Linux, the server is a Seagate Personal
Cloud NAS running Samba 4.6.5. Mount options used are
"noperm,guest,vers=3.02". The CPUs do not actually appear to be
spinning, the reported load average appears incorrect as far as actual
CPU use is concerned.

I am happy to follow any instructions provided to gather more details
if I can help to track this down. Nothing that appears relevant
appears in syslog or dmesg output.

Thank you.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-05 17:09 Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state) Brian Pardy
@ 2023-09-06  1:01 ` Bagas Sanjaya
  2023-09-06 21:03   ` Brian Pardy
  2023-10-20 10:40   ` Linux regression tracking #update (Thorsten Leemhuis)
  0 siblings, 2 replies; 25+ messages in thread
From: Bagas Sanjaya @ 2023-09-06  1:01 UTC (permalink / raw)
  To: Brian Pardy, Linux CIFS
  Cc: Linux Kernel Mailing List, Linux Regressions, Steve French

[-- Attachment #1: Type: text/plain, Size: 1915 bytes --]

On Tue, Sep 05, 2023 at 01:09:05PM -0400, Brian Pardy wrote:
> My apologies if I do not have the bug report protocol correct in posting here.
> 
> I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
> does not exist in 6.4.12 or other previous kernels (I have not tested
> 6.4.13). Almost immediately after mounting a CIFS share, the reported
> load average on my system goes up by 2. At the time this occurs I see
> two [cifsd-cfid-laundromat] kernel threads running the "D" state,
> where they remain for the entire time the CIFS share is mounted. The
> load will remain stable at 2 (otherwise idle) until the share is
> unmounted, at which point the [cifsd-cfid-laundromat] threads
> disappear and load drops back down to 0. This is easily reproducible
> on my system, but I am not sure what to do to retrieve more useful
> debugging information. If I mount two shares from this server, I get
> four laundromat threads in "D" state and a sustained load average of
> 4.
> 
> The client is running Gentoo Linux, the server is a Seagate Personal
> Cloud NAS running Samba 4.6.5. Mount options used are
> "noperm,guest,vers=3.02". The CPUs do not actually appear to be
> spinning, the reported load average appears incorrect as far as actual
> CPU use is concerned.
> 
> I am happy to follow any instructions provided to gather more details
> if I can help to track this down. Nothing that appears relevant
> appears in syslog or dmesg output.

Thanks for the regression report. But if you want to get it fixed,
you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.

Anyway, I'm adding it to regzbot:

#regzbot ^introduced: v6.4..v6.5
#regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS

Bye!

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-06  1:01 ` Bagas Sanjaya
@ 2023-09-06 21:03   ` Brian Pardy
  2023-09-07  3:40     ` Linux regression tracking #update (Thorsten Leemhuis)
  2023-09-18 22:55     ` Brian Pardy
  2023-10-20 10:40   ` Linux regression tracking #update (Thorsten Leemhuis)
  1 sibling, 2 replies; 25+ messages in thread
From: Brian Pardy @ 2023-09-06 21:03 UTC (permalink / raw)
  To: Bagas Sanjaya, Linux CIFS
  Cc: Linux Kernel Mailing List, Linux Regressions, Steve French,
	Ronnie Sahlberg

Added committer Ronnie Sahlberg to CC.

On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> On Tue, Sep 05, 2023 at 01:09:05PM -0400, Brian Pardy wrote:
> > I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
> > does not exist in 6.4.12 or other previous kernels (I have not tested
> > 6.4.13). Almost immediately after mounting a CIFS share, the reported
> > load average on my system goes up by 2. At the time this occurs I see
> > two [cifsd-cfid-laundromat] kernel threads running the "D" state,
> > where they remain for the entire time the CIFS share is mounted. The
> > load will remain stable at 2 (otherwise idle) until the share is
> > unmounted, at which point the [cifsd-cfid-laundromat] threads
> > disappear and load drops back down to 0. This is easily reproducible
> > on my system, but I am not sure what to do to retrieve more useful
> > debugging information. If I mount two shares from this server, I get
> > four laundromat threads in "D" state and a sustained load average of
> > 4.
> >
> > The client is running Gentoo Linux, the server is a Seagate Personal
> > Cloud NAS running Samba 4.6.5. Mount options used are
> > "noperm,guest,vers=3.02". The CPUs do not actually appear to be
> > spinning, the reported load average appears incorrect as far as actual
> > CPU use is concerned.
>
> Thanks for the regression report. But if you want to get it fixed,
> you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.
>
> Anyway, I'm adding it to regzbot:
>
> #regzbot ^introduced: v6.4..v6.5
> #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS

Thank you for directing me to the bug-bisect documentation. Results below:

# git bisect bad
d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
commit d14de8067e3f9653cdef5a094176d00f3260ab20
Author: Ronnie Sahlberg <lsahlber@redhat.com>
Date:   Thu Jul 6 12:32:24 2023 +1000

    cifs: Add a laundromat thread for cached directories

    and drop cached directories after 30 seconds

    Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>

 fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/smb/client/cached_dir.h |  1 +
 2 files changed, 68 insertions(+)

I do not know what other debug info may be useful, but here is
/proc/[pid]/stack output for one of these threads in D state:

# cat /proc/17314/stack
[<0>] msleep+0x24/0x40
[<0>] cifs_cfids_laundromat_thread+0x5e/0x1c0 [cifs]
[<0>] kthread+0xc4/0xf0
[<0>] ret_from_fork+0x28/0x40
[<0>] ret_from_fork_asm+0x1b/0x30

I will provide any other details requested. Thank you.

#regzbot introduced: d14de8067e3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-06 21:03   ` Brian Pardy
@ 2023-09-07  3:40     ` Linux regression tracking #update (Thorsten Leemhuis)
  2023-09-18 22:55     ` Brian Pardy
  1 sibling, 0 replies; 25+ messages in thread
From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-09-07  3:40 UTC (permalink / raw)
  To: Linux CIFS; +Cc: Linux Kernel Mailing List, Linux Regressions, Bagas Sanjaya

[TLDR: This mail in primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 06.09.23 23:03, Brian Pardy wrote:
> Added committer Ronnie Sahlberg to CC.
> 
> On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>> On Tue, Sep 05, 2023 at 01:09:05PM -0400, Brian Pardy wrote:
>>> I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
>>> does not exist in 6.4.12 or other previous kernels (I have not tested
>>> 6.4.13). Almost immediately after mounting a CIFS share, the reported
>>> load average on my system goes up by 2. At the time this occurs I see
>>> two [cifsd-cfid-laundromat] kernel threads running the "D" state,
>>> where they remain for the entire time the CIFS share is mounted. The
>>> load will remain stable at 2 (otherwise idle) until the share is
>>> unmounted, at which point the [cifsd-cfid-laundromat] threads
>>> disappear and load drops back down to 0. This is easily reproducible
>>> on my system, but I am not sure what to do to retrieve more useful
>>> debugging information. If I mount two shares from this server, I get
>>> four laundromat threads in "D" state and a sustained load average of
>>> 4.
> [...]
> 
> Thank you for directing me to the bug-bisect documentation. Results below:
> 
> # git bisect bad
> d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
>

#regzbot ^introduced: d14de8067e3f9653cdef5a094176d00f3260a
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-06 21:03   ` Brian Pardy
  2023-09-07  3:40     ` Linux regression tracking #update (Thorsten Leemhuis)
@ 2023-09-18 22:55     ` Brian Pardy
  2023-09-19  3:00       ` Steve French
  2023-09-19  3:01       ` Steve French
  1 sibling, 2 replies; 25+ messages in thread
From: Brian Pardy @ 2023-09-18 22:55 UTC (permalink / raw)
  To: Bagas Sanjaya, Linux CIFS
  Cc: Linux Kernel Mailing List, Linux Regressions, Steve French

[RS removed from CC due to bounce message]

On Wed, Sep 6, 2023 at 5:03 PM Brian Pardy <brian.pardy@gmail.com> wrote:
> On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> > Thanks for the regression report. But if you want to get it fixed,
> > you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.
> >
> > Anyway, I'm adding it to regzbot:
> >
> > #regzbot ^introduced: v6.4..v6.5
> > #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS
>
> Thank you for directing me to the bug-bisect documentation. Results below:
>
> # git bisect bad
> d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
> commit d14de8067e3f9653cdef5a094176d00f3260ab20
> Author: Ronnie Sahlberg <lsahlber@redhat.com>
> Date:   Thu Jul 6 12:32:24 2023 +1000
>
>     cifs: Add a laundromat thread for cached directories
>
>     and drop cached directories after 30 seconds
>
>     Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
>     Signed-off-by: Steve French <stfrench@microsoft.com>
>
>  fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
>  fs/smb/client/cached_dir.h |  1 +
>  2 files changed, 68 insertions(+)

Is there any further information I can provide to aid in debugging
this issue? Should I just expect incorrect load average reporting when
a CIFS share is mounted on any kernel >6.5.0?

I'm not clear on the value or necessity of this "laundromat thread" -
everything worked as expected before it was added - shall I just patch
it out of my kernel builds going forward if there is no interest in
fixing it? Is a .config option to disable it possible?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-18 22:55     ` Brian Pardy
@ 2023-09-19  3:00       ` Steve French
  2023-09-19  5:36         ` Steve French
  2023-09-19  3:01       ` Steve French
  1 sibling, 1 reply; 25+ messages in thread
From: Steve French @ 2023-09-19  3:00 UTC (permalink / raw)
  To: Brian Pardy
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg

Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
event (at SDC this week) which is now going on - will let you know
what we find.

One obvious thing is that it probably isn't necessary for cases when
the server does not support directory leases, but we noticed another
problem as well.


On Mon, Sep 18, 2023 at 9:20 PM Brian Pardy <brian.pardy@gmail.com> wrote:
>
> [RS removed from CC due to bounce message]
>
> On Wed, Sep 6, 2023 at 5:03 PM Brian Pardy <brian.pardy@gmail.com> wrote:
> > On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> > > Thanks for the regression report. But if you want to get it fixed,
> > > you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.
> > >
> > > Anyway, I'm adding it to regzbot:
> > >
> > > #regzbot ^introduced: v6.4..v6.5
> > > #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS
> >
> > Thank you for directing me to the bug-bisect documentation. Results below:
> >
> > # git bisect bad
> > d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
> > commit d14de8067e3f9653cdef5a094176d00f3260ab20
> > Author: Ronnie Sahlberg <lsahlber@redhat.com>
> > Date:   Thu Jul 6 12:32:24 2023 +1000
> >
> >     cifs: Add a laundromat thread for cached directories
> >
> >     and drop cached directories after 30 seconds
> >
> >     Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> >     Signed-off-by: Steve French <stfrench@microsoft.com>
> >
> >  fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
> >  fs/smb/client/cached_dir.h |  1 +
> >  2 files changed, 68 insertions(+)
>
> Is there any further information I can provide to aid in debugging
> this issue? Should I just expect incorrect load average reporting when
> a CIFS share is mounted on any kernel >6.5.0?
>
> I'm not clear on the value or necessity of this "laundromat thread" -
> everything worked as expected before it was added - shall I just patch
> it out of my kernel builds going forward if there is no interest in
> fixing it? Is a .config option to disable it possible?



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-18 22:55     ` Brian Pardy
  2023-09-19  3:00       ` Steve French
@ 2023-09-19  3:01       ` Steve French
  1 sibling, 0 replies; 25+ messages in thread
From: Steve French @ 2023-09-19  3:01 UTC (permalink / raw)
  To: Brian Pardy
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Steve French

> I'm not clear on the value or necessity of this "laundromat thread"

This thread is used to clean up cached directories when directory
leases are supported (in those cases directory contents, and
information needed for revalidate of directory paths can be cached
safely).

On Mon, Sep 18, 2023 at 9:20 PM Brian Pardy <brian.pardy@gmail.com> wrote:
>
> [RS removed from CC due to bounce message]
>
> On Wed, Sep 6, 2023 at 5:03 PM Brian Pardy <brian.pardy@gmail.com> wrote:
> > On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> > > Thanks for the regression report. But if you want to get it fixed,
> > > you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.
> > >
> > > Anyway, I'm adding it to regzbot:
> > >
> > > #regzbot ^introduced: v6.4..v6.5
> > > #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS
> >
> > Thank you for directing me to the bug-bisect documentation. Results below:
> >
> > # git bisect bad
> > d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
> > commit d14de8067e3f9653cdef5a094176d00f3260ab20
> > Author: Ronnie Sahlberg <lsahlber@redhat.com>
> > Date:   Thu Jul 6 12:32:24 2023 +1000
> >
> >     cifs: Add a laundromat thread for cached directories
> >
> >     and drop cached directories after 30 seconds
> >
> >     Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> >     Signed-off-by: Steve French <stfrench@microsoft.com>
> >
> >  fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
> >  fs/smb/client/cached_dir.h |  1 +
> >  2 files changed, 68 insertions(+)
>
> Is there any further information I can provide to aid in debugging
> this issue? Should I just expect incorrect load average reporting when
> a CIFS share is mounted on any kernel >6.5.0?
>
> I'm not clear on the value or necessity of this "laundromat thread" -
> everything worked as expected before it was added - shall I just patch
> it out of my kernel builds going forward if there is no interest in
> fixing it? Is a .config option to disable it possible?



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19  3:00       ` Steve French
@ 2023-09-19  5:36         ` Steve French
  2023-09-19 13:21           ` Brian Pardy
  0 siblings, 1 reply; 25+ messages in thread
From: Steve French @ 2023-09-19  5:36 UTC (permalink / raw)
  To: Brian Pardy
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg,
	Shyam Prasad N, Bharath S M

[-- Attachment #1: Type: text/plain, Size: 2778 bytes --]

Does the attached patch help in your case?  It avoids starting the
laundromat thread for IPC shares (which cuts the number of the threads
in half for many cases) and also avoids starting them if the server
does not support directory leases (e.g. if Samba server instead of
Windows server).


On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
>
> Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
> event (at SDC this week) which is now going on - will let you know
> what we find.
>
> One obvious thing is that it probably isn't necessary for cases when
> the server does not support directory leases, but we noticed another
> problem as well.
>
>
> On Mon, Sep 18, 2023 at 9:20 PM Brian Pardy <brian.pardy@gmail.com> wrote:
> >
> > [RS removed from CC due to bounce message]
> >
> > On Wed, Sep 6, 2023 at 5:03 PM Brian Pardy <brian.pardy@gmail.com> wrote:
> > > On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> > > > Thanks for the regression report. But if you want to get it fixed,
> > > > you have to do your part: perform bisection. See Documentation/admin-guide/bug-bisect.rst in the kernel sources for how to do that.
> > > >
> > > > Anyway, I'm adding it to regzbot:
> > > >
> > > > #regzbot ^introduced: v6.4..v6.5
> > > > #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS
> > >
> > > Thank you for directing me to the bug-bisect documentation. Results below:
> > >
> > > # git bisect bad
> > > d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
> > > commit d14de8067e3f9653cdef5a094176d00f3260ab20
> > > Author: Ronnie Sahlberg <lsahlber@redhat.com>
> > > Date:   Thu Jul 6 12:32:24 2023 +1000
> > >
> > >     cifs: Add a laundromat thread for cached directories
> > >
> > >     and drop cached directories after 30 seconds
> > >
> > >     Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> > >     Signed-off-by: Steve French <stfrench@microsoft.com>
> > >
> > >  fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
> > >  fs/smb/client/cached_dir.h |  1 +
> > >  2 files changed, 68 insertions(+)
> >
> > Is there any further information I can provide to aid in debugging
> > this issue? Should I just expect incorrect load average reporting when
> > a CIFS share is mounted on any kernel >6.5.0?
> >
> > I'm not clear on the value or necessity of this "laundromat thread" -
> > everything worked as expected before it was added - shall I just patch
> > it out of my kernel builds going forward if there is no interest in
> > fixing it? Is a .config option to disable it possible?
>
>
>
> --
> Thanks,
>
> Steve



-- 
Thanks,

Steve

[-- Attachment #2: 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch --]
[-- Type: text/x-patch, Size: 4926 bytes --]

From a63586b4b0b12265097e874b42c7a2cdbbf4138b Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@microsoft.com>
Date: Tue, 19 Sep 2023 00:28:02 -0500
Subject: [PATCH] smb3: do not start laundromat thread when dir leases disabled

When no directory lease support, or for IPC shares where directories
can not be opened, do not start an unneeded laundromat thread for
that mount (it wastes resources).

Fixes: d14de8067e3f ("cifs: Add a laundromat thread for cached directories")
Signed-off-by: Steve French <stfrench@microsoft.com>
---
 fs/smb/client/cached_dir.c |  6 ++++++
 fs/smb/client/cifsglob.h   |  2 +-
 fs/smb/client/cifsproto.h  |  2 +-
 fs/smb/client/connect.c    |  9 +++++++--
 fs/smb/client/misc.c       | 16 ++++++++++------
 fs/smb/client/smb2pdu.c    |  5 ++++-
 6 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index b17f067e4ada..a7cbaa32005c 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -450,6 +450,9 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 {
 	struct cached_fids *cfids = tcon->cfids;
 	struct cached_fid *cfid, *q;
+
+	if (cfids == NULL)
+		return;
 	LIST_HEAD(entry);
 
 	spin_lock(&cfids->cfid_list_lock);
@@ -651,6 +654,9 @@ void free_cached_dirs(struct cached_fids *cfids)
 	struct cached_fid *cfid, *q;
 	LIST_HEAD(entry);
 
+	if (cfids == NULL)
+		return;
+
 	if (cfids->laundromat) {
 		kthread_stop(cfids->laundromat);
 		cfids->laundromat = NULL;
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 032d8716f671..f594fcc0e889 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1943,7 +1943,7 @@ require use of the stronger protocol */
  * cifsInodeInfo->lock_sem	cifsInodeInfo->llist		cifs_init_once
  *				->can_cache_brlcks
  * cifsInodeInfo->deferred_lock	cifsInodeInfo->deferred_closes	cifsInodeInfo_alloc
- * cached_fid->fid_mutex		cifs_tcon->crfid		tconInfoAlloc
+ * cached_fid->fid_mutex		cifs_tcon->crfid		tcon_info_alloc
  * cifsFileInfo->fh_mutex		cifsFileInfo			cifs_new_fileinfo
  * cifsFileInfo->file_info_lock	cifsFileInfo->count		cifs_new_fileinfo
  *				->invalidHandle			initiate_cifs_search
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index 7d8035846680..74c428d3c916 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -512,7 +512,7 @@ extern int CIFSSMBLogoff(const unsigned int xid, struct cifs_ses *ses);
 
 extern struct cifs_ses *sesInfoAlloc(void);
 extern void sesInfoFree(struct cifs_ses *);
-extern struct cifs_tcon *tconInfoAlloc(void);
+extern struct cifs_tcon *tcon_info_alloc(bool init_cached_dirs);
 extern void tconInfoFree(struct cifs_tcon *);
 
 extern int cifs_sign_rqst(struct smb_rqst *rqst, struct TCP_Server_Info *server,
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 687754791bf0..45e60c0d262c 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -1882,7 +1882,9 @@ cifs_setup_ipc(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		}
 	}
 
-	tcon = tconInfoAlloc();
+	/* no need to setup directory caching on IPC share, so pass in false */
+	tcon = tcon_info_alloc(false);
+
 	if (tcon == NULL)
 		return -ENOMEM;
 
@@ -2492,7 +2494,10 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		goto out_fail;
 	}
 
-	tcon = tconInfoAlloc();
+	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
+		tcon = tcon_info_alloc(true);
+	else
+		tcon = tcon_info_alloc(false);
 	if (tcon == NULL) {
 		rc = -ENOMEM;
 		goto out_fail;
diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
index 366b755ca913..ec23b9989b63 100644
--- a/fs/smb/client/misc.c
+++ b/fs/smb/client/misc.c
@@ -113,18 +113,22 @@ sesInfoFree(struct cifs_ses *buf_to_free)
 }
 
 struct cifs_tcon *
-tconInfoAlloc(void)
+tcon_info_alloc(bool dir_leases_enabled)
 {
 	struct cifs_tcon *ret_buf;
 
 	ret_buf = kzalloc(sizeof(*ret_buf), GFP_KERNEL);
 	if (!ret_buf)
 		return NULL;
-	ret_buf->cfids = init_cached_dirs();
-	if (!ret_buf->cfids) {
-		kfree(ret_buf);
-		return NULL;
-	}
+
+	if (dir_leases_enabled == true) {
+		ret_buf->cfids = init_cached_dirs();
+		if (!ret_buf->cfids) {
+			kfree(ret_buf);
+			return NULL;
+		}
+	} else
+		ret_buf->cfids = NULL;
 
 	atomic_inc(&tconInfoAllocCount);
 	ret_buf->status = TID_NEW;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 44d4943e9c56..c63b72ed7fe9 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -3878,7 +3878,10 @@ void smb2_reconnect_server(struct work_struct *work)
 		goto done;
 
 	/* allocate a dummy tcon struct used for reconnect */
-	tcon = tconInfoAlloc();
+	if (pserver->capabilities & SMB2_GLOBAL_CAP_LEASING)
+		tcon = tcon_info_alloc(true);
+	else
+		tcon = tcon_info_alloc(false);
 	if (!tcon) {
 		resched = true;
 		list_for_each_entry_safe(ses, ses2, &tmp_ses_list, rlist) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19  5:36         ` Steve French
@ 2023-09-19 13:21           ` Brian Pardy
  2023-09-19 16:38             ` Steve French
  0 siblings, 1 reply; 25+ messages in thread
From: Brian Pardy @ 2023-09-19 13:21 UTC (permalink / raw)
  To: Steve French
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg,
	Shyam Prasad N, Bharath S M

On Tue, Sep 19, 2023 at 1:36 AM Steve French <smfrench@gmail.com> wrote:
>
> Does the attached patch help in your case?  It avoids starting the
> laundromat thread for IPC shares (which cuts the number of the threads
> in half for many cases) and also avoids starting them if the server
> does not support directory leases (e.g. if Samba server instead of
> Windows server).

Hello,

I applied the 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch
you provided against the 6.5.3 kernel.

I can confirm that it resolves this issue - no laundromat threads are
created, and the reported load average is as expected, not falsely
high.

This appears to fully fix the issue in my case.  Thank you very much!

> On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
> >
> > Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
> > event (at SDC this week) which is now going on - will let you know
> > what we find.
> >
> > One obvious thing is that it probably isn't necessary for cases when
> > the server does not support directory leases, but we noticed another
> > problem as well.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19 13:21           ` Brian Pardy
@ 2023-09-19 16:38             ` Steve French
  2023-09-19 17:42               ` Brian Pardy
  2023-09-19 18:06               ` Tom Talpey
  0 siblings, 2 replies; 25+ messages in thread
From: Steve French @ 2023-09-19 16:38 UTC (permalink / raw)
  To: Brian Pardy
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg,
	Shyam Prasad N, Bharath S M

[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]

Minor updates (pointed out by Paulo) to patch. See attached.

On Tue, Sep 19, 2023 at 8:21 AM Brian Pardy <brian.pardy@gmail.com> wrote:
>
> On Tue, Sep 19, 2023 at 1:36 AM Steve French <smfrench@gmail.com> wrote:
> >
> > Does the attached patch help in your case?  It avoids starting the
> > laundromat thread for IPC shares (which cuts the number of the threads
> > in half for many cases) and also avoids starting them if the server
> > does not support directory leases (e.g. if Samba server instead of
> > Windows server).
>
> Hello,
>
> I applied the 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch
> you provided against the 6.5.3 kernel.
>
> I can confirm that it resolves this issue - no laundromat threads are
> created, and the reported load average is as expected, not falsely
> high.
>
> This appears to fully fix the issue in my case.  Thank you very much!
>
> > On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
> > >
> > > Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
> > > event (at SDC this week) which is now going on - will let you know
> > > what we find.
> > >
> > > One obvious thing is that it probably isn't necessary for cases when
> > > the server does not support directory leases, but we noticed another
> > > problem as well.



-- 
Thanks,

Steve

[-- Attachment #2: 0001-smb3-do-not-start-laundromat-thread-when-dir-leases.patch --]
[-- Type: text/x-patch, Size: 4954 bytes --]

From a82d40177a0273852fe89ae2edba9fd97b5d161e Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@microsoft.com>
Date: Tue, 19 Sep 2023 11:35:53 -0500
Subject: [PATCH] smb3: do not start laundromat thread when dir leases 
 disabled

When no directory lease support, or for IPC shares where directories
can not be opened, do not start an unneeded laundromat thread for
that mount (it wastes resources).

Fixes: d14de8067e3f ("cifs: Add a laundromat thread for cached directories")
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
---
 fs/smb/client/cached_dir.c |  6 ++++++
 fs/smb/client/cifsglob.h   |  2 +-
 fs/smb/client/cifsproto.h  |  2 +-
 fs/smb/client/connect.c    |  8 ++++++--
 fs/smb/client/misc.c       | 14 +++++++++-----
 fs/smb/client/smb2pdu.c    |  2 +-
 6 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index b17f067e4ada..e2be8aedb26e 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -452,6 +452,9 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 	struct cached_fid *cfid, *q;
 	LIST_HEAD(entry);
 
+	if (cfids == NULL)
+		return;
+
 	spin_lock(&cfids->cfid_list_lock);
 	list_for_each_entry_safe(cfid, q, &cfids->entries, entry) {
 		list_move(&cfid->entry, &entry);
@@ -651,6 +654,9 @@ void free_cached_dirs(struct cached_fids *cfids)
 	struct cached_fid *cfid, *q;
 	LIST_HEAD(entry);
 
+	if (cfids == NULL)
+		return;
+
 	if (cfids->laundromat) {
 		kthread_stop(cfids->laundromat);
 		cfids->laundromat = NULL;
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 032d8716f671..f594fcc0e889 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1943,7 +1943,7 @@ require use of the stronger protocol */
  * cifsInodeInfo->lock_sem	cifsInodeInfo->llist		cifs_init_once
  *				->can_cache_brlcks
  * cifsInodeInfo->deferred_lock	cifsInodeInfo->deferred_closes	cifsInodeInfo_alloc
- * cached_fid->fid_mutex		cifs_tcon->crfid		tconInfoAlloc
+ * cached_fid->fid_mutex		cifs_tcon->crfid		tcon_info_alloc
  * cifsFileInfo->fh_mutex		cifsFileInfo			cifs_new_fileinfo
  * cifsFileInfo->file_info_lock	cifsFileInfo->count		cifs_new_fileinfo
  *				->invalidHandle			initiate_cifs_search
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index 7d8035846680..0c37eefa18a5 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -512,7 +512,7 @@ extern int CIFSSMBLogoff(const unsigned int xid, struct cifs_ses *ses);
 
 extern struct cifs_ses *sesInfoAlloc(void);
 extern void sesInfoFree(struct cifs_ses *);
-extern struct cifs_tcon *tconInfoAlloc(void);
+extern struct cifs_tcon *tcon_info_alloc(bool dir_leases_enabled);
 extern void tconInfoFree(struct cifs_tcon *);
 
 extern int cifs_sign_rqst(struct smb_rqst *rqst, struct TCP_Server_Info *server,
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 687754791bf0..3902e90dca6b 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -1882,7 +1882,8 @@ cifs_setup_ipc(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		}
 	}
 
-	tcon = tconInfoAlloc();
+	/* no need to setup directory caching on IPC share, so pass in false */
+	tcon = tcon_info_alloc(false);
 	if (tcon == NULL)
 		return -ENOMEM;
 
@@ -2492,7 +2493,10 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		goto out_fail;
 	}
 
-	tcon = tconInfoAlloc();
+	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
+		tcon = tcon_info_alloc(true);
+	else
+		tcon = tcon_info_alloc(false);
 	if (tcon == NULL) {
 		rc = -ENOMEM;
 		goto out_fail;
diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
index 366b755ca913..35b176457bbe 100644
--- a/fs/smb/client/misc.c
+++ b/fs/smb/client/misc.c
@@ -113,18 +113,22 @@ sesInfoFree(struct cifs_ses *buf_to_free)
 }
 
 struct cifs_tcon *
-tconInfoAlloc(void)
+tcon_info_alloc(bool dir_leases_enabled)
 {
 	struct cifs_tcon *ret_buf;
 
 	ret_buf = kzalloc(sizeof(*ret_buf), GFP_KERNEL);
 	if (!ret_buf)
 		return NULL;
-	ret_buf->cfids = init_cached_dirs();
-	if (!ret_buf->cfids) {
-		kfree(ret_buf);
-		return NULL;
+
+	if (dir_leases_enabled == true) {
+		ret_buf->cfids = init_cached_dirs();
+		if (!ret_buf->cfids) {
+			kfree(ret_buf);
+			return NULL;
+		}
 	}
+	/* else ret_buf->cfids is already set to NULL above */
 
 	atomic_inc(&tconInfoAllocCount);
 	ret_buf->status = TID_NEW;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 44d4943e9c56..405ea324f28d 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -3878,7 +3878,7 @@ void smb2_reconnect_server(struct work_struct *work)
 		goto done;
 
 	/* allocate a dummy tcon struct used for reconnect */
-	tcon = tconInfoAlloc();
+	tcon = tcon_info_alloc(false);
 	if (!tcon) {
 		resched = true;
 		list_for_each_entry_safe(ses, ses2, &tmp_ses_list, rlist) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19 16:38             ` Steve French
@ 2023-09-19 17:42               ` Brian Pardy
  2023-09-19 18:06               ` Tom Talpey
  1 sibling, 0 replies; 25+ messages in thread
From: Brian Pardy @ 2023-09-19 17:42 UTC (permalink / raw)
  To: Steve French
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg,
	Shyam Prasad N, Bharath S M

On Tue, Sep 19, 2023 at 12:39 PM Steve French <smfrench@gmail.com> wrote:
>
> Minor updates (pointed out by Paulo) to patch. See attached.

I can't comment on the updates between the two versions of the patch,
but I can confirm that the second patch also works as expected to
resolve this issue on my system.

-Brian


> On Tue, Sep 19, 2023 at 8:21 AM Brian Pardy <brian.pardy@gmail.com> wrote:
> >
> > On Tue, Sep 19, 2023 at 1:36 AM Steve French <smfrench@gmail.com> wrote:
> > >
> > > Does the attached patch help in your case?  It avoids starting the
> > > laundromat thread for IPC shares (which cuts the number of the threads
> > > in half for many cases) and also avoids starting them if the server
> > > does not support directory leases (e.g. if Samba server instead of
> > > Windows server).
> >
> > Hello,
> >
> > I applied the 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch
> > you provided against the 6.5.3 kernel.
> >
> > I can confirm that it resolves this issue - no laundromat threads are
> > created, and the reported load average is as expected, not falsely
> > high.
> >
> > This appears to fully fix the issue in my case.  Thank you very much!
> >
> > > On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
> > > >
> > > > Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
> > > > event (at SDC this week) which is now going on - will let you know
> > > > what we find.
> > > >
> > > > One obvious thing is that it probably isn't necessary for cases when
> > > > the server does not support directory leases, but we noticed another
> > > > problem as well.
>
>
>
> --
> Thanks,
>
> Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19 16:38             ` Steve French
  2023-09-19 17:42               ` Brian Pardy
@ 2023-09-19 18:06               ` Tom Talpey
  2023-09-19 18:23                 ` Steve French
  1 sibling, 1 reply; 25+ messages in thread
From: Tom Talpey @ 2023-09-19 18:06 UTC (permalink / raw)
  To: Steve French, Brian Pardy
  Cc: Bagas Sanjaya, Linux CIFS, Linux Kernel Mailing List,
	Linux Regressions, Paulo Alcantara, ronnie sahlberg,
	Shyam Prasad N, Bharath S M

On 9/19/2023 9:38 AM, Steve French wrote:
> Minor updates (pointed out by Paulo) to patch. See attached.

So, was the thread crashing before??

+	if (cfids == NULL)
+		return;
+
  	spin_lock(&cfids->cfid_list_lock);


This if/else, IMO...

@@ -2492,7 +2493,10 @@ cifs_get_tcon(struct cifs_ses *ses, struct 
smb3_fs_context *ctx)
  		goto out_fail;
  	}

-	tcon = tconInfoAlloc();
+	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
+		tcon = tcon_info_alloc(true);
+	else
+		tcon = tcon_info_alloc(false);

would be more readable as...

	tcon = tcon_info_alloc(ses->server->capabilities & 
SMB2_GLOBAL_CAP_DIRECTORY_LEASING != 0);


These changes are good, but I'm skeptical they will reduce the load
when the laundromat thread is actually running. All these do is avoid 
creating it when not necessary, right?

Acked-by: Tom Talpey <tom@talpey.com>

Tom.


> 
> On Tue, Sep 19, 2023 at 8:21 AM Brian Pardy <brian.pardy@gmail.com> wrote:
>>
>> On Tue, Sep 19, 2023 at 1:36 AM Steve French <smfrench@gmail.com> wrote:
>>>
>>> Does the attached patch help in your case?  It avoids starting the
>>> laundromat thread for IPC shares (which cuts the number of the threads
>>> in half for many cases) and also avoids starting them if the server
>>> does not support directory leases (e.g. if Samba server instead of
>>> Windows server).
>>
>> Hello,
>>
>> I applied the 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch
>> you provided against the 6.5.3 kernel.
>>
>> I can confirm that it resolves this issue - no laundromat threads are
>> created, and the reported load average is as expected, not falsely
>> high.
>>
>> This appears to fully fix the issue in my case.  Thank you very much!
>>
>>> On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
>>>>
>>>> Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
>>>> event (at SDC this week) which is now going on - will let you know
>>>> what we find.
>>>>
>>>> One obvious thing is that it probably isn't necessary for cases when
>>>> the server does not support directory leases, but we noticed another
>>>> problem as well.
> 
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19 18:06               ` Tom Talpey
@ 2023-09-19 18:23                 ` Steve French
  2023-09-27  0:54                   ` Paul Aurich
  0 siblings, 1 reply; 25+ messages in thread
From: Steve French @ 2023-09-19 18:23 UTC (permalink / raw)
  To: Tom Talpey
  Cc: Brian Pardy, Bagas Sanjaya, Linux CIFS,
	Linux Kernel Mailing List, Linux Regressions, Paulo Alcantara,
	ronnie sahlberg, Shyam Prasad N, Bharath S M

On Tue, Sep 19, 2023 at 1:07 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 9/19/2023 9:38 AM, Steve French wrote:
> > Minor updates (pointed out by Paulo) to patch. See attached.
>
> So, was the thread crashing before??
>
> +       if (cfids == NULL)
> +               return;
> +

Without laundromat initialized cfids can be null - so we need to check
if cfids is initialized in a few places (may help in a few corner
cases if there is a race in closing laundromat thread at umount but
was added to avoid oops at unmount if laundromat not initialized)

> These changes are good, but I'm skeptical they will reduce the load
> when the laundromat thread is actually running. All these do is avoid
> creating it when not necessary, right?

It does create half as many laundromat threads (we don't need
laundromat on connection to IPC$) even for the Windows server target
example, but helps more for cases where server doesn't support
directory leases.

> > On Tue, Sep 19, 2023 at 8:21 AM Brian Pardy <brian.pardy@gmail.com> wrote:
> >>
> >> On Tue, Sep 19, 2023 at 1:36 AM Steve French <smfrench@gmail.com> wrote:
> >>>
> >>> Does the attached patch help in your case?  It avoids starting the
> >>> laundromat thread for IPC shares (which cuts the number of the threads
> >>> in half for many cases) and also avoids starting them if the server
> >>> does not support directory leases (e.g. if Samba server instead of
> >>> Windows server).
> >>
> >> Hello,
> >>
> >> I applied the 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch
> >> you provided against the 6.5.3 kernel.
> >>
> >> I can confirm that it resolves this issue - no laundromat threads are
> >> created, and the reported load average is as expected, not falsely
> >> high.
> >>
> >> This appears to fully fix the issue in my case.  Thank you very much!
> >>
> >>> On Mon, Sep 18, 2023 at 10:00 PM Steve French <smfrench@gmail.com> wrote:
> >>>>
> >>>> Paulo and I were discussing the laundromat thread at the SMB3.1.1 test
> >>>> event (at SDC this week) which is now going on - will let you know
> >>>> what we find.
> >>>>
> >>>> One obvious thing is that it probably isn't necessary for cases when
> >>>> the server does not support directory leases, but we noticed another
> >>>> problem as well.
> >
> >
> >



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-19 18:23                 ` Steve French
@ 2023-09-27  0:54                   ` Paul Aurich
  2023-10-05  9:55                     ` Dr. Bernd Feige
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Aurich @ 2023-09-27  0:54 UTC (permalink / raw)
  To: Steve French, Tom Talpey
  Cc: Brian Pardy, Bagas Sanjaya, Linux CIFS,
	Linux Kernel Mailing List, Linux Regressions, Paulo Alcantara,
	ronnie sahlberg, Shyam Prasad N, Bharath S M

On 2023-09-19 13:23:44 -0500, Steve French wrote:
>On Tue, Sep 19, 2023 at 1:07 PM Tom Talpey <tom@talpey.com> wrote:
>> These changes are good, but I'm skeptical they will reduce the load
>> when the laundromat thread is actually running. All these do is avoid
>> creating it when not necessary, right?
>
>It does create half as many laundromat threads (we don't need
>laundromat on connection to IPC$) even for the Windows server target
>example, but helps more for cases where server doesn't support
>directory leases.

Perhaps the laundromat thread should be using msleep_interruptible()?

Using an interruptible sleep appears to prevent the thread from contributing 
to the load average, and has the happy side-effect of removing the up-to-1s delay 
when tearing down the tcon (since a7c01fa93ae, kthread_stop() will return 
early triggered by kthread_stop).

~Paul


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-27  0:54                   ` Paul Aurich
@ 2023-10-05  9:55                     ` Dr. Bernd Feige
  2023-10-05  9:59                       ` Steve French
  2023-10-13 23:19                       ` matoro
  0 siblings, 2 replies; 25+ messages in thread
From: Dr. Bernd Feige @ 2023-10-05  9:55 UTC (permalink / raw)
  To: tom, smfrench, paul
  Cc: linux-cifs, bagasdotme, regressions, pc, ronniesahlberg,
	nspmangalore, brian.pardy, bharathsm, linux-kernel

Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
> Perhaps the laundromat thread should be using msleep_interruptible()?
> 
> Using an interruptible sleep appears to prevent the thread from
> contributing 
> to the load average, and has the happy side-effect of removing the
> up-to-1s delay 
> when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
> return 
> early triggered by kthread_stop).

Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
gentoo), but as a client of Windows AD.

Just want to emphasize that using uninterruptible sleep has not just
unhappy but devastating side-effects.

I have 8 processors and 16 cifsd-cfid-laundromat processes, so
/proc/loadavg reports a load average of 16 on a totally idle system.

This means that load-balancing software will never start additional
tasks on this system - "make -l" but also any other load-dependent
system. Just reducing the number of cifsd-cfid-laundromat processes
does not fix this - even a single one makes loadavg report a wrong
result for load balancing.

So, if cifsd-cfid-laundromat must really be uninterruptible, the only
solution would be to change the way loadavg is computed by the kernel
to exclude uninterruptible but sleeping processes. But must it be
uninterruptible?

Thanks and best regards,
Bernd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-05  9:55                     ` Dr. Bernd Feige
@ 2023-10-05  9:59                       ` Steve French
  2023-10-13 23:19                       ` matoro
  1 sibling, 0 replies; 25+ messages in thread
From: Steve French @ 2023-10-05  9:59 UTC (permalink / raw)
  To: Dr. Bernd Feige
  Cc: tom, paul, linux-cifs, bagasdotme, regressions, pc,
	ronniesahlberg, nspmangalore, brian.pardy, bharathsm,
	linux-kernel

FYI - Paulo is looking at an additional patch to address this (and one
small patch was also just added)

On Thu, Oct 5, 2023 at 4:55 AM Dr. Bernd Feige
<bernd.feige@uniklinik-freiburg.de> wrote:
>
> Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
> > Perhaps the laundromat thread should be using msleep_interruptible()?
> >
> > Using an interruptible sleep appears to prevent the thread from
> > contributing
> > to the load average, and has the happy side-effect of removing the
> > up-to-1s delay
> > when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
> > return
> > early triggered by kthread_stop).
>
> Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
> gentoo), but as a client of Windows AD.
>
> Just want to emphasize that using uninterruptible sleep has not just
> unhappy but devastating side-effects.
>
> I have 8 processors and 16 cifsd-cfid-laundromat processes, so
> /proc/loadavg reports a load average of 16 on a totally idle system.
>
> This means that load-balancing software will never start additional
> tasks on this system - "make -l" but also any other load-dependent
> system. Just reducing the number of cifsd-cfid-laundromat processes
> does not fix this - even a single one makes loadavg report a wrong
> result for load balancing.
>
> So, if cifsd-cfid-laundromat must really be uninterruptible, the only
> solution would be to change the way loadavg is computed by the kernel
> to exclude uninterruptible but sleeping processes. But must it be
> uninterruptible?
>
> Thanks and best regards,
> Bernd



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-05  9:55                     ` Dr. Bernd Feige
  2023-10-05  9:59                       ` Steve French
@ 2023-10-13 23:19                       ` matoro
  2023-10-13 23:52                         ` Paulo Alcantara
  1 sibling, 1 reply; 25+ messages in thread
From: matoro @ 2023-10-13 23:19 UTC (permalink / raw)
  To: Dr. Bernd Feige
  Cc: tom, smfrench, paul, linux-cifs, bagasdotme, regressions, pc,
	ronniesahlberg, nspmangalore, brian.pardy, bharathsm,
	linux-kernel

On 2023-10-05 05:55, Dr. Bernd Feige wrote:
> Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
>> Perhaps the laundromat thread should be using msleep_interruptible()?
>> 
>> Using an interruptible sleep appears to prevent the thread from
>> contributing
>> to the load average, and has the happy side-effect of removing the
>> up-to-1s delay
>> when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
>> return
>> early triggered by kthread_stop).
> 
> Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
> gentoo), but as a client of Windows AD.
> 
> Just want to emphasize that using uninterruptible sleep has not just
> unhappy but devastating side-effects.
> 
> I have 8 processors and 16 cifsd-cfid-laundromat processes, so
> /proc/loadavg reports a load average of 16 on a totally idle system.
> 
> This means that load-balancing software will never start additional
> tasks on this system - "make -l" but also any other load-dependent
> system. Just reducing the number of cifsd-cfid-laundromat processes
> does not fix this - even a single one makes loadavg report a wrong
> result for load balancing.
> 
> So, if cifsd-cfid-laundromat must really be uninterruptible, the only
> solution would be to change the way loadavg is computed by the kernel
> to exclude uninterruptible but sleeping processes. But must it be
> uninterruptible?
> 
> Thanks and best regards,
> Bernd

This is a huge problem here as well, as a client to Samba using SMB1 
(for Unix extensions).

For others encountering this problem, I was able to work around it with 
the following snippet:

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 2d5e9a9d5b8b..fc2caccb597a 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -576,7 +576,7 @@ cifs_cfids_laundromat_thread(void *p)
         struct list_head entry;

         while (!kthread_should_stop()) {
-               ssleep(1);
+               msleep_interruptible(1000);
                 INIT_LIST_HEAD(&entry);
                 if (kthread_should_stop())
                         return 0;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-13 23:19                       ` matoro
@ 2023-10-13 23:52                         ` Paulo Alcantara
  2023-10-14  0:01                           ` Paulo Alcantara
  0 siblings, 1 reply; 25+ messages in thread
From: Paulo Alcantara @ 2023-10-13 23:52 UTC (permalink / raw)
  To: matoro, Dr. Bernd Feige
  Cc: tom, smfrench, paul, linux-cifs, bagasdotme, regressions,
	ronniesahlberg, nspmangalore, brian.pardy, bharathsm,
	linux-kernel

Could you please try two commits[1][2] from for-next?

[1] https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=e95f3f74465072c2545d8e65a3c3a96e37129cf8
[2] https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=81ba10959970d15c388bf29866b01b62f387e6a3

On 13 October 2023 20:19:37 GMT-03:00, matoro <matoro_mailinglist_kernel@matoro.tk> wrote:
>On 2023-10-05 05:55, Dr. Bernd Feige wrote:
>> Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
>>> Perhaps the laundromat thread should be using msleep_interruptible()?
>>> 
>>> Using an interruptible sleep appears to prevent the thread from
>>> contributing
>>> to the load average, and has the happy side-effect of removing the
>>> up-to-1s delay
>>> when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
>>> return
>>> early triggered by kthread_stop).
>> 
>> Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
>> gentoo), but as a client of Windows AD.
>> 
>> Just want to emphasize that using uninterruptible sleep has not just
>> unhappy but devastating side-effects.
>> 
>> I have 8 processors and 16 cifsd-cfid-laundromat processes, so
>> /proc/loadavg reports a load average of 16 on a totally idle system.
>> 
>> This means that load-balancing software will never start additional
>> tasks on this system - "make -l" but also any other load-dependent
>> system. Just reducing the number of cifsd-cfid-laundromat processes
>> does not fix this - even a single one makes loadavg report a wrong
>> result for load balancing.
>> 
>> So, if cifsd-cfid-laundromat must really be uninterruptible, the only
>> solution would be to change the way loadavg is computed by the kernel
>> to exclude uninterruptible but sleeping processes. But must it be
>> uninterruptible?
>> 
>> Thanks and best regards,
>> Bernd
>
>This is a huge problem here as well, as a client to Samba using SMB1 (for Unix extensions).
>
>For others encountering this problem, I was able to work around it with the following snippet:
>
>diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
>index 2d5e9a9d5b8b..fc2caccb597a 100644
>--- a/fs/smb/client/cached_dir.c
>+++ b/fs/smb/client/cached_dir.c
>@@ -576,7 +576,7 @@ cifs_cfids_laundromat_thread(void *p)
>        struct list_head entry;
>
>        while (!kthread_should_stop()) {
>-               ssleep(1);
>+               msleep_interruptible(1000);
>                INIT_LIST_HEAD(&entry);
>                if (kthread_should_stop())
>                        return 0;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-13 23:52                         ` Paulo Alcantara
@ 2023-10-14  0:01                           ` Paulo Alcantara
       [not found]                             ` <CAH2r5mse_2sfXF+tdTmie5LLtBuc+6DOumDH3rn=5V24yhrYVQ@mail.gmail.com>
  0 siblings, 1 reply; 25+ messages in thread
From: Paulo Alcantara @ 2023-10-14  0:01 UTC (permalink / raw)
  To: matoro, Dr. Bernd Feige
  Cc: tom, smfrench, paul, linux-cifs, bagasdotme, regressions,
	ronniesahlberg, nspmangalore, brian.pardy, bharathsm,
	linux-kernel

You probably want these two as well

https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=2da338ff752a2789470d733111a5241f30026675

https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=3b8bb3171571f92eda863e5f78b063604c61f72a

as directory leases isn't supported in SMB1, so no waste of system resources by having those kthreads running.

On 13 October 2023 20:52:11 GMT-03:00, Paulo Alcantara <pc@manguebit.com> wrote:
>Could you please try two commits[1][2] from for-next?
>
>[1] https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=e95f3f74465072c2545d8e65a3c3a96e37129cf8
>[2] https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=81ba10959970d15c388bf29866b01b62f387e6a3
>
>On 13 October 2023 20:19:37 GMT-03:00, matoro <matoro_mailinglist_kernel@matoro.tk> wrote:
>>On 2023-10-05 05:55, Dr. Bernd Feige wrote:
>>> Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
>>>> Perhaps the laundromat thread should be using msleep_interruptible()?
>>>> 
>>>> Using an interruptible sleep appears to prevent the thread from
>>>> contributing
>>>> to the load average, and has the happy side-effect of removing the
>>>> up-to-1s delay
>>>> when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
>>>> return
>>>> early triggered by kthread_stop).
>>> 
>>> Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
>>> gentoo), but as a client of Windows AD.
>>> 
>>> Just want to emphasize that using uninterruptible sleep has not just
>>> unhappy but devastating side-effects.
>>> 
>>> I have 8 processors and 16 cifsd-cfid-laundromat processes, so
>>> /proc/loadavg reports a load average of 16 on a totally idle system.
>>> 
>>> This means that load-balancing software will never start additional
>>> tasks on this system - "make -l" but also any other load-dependent
>>> system. Just reducing the number of cifsd-cfid-laundromat processes
>>> does not fix this - even a single one makes loadavg report a wrong
>>> result for load balancing.
>>> 
>>> So, if cifsd-cfid-laundromat must really be uninterruptible, the only
>>> solution would be to change the way loadavg is computed by the kernel
>>> to exclude uninterruptible but sleeping processes. But must it be
>>> uninterruptible?
>>> 
>>> Thanks and best regards,
>>> Bernd
>>
>>This is a huge problem here as well, as a client to Samba using SMB1 (for Unix extensions).
>>
>>For others encountering this problem, I was able to work around it with the following snippet:
>>
>>diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
>>index 2d5e9a9d5b8b..fc2caccb597a 100644
>>--- a/fs/smb/client/cached_dir.c
>>+++ b/fs/smb/client/cached_dir.c
>>@@ -576,7 +576,7 @@ cifs_cfids_laundromat_thread(void *p)
>>        struct list_head entry;
>>
>>        while (!kthread_should_stop()) {
>>-               ssleep(1);
>>+               msleep_interruptible(1000);
>>                INIT_LIST_HEAD(&entry);
>>                if (kthread_should_stop())
>>                        return 0;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
       [not found]                             ` <CAH2r5mse_2sfXF+tdTmie5LLtBuc+6DOumDH3rn=5V24yhrYVQ@mail.gmail.com>
@ 2023-10-14  0:59                               ` matoro
  2023-10-16 18:41                                 ` Paulo Alcantara
  0 siblings, 1 reply; 25+ messages in thread
From: matoro @ 2023-10-14  0:59 UTC (permalink / raw)
  To: Steve French
  Cc: Paulo Alcantara, Dr. Bernd Feige, Tom Talpey, Paul Aurich, CIFS,
	Bagas Sanjaya, Linux Regressions, ronnie sahlberg,
	Shyam Prasad N, Brian Pardy, Bharath S M, LKML

On 2023-10-13 20:13, Steve French wrote:
> Let me know if those fixes help as two of them have not been sent to 
> Linus
> yet, but I could send tomorrow
> 
> On Fri, Oct 13, 2023, 19:01 Paulo Alcantara <pc@manguebit.com> wrote:
> 
>> You probably want these two as well
>> 
>> 
>> https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=2da338ff752a2789470d733111a5241f30026675
>> 
>> 
>> https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=3b8bb3171571f92eda863e5f78b063604c61f72a
>> 
>> as directory leases isn't supported in SMB1, so no waste of system
>> resources by having those kthreads running.
>> 
>> On 13 October 2023 20:52:11 GMT-03:00, Paulo Alcantara 
>> <pc@manguebit.com>
>> wrote:
>> >Could you please try two commits[1][2] from for-next?
>> >
>> >[1]
>> https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=e95f3f74465072c2545d8e65a3c3a96e37129cf8
>> >[2]
>> https://git.samba.org/?p=sfrench/cifs-2.6.git;a=commit;h=81ba10959970d15c388bf29866b01b62f387e6a3
>> >
>> >On 13 October 2023 20:19:37 GMT-03:00, matoro <
>> matoro_mailinglist_kernel@matoro.tk> wrote:
>> >>On 2023-10-05 05:55, Dr. Bernd Feige wrote:
>> >>> Am Dienstag, dem 26.09.2023 um 17:54 -0700 schrieb Paul Aurich:
>> >>>> Perhaps the laundromat thread should be using msleep_interruptible()?
>> >>>>
>> >>>> Using an interruptible sleep appears to prevent the thread from
>> >>>> contributing
>> >>>> to the load average, and has the happy side-effect of removing the
>> >>>> up-to-1s delay
>> >>>> when tearing down the tcon (since a7c01fa93ae, kthread_stop() will
>> >>>> return
>> >>>> early triggered by kthread_stop).
>> >>>
>> >>> Sorry for chiming in so late - I'm also on gentoo (kernel 6.5.5-
>> >>> gentoo), but as a client of Windows AD.
>> >>>
>> >>> Just want to emphasize that using uninterruptible sleep has not just
>> >>> unhappy but devastating side-effects.
>> >>>
>> >>> I have 8 processors and 16 cifsd-cfid-laundromat processes, so
>> >>> /proc/loadavg reports a load average of 16 on a totally idle system.
>> >>>
>> >>> This means that load-balancing software will never start additional
>> >>> tasks on this system - "make -l" but also any other load-dependent
>> >>> system. Just reducing the number of cifsd-cfid-laundromat processes
>> >>> does not fix this - even a single one makes loadavg report a wrong
>> >>> result for load balancing.
>> >>>
>> >>> So, if cifsd-cfid-laundromat must really be uninterruptible, the only
>> >>> solution would be to change the way loadavg is computed by the kernel
>> >>> to exclude uninterruptible but sleeping processes. But must it be
>> >>> uninterruptible?
>> >>>
>> >>> Thanks and best regards,
>> >>> Bernd
>> >>
>> >>This is a huge problem here as well, as a client to Samba using SMB1
>> (for Unix extensions).
>> >>
>> >>For others encountering this problem, I was able to work around it with
>> the following snippet:
>> >>
>> >>diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
>> >>index 2d5e9a9d5b8b..fc2caccb597a 100644
>> >>--- a/fs/smb/client/cached_dir.c
>> >>+++ b/fs/smb/client/cached_dir.c
>> >>@@ -576,7 +576,7 @@ cifs_cfids_laundromat_thread(void *p)
>> >>        struct list_head entry;
>> >>
>> >>        while (!kthread_should_stop()) {
>> >>-               ssleep(1);
>> >>+               msleep_interruptible(1000);
>> >>                INIT_LIST_HEAD(&entry);
>> >>                if (kthread_should_stop())
>> >>                        return 0;
>> 

Do you have backports of these to 6.5?  I tried to do it manually but 
there's already so many changes between 6.5 and these commits.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-14  0:59                               ` matoro
@ 2023-10-16 18:41                                 ` Paulo Alcantara
  2023-10-20 14:37                                   ` Dr. Bernd Feige
  0 siblings, 1 reply; 25+ messages in thread
From: Paulo Alcantara @ 2023-10-16 18:41 UTC (permalink / raw)
  To: matoro, Steve French
  Cc: Dr. Bernd Feige, Tom Talpey, Paul Aurich, CIFS, Bagas Sanjaya,
	Linux Regressions, ronnie sahlberg, Shyam Prasad N, Brian Pardy,
	Bharath S M, LKML

[-- Attachment #1: Type: text/plain, Size: 386 bytes --]

matoro <matoro_mailinglist_kernel@matoro.tk> writes:

> Do you have backports of these to 6.5?  I tried to do it manually but 
> there's already so many changes between 6.5 and these commits.

Please find attached two patches that should fix your SMB1 case.  They
applied cleanly on top of v6.5.y branch.

Let me know if it works for you and then I'll ask stable team to pick
those up.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-smb3-do-not-start-laundromat-thread-when-dir-leases-.patch --]
[-- Type: text/x-patch, Size: 4995 bytes --]

From b5aa3d1b8550613570a5ebb0c6be291d58d371a3 Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@microsoft.com>
Date: Tue, 19 Sep 2023 11:35:53 -0500
Subject: [PATCH 1/2] smb3: do not start laundromat thread when dir leases
 disabled

When no directory lease support, or for IPC shares where directories
can not be opened, do not start an unneeded laundromat thread for
that mount (it wastes resources).

Fixes: d14de8067e3f ("cifs: Add a laundromat thread for cached directories")
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
---
 fs/smb/client/cached_dir.c |  6 ++++++
 fs/smb/client/cifsglob.h   |  2 +-
 fs/smb/client/cifsproto.h  |  2 +-
 fs/smb/client/connect.c    |  8 ++++++--
 fs/smb/client/misc.c       | 14 +++++++++-----
 fs/smb/client/smb2pdu.c    |  2 +-
 6 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 2d5e9a9d5b8b..23e853a32de7 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -451,6 +451,9 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 	struct cached_fid *cfid, *q;
 	LIST_HEAD(entry);
 
+	if (cfids == NULL)
+		return;
+
 	spin_lock(&cfids->cfid_list_lock);
 	list_for_each_entry_safe(cfid, q, &cfids->entries, entry) {
 		list_move(&cfid->entry, &entry);
@@ -650,6 +653,9 @@ void free_cached_dirs(struct cached_fids *cfids)
 	struct cached_fid *cfid, *q;
 	LIST_HEAD(entry);
 
+	if (cfids == NULL)
+		return;
+
 	if (cfids->laundromat) {
 		kthread_stop(cfids->laundromat);
 		cfids->laundromat = NULL;
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 35782a6bede0..bc44474daf88 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1913,7 +1913,7 @@ require use of the stronger protocol */
  * cifsInodeInfo->lock_sem	cifsInodeInfo->llist		cifs_init_once
  *				->can_cache_brlcks
  * cifsInodeInfo->deferred_lock	cifsInodeInfo->deferred_closes	cifsInodeInfo_alloc
- * cached_fid->fid_mutex		cifs_tcon->crfid		tconInfoAlloc
+ * cached_fid->fid_mutex		cifs_tcon->crfid		tcon_info_alloc
  * cifsFileInfo->fh_mutex		cifsFileInfo			cifs_new_fileinfo
  * cifsFileInfo->file_info_lock	cifsFileInfo->count		cifs_new_fileinfo
  *				->invalidHandle			initiate_cifs_search
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index 1d71d658e167..bd0a1505719a 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -513,7 +513,7 @@ extern int CIFSSMBLogoff(const unsigned int xid, struct cifs_ses *ses);
 
 extern struct cifs_ses *sesInfoAlloc(void);
 extern void sesInfoFree(struct cifs_ses *);
-extern struct cifs_tcon *tconInfoAlloc(void);
+extern struct cifs_tcon *tcon_info_alloc(bool dir_leases_enabled);
 extern void tconInfoFree(struct cifs_tcon *);
 
 extern int cifs_sign_rqst(struct smb_rqst *rqst, struct TCP_Server_Info *server,
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 352e251c4113..d4562d36e160 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -1882,7 +1882,8 @@ cifs_setup_ipc(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		}
 	}
 
-	tcon = tconInfoAlloc();
+	/* no need to setup directory caching on IPC share, so pass in false */
+	tcon = tcon_info_alloc(false);
 	if (tcon == NULL)
 		return -ENOMEM;
 
@@ -2492,7 +2493,10 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		goto out_fail;
 	}
 
-	tcon = tconInfoAlloc();
+	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
+		tcon = tcon_info_alloc(true);
+	else
+		tcon = tcon_info_alloc(false);
 	if (tcon == NULL) {
 		rc = -ENOMEM;
 		goto out_fail;
diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
index d7e85d9a2655..249fac8be5a5 100644
--- a/fs/smb/client/misc.c
+++ b/fs/smb/client/misc.c
@@ -113,18 +113,22 @@ sesInfoFree(struct cifs_ses *buf_to_free)
 }
 
 struct cifs_tcon *
-tconInfoAlloc(void)
+tcon_info_alloc(bool dir_leases_enabled)
 {
 	struct cifs_tcon *ret_buf;
 
 	ret_buf = kzalloc(sizeof(*ret_buf), GFP_KERNEL);
 	if (!ret_buf)
 		return NULL;
-	ret_buf->cfids = init_cached_dirs();
-	if (!ret_buf->cfids) {
-		kfree(ret_buf);
-		return NULL;
+
+	if (dir_leases_enabled == true) {
+		ret_buf->cfids = init_cached_dirs();
+		if (!ret_buf->cfids) {
+			kfree(ret_buf);
+			return NULL;
+		}
 	}
+	/* else ret_buf->cfids is already set to NULL above */
 
 	atomic_inc(&tconInfoAllocCount);
 	ret_buf->status = TID_NEW;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 9c7e46b7e7c7..c22cc7222381 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -3871,7 +3871,7 @@ void smb2_reconnect_server(struct work_struct *work)
 		goto done;
 
 	/* allocate a dummy tcon struct used for reconnect */
-	tcon = tconInfoAlloc();
+	tcon = tcon_info_alloc(false);
 	if (!tcon) {
 		resched = true;
 		list_for_each_entry_safe(ses, ses2, &tmp_ses_list, rlist) {
-- 
2.42.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-smb-client-do-not-start-laundromat-thread-on-nohandl.patch --]
[-- Type: text/x-patch, Size: 2332 bytes --]

From a259c24cfe48ae6d44ea12543773f39a6b8d2cfa Mon Sep 17 00:00:00 2001
From: Paulo Alcantara <pc@manguebit.com>
Date: Wed, 4 Oct 2023 17:28:38 -0300
Subject: [PATCH 2/2] smb: client: do not start laundromat thread on
 nohandlecache

Honor 'nohandlecache' mount option by not starting laundromat thread
even when SMB server supports directory leases. Do not waste system
resources by having laundromat thread running with no directory
caching at all.

Fixes: 2da338ff752a ("smb3: do not start laundromat thread when dir leases  disabled")
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
---
 fs/smb/client/connect.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index d4562d36e160..3915ac5a2543 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -2474,8 +2474,9 @@ cifs_put_tcon(struct cifs_tcon *tcon)
 static struct cifs_tcon *
 cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 {
+	struct cifs_tcon *tcon;
+	bool nohandlecache;
 	int rc, xid;
-	struct cifs_tcon *tcon;
 
 	tcon = cifs_find_tcon(ses, ctx);
 	if (tcon) {
@@ -2493,14 +2494,17 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 		goto out_fail;
 	}
 
-	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
-		tcon = tcon_info_alloc(true);
+	if (ses->server->dialect >= SMB20_PROT_ID &&
+	    (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING))
+		nohandlecache = ctx->nohandlecache;
 	else
-		tcon = tcon_info_alloc(false);
+		nohandlecache = true;
+	tcon = tcon_info_alloc(!nohandlecache);
 	if (tcon == NULL) {
 		rc = -ENOMEM;
 		goto out_fail;
 	}
+	tcon->nohandlecache = nohandlecache;
 
 	if (ctx->snapshot_time) {
 		if (ses->server->vals->protocol_id == 0) {
@@ -2661,10 +2665,6 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 	tcon->retry = ctx->retry;
 	tcon->nocase = ctx->nocase;
 	tcon->broken_sparse_sup = ctx->no_sparse;
-	if (ses->server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING)
-		tcon->nohandlecache = ctx->nohandlecache;
-	else
-		tcon->nohandlecache = true;
 	tcon->nodelete = ctx->nodelete;
 	tcon->local_lease = ctx->local_lease;
 	INIT_LIST_HEAD(&tcon->pending_opens);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-09-06  1:01 ` Bagas Sanjaya
  2023-09-06 21:03   ` Brian Pardy
@ 2023-10-20 10:40   ` Linux regression tracking #update (Thorsten Leemhuis)
  1 sibling, 0 replies; 25+ messages in thread
From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-10-20 10:40 UTC (permalink / raw)
  To: Linux CIFS; +Cc: Linux Kernel Mailing List, Linux Regressions

[TLDR: This mail in primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 06.09.23 03:01, Bagas Sanjaya wrote:
> On Tue, Sep 05, 2023 at 01:09:05PM -0400, Brian Pardy wrote:
>> My apologies if I do not have the bug report protocol correct in posting here.
>>
>> I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
>> does not exist in 6.4.12 or other previous kernels (I have not tested
>> 6.4.13). Almost immediately after mounting a CIFS share, the reported
>> load average on my system goes up by 2. At the time this occurs I see
>> two [cifsd-cfid-laundromat] kernel threads running the "D" state,
>> where they remain for the entire time the CIFS share is mounted. The
>> load will remain stable at 2 (otherwise idle) until the share is
>> unmounted, at which point the [cifsd-cfid-laundromat] threads
>> disappear and load drops back down to 0. This is easily reproducible
>> on my system, but I am not sure what to do to retrieve more useful
>> debugging information. If I mount two shares from this server, I get
>> four laundromat threads in "D" state and a sustained load average of
>> 4.
>>[...]
> Anyway, I'm adding it to regzbot:
> 
> #regzbot ^introduced: v6.4..v6.5
> #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS

#regzbot fix: 2da338ff752a2789470
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-16 18:41                                 ` Paulo Alcantara
@ 2023-10-20 14:37                                   ` Dr. Bernd Feige
  2023-10-23 14:07                                     ` Paulo Alcantara
  0 siblings, 1 reply; 25+ messages in thread
From: Dr. Bernd Feige @ 2023-10-20 14:37 UTC (permalink / raw)
  To: smfrench, matoro_mailinglist_kernel, pc
  Cc: linux-kernel, linux-cifs, bagasdotme, paul, tom, ronniesahlberg,
	bharathsm, regressions, brian.pardy, nspmangalore

Am Montag, dem 16.10.2023 um 15:41 -0300 schrieb Paulo Alcantara:
> matoro <matoro_mailinglist_kernel@matoro.tk> writes:
> 
> > Do you have backports of these to 6.5?  I tried to do it manually
> > but 
> > there's already so many changes between 6.5 and these commits.
> 
> Please find attached two patches that should fix your SMB1 case. 
> They
> applied cleanly on top of v6.5.y branch.
> 
> Let me know if it works for you and then I'll ask stable team to pick
> those up.

Thanks!
I can confirm that the patches apply cleanly on 6.5.8 and help a lot
with the issue here (vers=3.1.1, gentoo client, MS AD server with DFS).


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-20 14:37                                   ` Dr. Bernd Feige
@ 2023-10-23 14:07                                     ` Paulo Alcantara
  2023-10-24  2:51                                       ` Steve French
  0 siblings, 1 reply; 25+ messages in thread
From: Paulo Alcantara @ 2023-10-23 14:07 UTC (permalink / raw)
  To: Dr. Bernd Feige, smfrench, matoro_mailinglist_kernel
  Cc: linux-kernel, linux-cifs, bagasdotme, paul, tom, ronniesahlberg,
	bharathsm, regressions, brian.pardy, nspmangalore

"Dr. Bernd Feige" <bernd.feige@uniklinik-freiburg.de> writes:

> I can confirm that the patches apply cleanly on 6.5.8 and help a lot
> with the issue here (vers=3.1.1, gentoo client, MS AD server with DFS).

Thanks for testing it.

Steve, I would suggest below commits for v6.5.y

	238b351d0935 ("smb3: allow controlling length of time directory entries are cached with dir leases")
	6a50d71d0fff ("smb3: allow controlling maximum number of cached directories")
	2da338ff752a ("smb3: do not start laundromat thread when dir leases  disabled")
	e95f3f744650 ("smb: client: make laundromat a delayed worker")
	81ba10959970 ("smb: client: prevent new fids from being removed by laundromat")

If OK, please ask stable team to pick those up.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)
  2023-10-23 14:07                                     ` Paulo Alcantara
@ 2023-10-24  2:51                                       ` Steve French
  0 siblings, 0 replies; 25+ messages in thread
From: Steve French @ 2023-10-24  2:51 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Dr. Bernd Feige, matoro_mailinglist_kernel, linux-kernel,
	linux-cifs, bagasdotme, paul, tom, ronniesahlberg, bharathsm,
	regressions, brian.pardy, nspmangalore

We probably want this one as well ...

commit 2da338ff752a2789470d733111a5241f30026675
Author: Steve French <stfrench@microsoft.com>
Date:   Tue Sep 19 11:35:53 2023 -0500

    smb3: do not start laundromat thread when dir leases
     disabled

    When no directory lease support, or for IPC shares where directories
    can not be opened, do not start an unneeded laundromat thread for
    that mount (it wastes resources).

    Fixes: d14de8067e3f ("cifs: Add a laundromat thread for cached directories")
    Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
    Acked-by: Tom Talpey <tom@talpey.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>



Any objections to adding that one to the list as well?  The patches
all seem to apply fine to current 6.5-stable-rc

On Mon, Oct 23, 2023 at 9:07 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> "Dr. Bernd Feige" <bernd.feige@uniklinik-freiburg.de> writes:
>
> > I can confirm that the patches apply cleanly on 6.5.8 and help a lot
> > with the issue here (vers=3.1.1, gentoo client, MS AD server with DFS).
>
> Thanks for testing it.
>
> Steve, I would suggest below commits for v6.5.y
>
>         238b351d0935 ("smb3: allow controlling length of time directory entries are cached with dir leases")
>         6a50d71d0fff ("smb3: allow controlling maximum number of cached directories")
>         2da338ff752a ("smb3: do not start laundromat thread when dir leases  disabled")
>         e95f3f744650 ("smb: client: make laundromat a delayed worker")
>         81ba10959970 ("smb: client: prevent new fids from being removed by laundromat")
>
> If OK, please ask stable team to pick those up.



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-10-24  2:51 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-05 17:09 Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state) Brian Pardy
2023-09-06  1:01 ` Bagas Sanjaya
2023-09-06 21:03   ` Brian Pardy
2023-09-07  3:40     ` Linux regression tracking #update (Thorsten Leemhuis)
2023-09-18 22:55     ` Brian Pardy
2023-09-19  3:00       ` Steve French
2023-09-19  5:36         ` Steve French
2023-09-19 13:21           ` Brian Pardy
2023-09-19 16:38             ` Steve French
2023-09-19 17:42               ` Brian Pardy
2023-09-19 18:06               ` Tom Talpey
2023-09-19 18:23                 ` Steve French
2023-09-27  0:54                   ` Paul Aurich
2023-10-05  9:55                     ` Dr. Bernd Feige
2023-10-05  9:59                       ` Steve French
2023-10-13 23:19                       ` matoro
2023-10-13 23:52                         ` Paulo Alcantara
2023-10-14  0:01                           ` Paulo Alcantara
     [not found]                             ` <CAH2r5mse_2sfXF+tdTmie5LLtBuc+6DOumDH3rn=5V24yhrYVQ@mail.gmail.com>
2023-10-14  0:59                               ` matoro
2023-10-16 18:41                                 ` Paulo Alcantara
2023-10-20 14:37                                   ` Dr. Bernd Feige
2023-10-23 14:07                                     ` Paulo Alcantara
2023-10-24  2:51                                       ` Steve French
2023-09-19  3:01       ` Steve French
2023-10-20 10:40   ` Linux regression tracking #update (Thorsten Leemhuis)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.