* ksmbd threads eating masses of cputime
@ 2022-05-31 21:05 David Howells
  2022-06-01  0:56 ` Namjae Jeon
  2022-06-01  8:40 ` David Howells
  0 siblings, 2 replies; 4+ messages in thread
From: David Howells @ 2022-05-31 21:05 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: dhowells, Steve French, CIFS

Hi Namjae,

Steve says I should show this to you.

My server box that I'm using to do cifs-over-RDMA testing is running really
slowly because it has about 30 ksmbd threads hogging the CPUs:

    PID USER    PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  19993 root    20   0       0      0      0 R  14.3   0.0 910:06.02 ksmbd:r5445
  20048 root    20   0       0      0      0 R  14.3   0.0 896:19.22 ksmbd:r5445
  20052 root    20   0       0      0      0 R  14.3   0.0 901:51.52 ksmbd:r5445
  20053 root    20   0       0      0      0 R  14.3   0.0 904:20.84 ksmbd:r5445
  20056 root    20   0       0      0      0 R  14.3   0.0 910:39.38 ksmbd:r5445
  20095 root    20   0       0      0      0 R  14.3   0.0 901:28.48 ksmbd:r5445
  20097 root    20   0       0      0      0 R  14.3   0.0 910:02.19 ksmbd:r5445
  20103 root    20   0       0      0      0 R  14.3   0.0 912:13.18 ksmbd:r5445
  20105 root    20   0       0      0      0 R  14.3   0.0 908:46.76 ksmbd:r5445
  ...


I tried to shut them down with "ksmbd.control -s", but that just hung and the
threads are still running.  I captured a stack trace from one of them through
/proc:

	[root@carina ~]# cat /proc/20052/stack
	[<0>] ksmbd_conn_handler_loop+0x181/0x200 [ksmbd]
	[<0>] kthread+0xe8/0x110
	[<0>] ret_from_fork+0x22/0x30

Note that nothing is currently mounted from the server and it is getting no
incoming packets.

Looking at the loop in ksmbd_conn_handler_loop(), it seems to be busy-waiting
- unless kernel_recvmsg() is doing the waiting?  In the TCP transport, if
kernel_recvmsg() doesn't block but returns -EAGAIN, the loop sleeps for 1-2ms
and then goes round again... and again... and again - and all 30 threads would
be doing that.
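
To make that concrete, here's roughly the pattern I mean, heavily simplified
(a sketch only, not the actual transport_tcp.c code; the function name is
made up):

#include <linux/delay.h>
#include <linux/net.h>
#include <linux/uio.h>

/* Sketch of an unbounded receive loop: on a non-blocking, idle socket,
 * kernel_recvmsg() returns -EAGAIN immediately, so the loop naps for
 * 1-2ms and retries forever - which looks like a busy-wait once the
 * peer has gone away.
 */
static int sketch_read(struct socket *sock, char *buf, unsigned int to_read)
{
        struct msghdr msg = { };
        int total = 0;

        while (to_read) {
                struct kvec iov = { .iov_base = buf + total,
                                    .iov_len  = to_read };
                int len = kernel_recvmsg(sock, &msg, &iov, 1, to_read, 0);

                if (len == -EAGAIN || len == -ERESTARTSYS) {
                        usleep_range(1000, 2000);  /* the 1-2ms nap... */
                        continue;                  /* ...then round again */
                }
                if (len <= 0)
                        return -EAGAIN;
                total += len;
                to_read -= len;
        }
        return total;
}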


Btw in:

		ret = kernel_accept(iface->ksmbd_socket, &client_sk,
				    O_NONBLOCK);

that should be SOCK_NONBLOCK, I think.
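
i.e. the call would presumably become:

		ret = kernel_accept(iface->ksmbd_socket, &client_sk,
				    SOCK_NONBLOCK);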

Also:

	[root@carina ~]# ksmbd.control --shutdown
	Usage: ksmbd.control
		-s | --shutdown
	...

that looks like it doesn't handle the advertised long parameters.
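
Handling those is normally just a getopt_long() table, something along these
lines (a generic illustration, not the actual ksmbd-tools source):

#include <getopt.h>
#include <stdio.h>

int main(int argc, char **argv)
{
        /* Illustrative only: one short/long option pair for shutdown. */
        static const struct option longopts[] = {
                { "shutdown", no_argument, NULL, 's' },
                { NULL, 0, NULL, 0 },
        };
        int c;

        /* getopt_long() matches both "-s" and "--shutdown" */
        while ((c = getopt_long(argc, argv, "s", longopts, NULL)) != -1) {
                switch (c) {
                case 's':
                        printf("shutdown requested\n");
                        break;
                default:
                        fprintf(stderr,
                                "Usage: ksmbd.control -s | --shutdown\n");
                        return 1;
                }
        }
        return 0;
}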



* Re: ksmbd threads eating masses of cputime
  2022-05-31 21:05 ksmbd threads eating masses of cputime David Howells
@ 2022-06-01  0:56 ` Namjae Jeon
  2022-06-01  8:40 ` David Howells
  1 sibling, 0 replies; 4+ messages in thread
From: Namjae Jeon @ 2022-06-01  0:56 UTC (permalink / raw)
  To: David Howells; +Cc: Steve French, CIFS

2022-06-01 6:05 GMT+09:00, David Howells <dhowells@redhat.com>:
> Hi Namjae,
>
> Steve says I should show this to you.
>
> My server box that I'm using to do cifs-over-RDMA testing is running really
> slowly because it has about 30 ksmbd threads hogging the CPUs:
>
>     PID USER    PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>   19993 root    20   0       0      0      0 R  14.3   0.0 910:06.02 ksmbd:r5445
>   20048 root    20   0       0      0      0 R  14.3   0.0 896:19.22 ksmbd:r5445
>   20052 root    20   0       0      0      0 R  14.3   0.0 901:51.52 ksmbd:r5445
>   20053 root    20   0       0      0      0 R  14.3   0.0 904:20.84 ksmbd:r5445
>   20056 root    20   0       0      0      0 R  14.3   0.0 910:39.38 ksmbd:r5445
>   20095 root    20   0       0      0      0 R  14.3   0.0 901:28.48 ksmbd:r5445
>   20097 root    20   0       0      0      0 R  14.3   0.0 910:02.19 ksmbd:r5445
>   20103 root    20   0       0      0      0 R  14.3   0.0 912:13.18 ksmbd:r5445
>   20105 root    20   0       0      0      0 R  14.3   0.0 908:46.76 ksmbd:r5445
>   ...
>
>
> I tried to shut them down with "ksmbd.control -s", but that just hung and
> the threads are still running.  I captured a stack trace from one of them
> through /proc:
>
> 	[root@carina ~]# cat /proc/20052/stack
> 	[<0>] ksmbd_conn_handler_loop+0x181/0x200 [ksmbd]
> 	[<0>] kthread+0xe8/0x110
> 	[<0>] ret_from_fork+0x22/0x30
>
> Note that nothing is currently mounted from the server and it is getting no
> incoming packets.
Okay, how do you reproduce this problem? Did you run xfstests against ksmbd
RDMA?
>
> Looking at the loop in ksmbd_conn_handler_loop(), it seems to be
> busy-waiting - unless kernel_recvmsg() is doing the waiting?  In the TCP
> transport, if kernel_recvmsg() doesn't block but returns -EAGAIN, the loop
> sleeps for 1-2ms and then goes round again... and again... and again - and
> all 30 threads would be doing that.
Okay, we need to add a maximum retry count for that case. But when I check
the kernel thread name in your top output, it is an RDMA connection, so
smb_direct_read() is used in ksmbd_conn_handler_loop(). I'd like to reproduce
the problem to figure out where it is. Can I try to reproduce it with
soft-iWARP and xfstests?
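
For the TCP path, something along these lines, perhaps (just a sketch of the
idea against the simplified loop earlier in the thread; MAX_RECV_RETRIES is a
made-up name and value):

#include <linux/delay.h>
#include <linux/net.h>
#include <linux/uio.h>

#define MAX_RECV_RETRIES 100    /* made up: ~100-200ms of 1-2ms naps */

/* Same receive loop as the earlier sketch, but give up after a bounded
 * number of consecutive -EAGAIN retries so a dead connection is torn
 * down instead of spinning forever.
 */
static int sketch_read_bounded(struct socket *sock, char *buf,
                               unsigned int to_read)
{
        struct msghdr msg = { };
        int total = 0, retries = 0;

        while (to_read) {
                struct kvec iov = { .iov_base = buf + total,
                                    .iov_len  = to_read };
                int len = kernel_recvmsg(sock, &msg, &iov, 1, to_read, 0);

                if (len == -EAGAIN || len == -ERESTARTSYS) {
                        if (++retries > MAX_RECV_RETRIES)
                                return -EAGAIN; /* let the caller disconnect */
                        usleep_range(1000, 2000);
                        continue;
                }
                if (len <= 0)
                        return -EAGAIN;
                retries = 0;            /* made progress, reset the counter */
                total += len;
                to_read -= len;
        }
        return total;
}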
>
>
> Btw in:
>
> 		ret = kernel_accept(iface->ksmbd_socket, &client_sk,
> 				    O_NONBLOCK);
>
> that should be SOCK_NONBLOCK, I think.
Ah, I found that it is normally the same value as O_NONBLOCK, but some
architectures define it differently. I will change it. Thanks for pointing
it out :)

/include/linux/net.h
#ifndef SOCK_NONBLOCK
#define SOCK_NONBLOCK	O_NONBLOCK
#endif

/arch/alpha/include/asm/socket.h
#define SOCK_NONBLOCK	0x40000000

>
> Also:
>
> 	[root@carina ~]# ksmbd.control --shutdown
> 	Usage: ksmbd.control
> 		-s | --shutdown
> 	...
>
> that looks like it doesn't handle the advertised long parameters.
I will fix it:)

Thanks!
>
>


* Re: ksmbd threads eating masses of cputime
  2022-05-31 21:05 ksmbd threads eating masses of cputime David Howells
  2022-06-01  0:56 ` Namjae Jeon
@ 2022-06-01  8:40 ` David Howells
  2022-06-02 23:50   ` Namjae Jeon
  1 sibling, 1 reply; 4+ messages in thread
From: David Howells @ 2022-06-01  8:40 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: dhowells, Steve French, CIFS

Namjae Jeon <linkinjeon@kernel.org> wrote:

> Okay, how do you reproduce this problem? Did you run xfstests
> against ksmbd RDMA?

Yeah - I've been making sure my cifs filesystem changes work with RDMA.
There've been a lot of connections that haven't been taken down cleanly, due
to oopses, lockups and stuff.

One thing that could be useful is, say, /proc/fs/ksmbd/

> Okay, we need to add a maximum retry count for that case. But when I check
> the kernel thread name in your top output, it is an RDMA connection, so
> smb_direct_read() is used in ksmbd_conn_handler_loop(). I'd like to
> reproduce the problem to figure out where it is. Can I try to reproduce it
> with soft-iWARP and xfstests?

Note that I only noticed the issue when I switched to working on another
filesystem and found that performance was unexpectedly down by 80%.

I was using soft-RoCE, though it may well be reproducible with soft-iWARP as
well, since that's not really a detail visible to cifs/ksmbd, I think.

I've just had a quick go at trying to reproduce this, hard-resetting the test
client in the middle of an xfstests run, but it didn't seem to cause the
single ksmbd:r5445 thread to explode.

David



* Re: ksmbd threads eating masses of cputime
  2022-06-01  8:40 ` David Howells
@ 2022-06-02 23:50   ` Namjae Jeon
  0 siblings, 0 replies; 4+ messages in thread
From: Namjae Jeon @ 2022-06-02 23:50 UTC (permalink / raw)
  To: David Howells; +Cc: Steve French, CIFS

2022-06-01 17:40 GMT+09:00, David Howells <dhowells@redhat.com>:
> Namjae Jeon <linkinjeon@kernel.org> wrote:
>
>> Okay, how do you reproduce this problem? Did you run xfstests
>> against ksmbd RDMA?
>
> Yeah - I've been making sure my cifs filesystem changes work with RDMA.
> There've been a lot of connections that haven't been taken down cleanly,
> due to oopses, lockups and stuff.
>
> One thing that could be useful is, say, /proc/fs/ksmbd/
>
>> Okay, we need to add a maximum retry count for that case. But when I check
>> the kernel thread name in your top output, it is an RDMA connection, so
>> smb_direct_read() is used in ksmbd_conn_handler_loop(). I'd like to
>> reproduce the problem to figure out where it is. Can I try to reproduce it
>> with soft-iWARP and xfstests?
>
> Note that I only noticed the issue when I switched to working on another
> filesystem and found that performance was unexpectedly down by 80%.
>
> I was using soft-RoCE, though it may well be reproducible with soft-iWARP
> as well, since that's not really a detail visible to cifs/ksmbd, I think.
>
> I've just had a quick go at trying to reproduce this, hard-resetting the
> test client in the middle of an xfstests run, but it didn't seem to cause
> the single ksmbd:r5445 thread to explode.
Thanks for checking! We have also tried to reproduce it but haven't managed
to yet. Let's check whether an infinite loop can occur in smb_direct_read().

>
> David
>
>

