linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Aurélien Aptel" <aaptel@suse.com>
To: Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com>,
	linux-cifs@vger.kernel.org
Subject: Re: Many processes end up in uninterruptible sleep accessing cifs mounts
Date: Thu, 04 Jul 2019 13:22:21 +0200	[thread overview]
Message-ID: <8736jmxcwi.fsf@suse.com> (raw)
In-Reply-To: <1fc4f6d0-6cdc-69a5-4359-23484d6bdfc9@prodrive-technologies.com>

Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com> writes:
>> Are there any kernel oops/panic with stack traces and register dumps in
>> the log?
>> 
>> You can inspect the kernel stack trace of the hung processes (to see where
>> they are stuck) by printing the file /proc/<pid>/stack.
>
> These are the stacks of all processes that are D, most of them being df.
> I also attached the cifs Stats output below.

Ok thanks. What about Oops or BUG or panic in dmesg logs, did you see
any?

The individual stack dumps are pretty useful. Here is my theory:

> pid: 9505
> syscall: 4 0x56550a2ec470 0x7ffede42e9a0 0x7ffede42e9a0 0x83a 0x3 0x20 
> 0x7ffede42e8f8 0x7f7f8928f295
> [<0>] open_shroot+0x43/0x200 [cifs]
> [<0>] smb2_query_path_info+0x93/0x220 [cifs]

Almost all of the processes have the same stack trace. They are stuck at
open_shroot()+0x43 which is probably

    mutex_lock(&tcon->crfid.fid_mutex); 

Then there are only 2 other processes stuck somewhere in the same code path
(open_shroot) but deeper, meaning they have the locks that the other
processes are waiting for:


> pid: 22858
> syscall: 4 0x564b46285d10 0x7ffcea3f9a80 0x7ffcea3f9a80 0x83a 0x3 0x20 
> 0x7ffcea3f99d8 0x7f6cc78c7295
> [<0>] cifs_mark_open_files_invalid+0x54/0xa0 [cifs]
> [<0>] smb2_reconnect+0x2d6/0x4b0 [cifs]
> [<0>] smb2_plain_req_init+0x30/0x240 [cifs]
> [<0>] SMB2_open_init+0x6d/0x7c0 [cifs]
> [<0>] SMB2_open+0x150/0x520 [cifs]
> [<0>] open_shroot+0x12f/0x200 [cifs]
> [<0>] smb2_query_path_info+0x93/0x220 [cifs]
> [<0>] cifs_get_inode_info+0x580/0xb10 [cifs]
> [<0>] cifs_revalidate_dentry_attr+0xdc/0x3e0 [cifs]
> [<0>] cifs_getattr+0x5b/0x1b0 [cifs]
> [<0>] vfs_statx+0x89/0xe0
> [<0>] __do_sys_newstat+0x39/0x70
> [<0>] do_syscall_64+0x55/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<0>] 0xffffffffffffffff


> pid: 20027
> syscall: 4 0x55a3c7d767d0 0x7ffe51432ab0 0x7ffe51432ab0 0x83a 
> 0x55a3c7d75c40 0x20 0x7ffe51432a08 0x7f5f7c4e7295
> [<0>] cifs_mark_open_files_invalid+0x54/0xa0 [cifs]
> [<0>] smb2_reconnect+0x2d6/0x4b0 [cifs]
> [<0>] smb2_plain_req_init+0x30/0x240 [cifs]
> [<0>] SMB2_open_init+0x6d/0x7c0 [cifs]
> [<0>] SMB2_open+0x150/0x520 [cifs]
> [<0>] open_shroot+0x12f/0x200 [cifs]
> [<0>] smb2_query_path_info+0x93/0x220 [cifs]
> [<0>] cifs_get_inode_info+0x580/0xb10 [cifs]
> [<0>] cifs_revalidate_dentry_attr+0xdc/0x3e0 [cifs]
> [<0>] cifs_getattr+0x5b/0x1b0 [cifs]
> [<0>] vfs_statx+0x89/0xe0
> [<0>] __do_sys_newstat+0x39/0x70
> [<0>] do_syscall_64+0x55/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<0>] 0xffffffffffffffff

Due to timeouts maybe the Open request needs
to reconnect the server/ses/tcon and to do this it calls
cifs_mark_open_files_invalid() and gets stuck somewhere there.

        spin_lock(&tcon->open_file_lock);
        list_for_each_safe(tmp, tmp1, &tcon->openFileList) {
                open_file = list_entry(tmp, struct cifsFileInfo, tlist);
                open_file->invalidHandle = true;
                open_file->oplock_break_cancelled = true;
        }
        spin_unlock(&tcon->open_file_lock);

        mutex_lock(&tcon->crfid.fid_mutex); <=== most likely here
        tcon->crfid.is_valid = false;
        memset(tcon->crfid.fid, 0, sizeof(struct cifs_fid));
        mutex_unlock(&tcon->crfid.fid_mutex);

I think these processes are trying to lock the same lock twice: one in
open_shroot() and since Open ends up having to reconnect, once again in
mark_open_files_invalid(). I think it's the same lock because I don't
see why the tcon pointers would be different in those 2 spots. Kernel
mutexes are not reentrant so this is a deadlock.

Cheers,
-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
SUSE Linux GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg)

  parent reply	other threads:[~2019-07-04 11:22 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-03 18:12 Many processes end up in uninterruptible sleep accessing cifs mounts Martijn de Gouw
2019-07-03 19:17 ` Aurélien Aptel
2019-07-03 19:24   ` Steve French
     [not found]   ` <1fc4f6d0-6cdc-69a5-4359-23484d6bdfc9@prodrive-technologies.com>
2019-07-04 11:22     ` Aurélien Aptel [this message]
2019-07-04 13:36       ` Martijn de Gouw
2019-07-04 21:34         ` Pavel Shilovsky
2019-07-06 11:11           ` Martijn de Gouw
2019-07-06 13:05             ` Steve French
2019-07-06 16:40               ` Pavel Shilovsky
2019-07-09  8:48                 ` Martijn de Gouw
2019-07-15 23:40                   ` Pavel Shilovsky
2019-08-07  7:34                     ` Martijn de Gouw
2019-07-08  9:04               ` Martijn de Gouw
2019-07-08 11:06                 ` Aurélien Aptel
2019-07-08 13:18                   ` Martijn de Gouw
2019-07-04 21:08       ` Pavel Shilovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8736jmxcwi.fsf@suse.com \
    --to=aaptel@suse.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=martijn.de.gouw@prodrive-technologies.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).