linux-fsdevel.vger.kernel.org archive mirror
From: Nigel Banks <Nigel_Banks@waters.com>
To: linux-fsdevel@vger.kernel.org
Subject: Re: Deadlock in fsnotify for
Date: Mon, 1 Oct 2018 12:09:50 +0100
Message-ID: <OF3A517E0C.437F223F-ON80258319.003D4EFF-80258319.003D5350@waters.com>
In-Reply-To: <OF184949B9.0B312CF1-ON80258319.003B7FFA-80258319.003BA9B9@LocalDomain>

Nigel Banks/Waters wrote on 10/01/2018 11:51:40 AM:

> From: Nigel Banks/Waters
> To: linux-fsdevel@vger.kernel.org
> Date: 10/01/2018 11:51 AM
> Subject: Re: Deadlock in fsnotify for
> 
> Sorry for the repeated messages, my work email client is a bit clunky.
> 
> Seems like the attachment couldn't be sent along to linux-fsdevel so
> I'm resending the email with a link to the kern.log
> 
> https://gist.github.com/nigelgbanks/a38143b6f16be14026637efc0c362d3a
> 
> From: Nigel Banks/Waters
> To: Jan Kara <jack@suse.cz>
> Cc: Amir Goldstein <amir73il@gmail.com>, linux-fsdevel@vger.kernel.org
> Date: 09/28/2018 02:31 PM
> Subject: Re: Deadlock in fsnotify for
> 
> Hello Again,
> 
> I've attached the kern.log as you instructed, please let me know if 
> there is any more information I can provide.
> 
> [attachment "kern.log" deleted by Nigel Banks/Waters] 
> 
> Cheers,
> 
> Nigel
> 
> From: Jan Kara <jack@suse.cz>
> To: Nigel Banks <Nigel_Banks@waters.com>
> Cc: jack@suse.cz, linux-fsdevel@vger.kernel.org, Amir Goldstein 
> <amir73il@gmail.com>
> Date: 09/27/2018 05:24 PM
> Subject: Re: Deadlock in fsnotify for
> 
> Hello,
> 
> [added to CC other relevant mails]
> 
> On Thu 27-09-18 16:44:53, Nigel Banks wrote:
> > Sorry to trouble you, but from looking through the git history of
> > linux/fs/notify you seem to be the best person to contact.
> > 
> > I've encountered a hard-to-reproduce situation on our CI servers in
> > which it becomes impossible to release any inotify file descriptors.
> > We're currently running Ubuntu 18.04 (Kernel 4.15) using ext4 fs, and
> > our code is running in docker containers (overlay2) if that makes a
> > difference.
> > 
> > Essentially we're running a number of concurrent tests which internally
> > use inotify to monitor some directories. This all works fine and they
> > clean up after themselves, but after several days there will be a
> > deadlock in the kernel code (sys stack below):
> > 
> > [<0>] flush_work+0x126/0x1e0
> > [<0>] flush_delayed_work+0x3f/0x50
> > [<0>] fsnotify_wait_marks_destroyed+0x15/0x20
> > [<0>] fsnotify_destroy_group+0x48/0xd0
> > [<0>] inotify_release+0x1e/0x50
> > [<0>] __fput+0xea/0x220
> > [<0>] ____fput+0xe/0x10
> > [<0>] task_work_run+0x9d/0xc0
> > [<0>] exit_to_usermode_loop+0xc0/0xd0
> > [<0>] do_syscall_64+0x115/0x130
> > [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > [<0>] 0xffffffffffffffff
> 
> Hum, I don't remember seeing any deadlock like this. When a system hangs
> like this, can you please do:
> 
> echo w >/proc/sysrq-trigger
> 
> and send me the output of 'dmesg' command after that. In that output we
> should see all hung tasks (including kernel threads) and their traces
> and hopefully it will tell us more.
> 
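For anyone following along, here is a rough userspace complement to the
sysrq dump requested above. It is only a sketch, not a replacement for
'echo w': it scans /proc for processes whose state field is D
(uninterruptible sleep) and prints their kernel stack where readable.
Assumptions: the function names are made up for illustration, only
thread-group leaders are scanned (not /proc/<pid>/task/*), and reading
/proc/<pid>/stack normally needs root.

```python
import os

def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the LAST ')' instead of splitting on whitespace.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable; skip it
        if state == "D":
            yield pid, comm

def kernel_stack(pid, proc="/proc"):
    """Return the kernel stack text of a task, or None if unreadable."""
    try:
        with open(os.path.join(proc, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        return None

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(f"{pid} ({comm}) is in uninterruptible sleep")
        print(kernel_stack(pid) or "  (stack unavailable; try running as root)")
```

A task that shows up here on every run, always with the same stack, is a
likely candidate for the hang described in this thread.
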
> > Once a process gets stuck in this uninterruptible sleep it will never
> > wake. At this point the system is still usable: we're able to create
> > more inotify instances and receive messages for them, but we are not
> > able to close any of them. So eventually we run out of handles and the
> > system becomes unstable, not to mention we can't run any more tests on
> > the machine at this point, and a reboot is required.
> 
> Yes, this is expected. It looks like some deadlock in the fsnotify
> subsystem.
> 
> > From my research, it looks like the lxc project has also encountered
> > this issue: https://github.com/lxc/lxc/issues/2456. Like them, we also
> > didn't experience this behaviour with our previous set-up, Ubuntu 16.04
> > (Kernel 14.04).
> > 
> > I had a look through the bug lists and through the commit history for
> > linux/fs/notify and could not find this issue listed anywhere.
> > 
> > I've attempted to write a small C program using pthreads and the
> > inotify sys-calls, but was unable to create a program that could
> > reproduce this issue.
> 
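To illustrate the kind of stress test described above, here is a minimal
sketch in Python (via ctypes) rather than the original C program, which
was not posted. Several threads repeatedly create an inotify instance,
add a watch on a shared directory, and close the descriptor again, which
exercises the release path seen in the trace (inotify_release). The names
churn and run_stress are hypothetical, and nothing here is known to
trigger the hang; the original C attempt did not either.

```python
import ctypes
import os
import tempfile
import threading

# Look up inotify symbols in the already-loaded C library (Linux only).
libc = ctypes.CDLL(None, use_errno=True)
libc.inotify_init1.argtypes = [ctypes.c_int]
libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]

IN_CREATE = 0x00000100  # values from <sys/inotify.h>
IN_DELETE = 0x00000200

def _check(ret, what):
    """Raise OSError with errno if a libc call returned a negative value."""
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, f"{what}: {os.strerror(err)}")
    return ret

def churn(directory, iterations, done):
    """Create, watch, and close an inotify instance `iterations` times."""
    path = os.fsencode(directory)
    for _ in range(iterations):
        fd = _check(libc.inotify_init1(0), "inotify_init1")
        try:
            _check(libc.inotify_add_watch(fd, path, IN_CREATE | IN_DELETE),
                   "inotify_add_watch")
        finally:
            os.close(fd)  # closing the fd is where the reported tasks block
    done.append(iterations)

def run_stress(threads=4, iterations=1000):
    """Run `threads` churn loops concurrently; return completed cycles."""
    done = []
    with tempfile.TemporaryDirectory() as directory:
        workers = [threading.Thread(target=churn, args=(directory, iterations, done))
                   for _ in range(threads)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
    return sum(done)

if __name__ == "__main__":
    print("completed cycles:", run_stress())
```

If a run never returns, the blocked threads should show up in state D.
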
> Thanks for the report.
> 
>                         Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR


