From: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
To: Jan Kara <jack@suse.cz>
Cc: Ted Ts'o <tytso@mit.edu>,
	Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	sandeen@redhat.com
Subject: Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock
Date: Wed, 06 Apr 2011 16:40:15 +0900
Message-ID: <4D9C18DF.90803@jp.fujitsu.com>
In-Reply-To: <20110406055708.GB23285@quack.suse.cz>

Hi.

(2011/04/06 14:57), Jan Kara wrote:
> On Wed 06-04-11 14:09:14, Toshiyuki Okajima wrote:
>> (2011/04/06 7:54), Jan Kara wrote:
>>> On Tue 05-04-11 19:25:44, Toshiyuki Okajima wrote:
>>>> (2011/03/31 21:03), Toshiyuki Okajima wrote:
>>>>> Hi, thanks for your reviewing.
>>>>>
>>>>> (2011/03/30 23:12), Jan Kara wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On Mon 28-03-11 17:06:28, Toshiyuki Okajima wrote:
>>>>>>> On Thu, 17 Feb 2011 11:45:52 +0100
>>>>>>> Jan Kara<jack@suse.cz>   wrote:
>>>>>>>> On Thu 17-02-11 12:50:51, Toshiyuki Okajima wrote:
>>>>>>>>> (2011/02/16 23:56), Jan Kara wrote:
>>>>>>>>>> On Wed 16-02-11 08:17:46, Toshiyuki Okajima wrote:
>>>>>>>>>>> On Tue, 15 Feb 2011 18:29:54 +0100
>>>>>>>>>>> Jan Kara<jack@suse.cz>   wrote:
>>>>>>>>>>>> On Tue 15-02-11 12:03:52, Ted Ts'o wrote:
>>>>>>>>>>>>> On Tue, Feb 15, 2011 at 05:06:30PM +0100, Jan Kara wrote:
>>>>> <SNIP>
>>>>>>> I have continued to examine the root cause of this problem in depth, and
>>>>>>> I have found it.
>>>>>>>
>>>>>>> It is that we can write to memory which is mmapped to a file. The memory
>>>>>>> then becomes "DIRTY", so the flusher thread (e.g. wb_do_writeback) tries to
>>>>>>> write it back.
>>>>>>>
>>>>>>> Therefore, the root cause of this hang is not only the ext4 component (with
>>>>>>> the delayed allocation feature) but also the writeback mechanism for mmap.
>>>>>>> If you use another filesystem, you can write something to the filesystem
>>>>>>> even though you have frozen it.
>>>>>
>>>>>> Well, you can write something only in the caches, not to the on disk
>>>>>> image. So it's not a problem as such.
>>>>> My reproducer uses the loopback device (/dev/loopX). Using it, I have confirmed that
>>>>> we can write not only to the caches but also to the loopback device. However,
>>>>> I have not yet confirmed that we can write to a real device (/dev/sdaX).
>>>>>
>>>>>>
>>>>>>> A sample problem is attached on this mail. Try to execute it then you can
>>>>>>> confirm that we can write some data to your filesystem while freezing the
>>>>>>> filesystem.
>>>>>>> (If you change FS variable in go.sh from ext3 to ext4 and you execute
>>>>>>> "fsfreeze -u mnt" manually on other prompt, you can also confirm this deadlock.)
>>>>>>>
>>>>>>> I think the best approach to fix this problem is to prevent users from writing
>>>>>>> to memory which is mapped to a file while the filesystem is frozen.
>>>>>>> However, it is very difficult to stop users from writing to memory which has
>>>>>>> already been mapped to the file.
>>>>>> It is actually possible. In case of ext4, you could add a check (+ wait)
>>>>>> in ext4_page_mkwrite() whether the filesystem is frozen or in the process
>>>>>> of being frozen and if so, wait for it to get unfrozen. The only tough
>>>>>> problem here might be the locking as ext4_page_mkwrite() is called with
>>>>>> mmap_sem held and I'm not sure we can take s_umount with mmap_sem held.
>>>>>> But you'd have to fix all filesystems (and all paths possibly creating
>>>>>> dirty data) in this way.
>>>>>>
>>>>>
>>>>>>> Therefore, I think the only practical way to resolve the mmap problem is to
>>>>>>> stop the writeback thread. This fix also solves the original problem
>>>>>>> (ext4 delayed write vs unfreeze).
>>>>>> Hmm, I had a look at the code again and think we could fix the issue
>>>>>> cleanly (i.e. all possible users of s_umount) as follows: The lock
>>>>>> ordering will be
>>>>>> s_umount ->   "fs frozen"
>>>>>> and there will be a new mutex s_freeze_mutex protecting changes of
>>>>>> s_frozen.
>>>>>>
>>>>>> freeze_bdev() already observes this lock ordering, it will only take
>>>>>> s_freeze_mutex for the changes of s_frozen values. The only other code
>>>>>> that is relevant for the lock ordering is thaw_super() (the freezing
>>>>>> process is not expected to reenter kernel for the frozen filesystem).
>>>>>> In thaw_super() we could take s_freeze_mutex, do all the thawing work,
>>>>>> set s_frozen, release s_freeze_mutex and put superblock reference.
>>>>>>
>>>>>
>>>>>> So something like the patch below - it seems to work for me, can you test
>>>>>> it please?
>>>>> I think your patch looks good, so, the original problem seems to be solved.
>>>>> OK, I will test your patch.
>>>>> This weekend I cannot test it. So, I will reply next week.
>>>> I have tested whether Mizuma-san's reproducer can cause the deadlock with your
>>>> patch applied. No problems occurred while the reproducer was running.
>>>>
>>>> I think your patch solves the original deadlock problem reported by
>>>> Mizuma-san.
>>>    Good. Thanks.
>>>
>>>>> Reported-by: Toshiyuki Okajima<toshi.okajima@jp.fujitsu.com>
>>>>> Signed-off-by: Jan Kara<jack@suse.cz>
>>>>> ---
>>>>> fs/super.c         |   40 ++++++++++++++++++++++++++++++++++------
>>>>> include/linux/fs.h |    1 +
>>>>> 2 files changed, 35 insertions(+), 6 deletions(-)
>>>>
>>
>>>> However, I think the write which causes the deadlock comes from mmapped dirty
>>>> pages. So, I guess we also need a fix in the mmap path for the frozen state.
>>>    Why? If you dirty a page, writeback thread can come and try to write it -
>>> which blocks - but now that does not matter...

>> I do not understand the code around the writeback thread very well...
>> Please tell me the concrete name of the function which blocks the writes.
>    It would block in ext4_da_writepages() function.
I understand that it blocks in the case of ext4 with delayed allocation.
(The original deadlock problem is exactly this case.)
But in the case of ext4 without delayed allocation, or of other filesystems,
which function blocks the write?
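
To make the question concrete, here is a toy userspace model in C (not kernel
code; every name in it is hypothetical). It assumes that a frozen filesystem
refuses new transactions, and that writeback needs a transaction only when the
page has no block allocated yet, as with delayed allocation. Under those
assumptions, a dirty page whose block is already mapped is written out even
though the filesystem is frozen:

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct page_model {
    bool dirty;        /* page was modified, e.g. through a mapping */
    bool block_mapped; /* an on-disk block is already allocated */
};

struct fs_model {
    bool frozen;       /* set once the freeze has completed */
    int  disk_writes;  /* number of pages that reached the "disk" */
};

enum wb_result { WB_CLEAN, WB_WRITTEN, WB_WOULD_BLOCK };

/* Assumption: a frozen filesystem refuses to start a new transaction. */
bool txn_try_start(const struct fs_model *fs)
{
    return !fs->frozen;
}

/* Write back one page. Only a page without an allocated block
 * (the delayed-allocation case) needs a transaction in this model. */
enum wb_result writeback_page(struct fs_model *fs, struct page_model *pg)
{
    if (!pg->dirty)
        return WB_CLEAN;
    if (!pg->block_mapped) {
        if (!txn_try_start(fs))
            return WB_WOULD_BLOCK;  /* real code would sleep here */
        pg->block_mapped = true;    /* block allocation succeeded */
    }
    pg->dirty = false;
    fs->disk_writes++;
    return WB_WRITTEN;
}
```

If these assumptions match the real code, nothing on the path without delayed
allocation would stop the page from reaching the disk.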

>
>> Mizuma-san's reproducer also writes data through a mapping of the file (mmap).
>> The original problem happens after the fsfreeze operation is done.
>> I understand the normal write operation (not mmap) can be blocked while
>> the filesystem is frozen. So, I guess we don't always block all write
>> operations while it is frozen.
>    Technically speaking, we block all the transaction starts which means we
> end up blocking all the writes from going to disk. But that does not mean
> we block all the writes from going to in-memory cache - as you properly
> note the mmap case is one of such exceptions.
Hm, I agree that we can allow writes to the in-memory cache but must not allow
writes to disk while the filesystem is frozen. My concern is that the mmap path
can still write to disk while it is frozen, because this deadlock happens after
the fsfreeze operation has completed...
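
A minimal userspace sketch of the mmap path in question follows. It is not
Mizuma-san's reproducer and it does not freeze anything; the function name and
the scratch file under /tmp are made up for illustration. It only shows that a
store through a MAP_SHARED mapping reaches the file without write(2) ever being
called, so a check placed only in the write path would not see it:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Store `len` bytes of `msg` into a scratch file through a shared
 * mapping, then read them back with pread(2). Returns 0 if the file
 * contents match, -1 on any error. write(2) is never called. */
int store_via_mmap(const char *msg, size_t len)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";  /* hypothetical scratch file */
    char buf[64];
    int ret = -1;

    if (len == 0 || len > sizeof(buf))
        return -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* file lives until fd is closed */

    if (ftruncate(fd, (off_t)len) == 0) {
        char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            memcpy(map, msg, len);  /* this store dirties the page */
            /* MS_SYNC asks the kernel to write the dirty page back now. */
            if (msync(map, len, MS_SYNC) == 0 &&
                pread(fd, buf, len, 0) == (ssize_t)len &&
                memcmp(buf, msg, len) == 0)
                ret = 0;
            munmap(map, len);
        }
    }
    close(fd);
    return ret;
}
```

Calling store_via_mmap("hello", 5) returns 0 on an ordinary, unfrozen
filesystem; what happens to the same store on a frozen one is the open point
here.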

Thanks,
Toshiyuki Okajima

