From: Jan Kara <jack@suse.cz>
To: Surbhi Palande <surbhi.palande@canonical.com>
Cc: Jan Kara <jack@suse.cz>,
	Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>,
	Ted Ts'o <tytso@mit.edu>,
	Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	sandeen@redhat.com
Subject: Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock
Date: Wed, 4 May 2011 21:19:12 +0200
Message-ID: <20110504191912.GB6968@quack.suse.cz>
In-Reply-To: <4DC14201.6090009@canonical.com>

On Wed 04-05-11 15:09:37, Surbhi Palande wrote:
> On 05/03/2011 06:19 PM, Jan Kara wrote:
> >On Tue 03-05-11 14:01:50, Surbhi Palande wrote:
> >>On 04/18/2011 12:05 PM, Toshiyuki Okajima wrote:
> >>>(2011/04/16 2:13), Jan Kara wrote:
> >>>>Hello,
> >>>>
> >>>>On Fri 15-04-11 22:39:07, Toshiyuki Okajima wrote:
> >>>>>>For ext3 or ext4 without delayed allocation we block inside writepage()
> >>>>>>function. But as I wrote to Dave Chinner, ->page_mkwrite() should
> >>>>>>probably
> >>>>>>get modified to block while minor-faulting the page on frozen fs
> >>>>>>because
> >>>>>>when blocks are already allocated we may skip starting a transaction
> >>>>>>and so
> >>>>>>we could possibly modify the filesystem.
> >>>>>OK. I think ->page_mkwrite() should also block writing the
> >>>>>minor-faulting pages.
> >>>>>
> >>>>>(minor-pagefault)
> >>>>>->  do_wp_page()
> >>>>>->  page_mkwrite(= ext4_page_mkwrite())
> >>>>>=>  BLOCK!
> >>>>>
> >>>>>(major-pagefault)
> >>>>>->  do_linear_fault()
> >>>>>->  page_mkwrite(= ext4_page_mkwrite())
> >>>>>=>  BLOCK!
> >>>>>
> >>>>>>
> >>>>>>>>>Mizuma-san's reproducer also writes the data which maps to the
> >>>>>>>>>file (mmap).
> >>>>>>>>>The original problem happens after the fsfreeze operation is done.
> >>>>>>>>>I understand the normal write operation (not mmap) can be blocked
> >>>>>>>>>while
> >>>>>>>>>fsfreezing. So, I guess we don't always block all the write
> >>>>>>>>>operation
> >>>>>>>>>while fsfreezing.
> >>>>>>>>Technically speaking, we block all the transaction starts which
> >>>>>>>>means we
> >>>>>>>>end up blocking all the writes from going to disk. But that does
> >>>>>>>>not mean
> >>>>>>>>we block all the writes from going to in-memory cache - as you
> >>>>>>>>properly
> >>>>>>>>note the mmap case is one of such exceptions.
> >>>>>>>Hm, I also think we can allow the writes to in-memory cache but we
> >>>>>>>can't allow
> >>>>>>>the writes to disk while fsfreezing. I am considering that mmap
> >>>>>>>path can
> >>>>>>>write to disk while fsfreezing because this deadlock problem
> >>>>>>>happens after
> >>>>>>>fsfreeze operation is done...
> >>>>>>I'm sorry I don't understand now - are you speaking about the case
> >>>>>>above
> >>>>>>when writepage() does not wait for filesystem being frozen or something
> >>>>>>else?
> >>>>>Sorry, I didn't understand the page fault path well.
> >>>>>So I read the kernel source code around it, and I think I
> >>>>>understand it now...
> >>>>>
> >>>>>I worry whether we can update the file data in mmap case while
> >>>>>fsfreezing.
> >>>>>Of course, I understand that we can write to in-memory cache, and it
> >>>>>is not a
> >>>>>problem. However, if we can write to disk while fsfreezing, it is a
> >>>>>problem.
> >>>>>So, I summarize the cases whether we can write to disk or not.
> >>>>>
> >>>>>--------------------------------------------------------------------------
> >>>>>
> >>>>>Cases (Whether we can write the data mmapped to the file on the disk
> >>>>>while fsfreezing)
> >>>>>
> >>>>>[1] One of the pages which has been mmapped is not bound. And
> >>>>>the page is not allocated yet. (major fault?)
> >>>>>
> >>>>>(1) user dirties a page
> >>>>>(2) a page fault occurs (do_page_fault)
> >>>>>(3) __do_fault is called.
> >>>>>(4) ext4_page_mkwrite is called
> >>>>>(5) ext4_write_begin is called
> >>>>>(6) ext4_journal_start_sb =>  We can STOP!
> >>>>>
> >>>>>[2] One of the pages which has been mmapped is not bound. But
> >>>>>the page is already allocated, and the buffer_heads of the page
> >>>>>are not mapped (BH_Mapped). (minor fault?)
> >>>>>
> >>>>>(1) user dirties a page
> >>>>>(2) a page fault occurs (do_page_fault)
> >>>>>(3) do_wp_page is called.
> >>>>>(4) ext4_page_mkwrite is called
> >>>>>(5) ext4_write_begin is called
> >>>>>(6) ext4_journal_start_sb =>  We can STOP!
> >>
> >>What happens in the case as follows:
> >>
> >>Task 1: Mmapped writes
> >>t1)ext4_page_mkwrite()
> >>   t2) ext4_write_begin() (FS is thawed so we proceed)
> >>   t3) ext4_write_end() (journal is stopped now)
> >>-----Pre-empted-----
> >>
> >>
> >>Task 2: Freeze Task
> >>t4) freezes the super block...
> >>...(continues)....
> >>tn) the page cache is clean and the F.S is frozen. Freeze has
> >>completed execution.
> >>
> >>Task 1: Mmapped writes
> >>tn+1) ext4_page_mkwrite() returns 0.
> >>tn+2) __do_fault() gets control, code gets executed.
> >>tn+3) __do_fault() marks the page dirty if the intent is to write to
> >>a file based page which faulted.
> >>
> >>So you end up dirtying the page cache when the F.S is frozen? No?
> >   You are right, ext4_page_mkwrite() as currently implemented has problems.
> >You have to return the page locked (and check for frozen fs with page lock
> >held) to avoid races.
> >
> >If you check for frozen fs with page lock held, you are guaranteed that
> >freezing code must wait for the page to get unlocked before proceeding. And
> >before the page is unlocked, it is marked dirty by the pagefault code which
> >makes freezing code write the page and writeprotect it again. So everything
> >will be safe.
> For the locked page to be a part of the freeze initiated sync,
> should its owner inode not be dirtied? The page fault handler
> dirties the page, but who ensures that the inode is dirtied at this
> point?
  Follow the path from set_page_dirty() -> __set_page_dirty_buffers()
-> __set_page_dirty() -> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
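
  The chain above can be modelled in user space roughly as follows. This
is a toy sketch, not kernel code: the struct layouts, the model_* function
names and the numeric value of I_DIRTY_PAGES are simplified stand-ins
assumed for illustration. Only the call order is taken from the path
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value only; the real definition lives in the kernel headers. */
#define I_DIRTY_PAGES 0x4u

/* Toy stand-ins that keep just the fields the chain touches. */
struct inode { unsigned int i_state; };
struct address_space { struct inode *host; };
struct page { struct address_space *mapping; bool dirty; };

/* Models __mark_inode_dirty(): record the dirty flags on the inode. */
static void model_mark_inode_dirty(struct inode *inode, unsigned int flags)
{
	inode->i_state |= flags;
}

/* Models __set_page_dirty(): propagate dirtiness to the owning inode. */
static void model_set_page_dirty_inner(struct address_space *mapping)
{
	assert(mapping->host != NULL);
	model_mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
}

/*
 * Models __set_page_dirty_buffers(): set the page dirty bit and, only if
 * the page was clean before and has a mapping, continue down the chain.
 */
static bool model_set_page_dirty_buffers(struct page *page)
{
	bool newly_dirty = !page->dirty;

	page->dirty = true;
	if (newly_dirty && page->mapping != NULL)
		model_set_page_dirty_inner(page->mapping);
	return newly_dirty;
}

/* Models set_page_dirty(): entry point used by the page fault code. */
static bool model_set_page_dirty(struct page *page)
{
	return model_set_page_dirty_buffers(page);
}
```

  Calling model_set_page_dirty() on a clean page with a mapping leaves
I_DIRTY_PAGES set in the owning inode's i_state, which is the property the
freeze-initiated sync relies on in this model.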

  More code reading would save you (and me) some typing ;).

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

