linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Kamal Mostafa <kamal@canonical.com>, Jan Kara <jack@suse.cz>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Matthew Wilcox <matthew@wil.cx>,
	Randy Dunlap <rdunlap@xenotime.net>, Theodore Tso <tytso@mit.edu>,
	linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Surbhi Palande <csurbhi@gmail.com>,
	Valerie Aurora <val@vaaconsulting.com>,
	Christopher Chaltain <christopher.chaltain@canonical.com>,
	"Peter M. Petrakis" <peter.petrakis@canonical.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Surbhi Palande <surbhi.palande@canonical.com>
Subject: Re: [PATCH v2 1/7] Adding support to freeze and unfreeze a journal
Date: Tue, 10 Jan 2012 22:31:04 +0100	[thread overview]
Message-ID: <20120110213104.GI4516@quack.suse.cz> (raw)
In-Reply-To: <4F0C9D87.8010006@sandeen.net>

On Tue 10-01-12 14:20:23, Eric Sandeen wrote:
> On 12/8/11 12:04 PM, Kamal Mostafa wrote:
> > From: Surbhi Palande <surbhi.palande@canonical.com>
> > 
> > The journal should be frozen when a filesystem freezes. What this means is
> > that until the filesystem is thawed again, no new transactions should be
> > accepted by the journal. When the filesystem thaws, inturn it should thaw the
> > journal and this should allow the journal to resume accepting new
> > transactions. While the the filesystem has frozen the journal, the clients of
> > the journal on calling jbd2_journal_start() will sleep on a wait queue.
> > Thawing the journal will wake up the sleeping clients and journalling can
> > progress normally.
> > 
> > An example of the race condition that can happen without this patch is as
> > follows:
> > 
> > Say the filesystem is thawed when we begin. Let tx be the time at unit x
> > 
> > P1: Process doing an aio write
> > t1) ext4_file_write()
> >   t2) generic_file_aio_write()
> >     t3) __generic_file_aio_write()
> >       // filesystem is not frozen, so we do not block in the next check.
> >       t4) vfs_check_frozen()
> >       t5) generic_write_checks()
> > ----------------- Prempted------------------
> > 
> > P2: Process that does filesystem freeze
> > 
> > t6) freeze_super()
> >   t7) sync_filesystem()
> >   t8) sync_blockdev()
> >   t9) sb->s_op->freeze_fs() (= ext4_freeze)
> >     t10) jbd2_journal_lock_updates()
> >     t11) jbd2_journal_flush()
> >     // Need to unlock the journal before returning to user space.
> >     t12) jbd2_journal_unlock_updates()
> >     // Journal is unlocked and so we can start accepting new transactions now.
> > 
> > // freezing process completes execution. Page cache is now clean and should
> > // remain clean till the filesystem is frozen.
> > --------------------------------------------
> > 
> > P1: writing process gets the control back
> > t13) generic_file_buffered_write()
> >   t14) generic_perform_write()
> >     t15) a_ops->write_begin() (= ext4_write_begin)
> >       t16) ext4_journal_start()
> > 	// New handle is started. We do not block here! Write continues
> > 	// dirtying the page cache while the filesystem is frozen!
> 
> Hrm let me think through this a little more; we actually do:
> 
> t16) ext4_journal_start()
>   t17) ext4_journal_start_sb()
>     t18) handle = ext4_journal_current_handle();
>     t19) if (!handle) vfs_check_frozen()
>     t20) ... jbd2_journal_start()
  Ah, right. I forgot.

> So actually we *do* block new handles, but let *existing* ones
> continue (see commits 6b0310fbf087ad6e9e3b8392adca97cd77184084
> and be4f27d324e8ddd57cc0d4d604fe85ee0425cba9)
> 
> So your assertion that a new handle is started is incorrect
> in general, isn't it?  So then does the fix seem necessary?
> Or, at least, in the fashion below - maybe we need to just make
> sure all started handles complete before the unlock_updates?
> Or am I missing something...?
  Well, the problem with running operations and freezing is more
fundamental I believe. See my email
http://marc.info/?l=linux-fsdevel&m=132585911925796&w=2

So I believe we'll need some better exclusion mechanism already in VFS.

								Honza

> > BugLink: https://bugs.launchpad.net/bugs/897421
> > Signed-off-by: Surbhi Palande <surbhi.palande@canonical.com>
> > Acked-by: Jan Kara <jack@suse.cz>
> > Reviewed-by: Andreas Dilger <adilger.kernel@dilger.ca>
> > Cc: Kamal Mostafa <kamal@canonical.com>
> > Tested-by: Peter M. Petrakis <peter.petrakis@canonical.com>
> > Signed-off-by: Kamal Mostafa <kamal@canonical.com>
> > ---
> >  fs/jbd2/journal.c     |    1 +
> >  fs/jbd2/transaction.c |   42 ++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/jbd2.h  |    7 +++++++
> >  3 files changed, 50 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > index 0fa0123..f0170cc 100644
> > --- a/fs/jbd2/journal.c
> > +++ b/fs/jbd2/journal.c
> > @@ -894,6 +894,7 @@ static journal_t * journal_init_common (void)
> >  	init_waitqueue_head(&journal->j_wait_checkpoint);
> >  	init_waitqueue_head(&journal->j_wait_commit);
> >  	init_waitqueue_head(&journal->j_wait_updates);
> > +	init_waitqueue_head(&journal->j_wait_frozen);
> >  	mutex_init(&journal->j_barrier);
> >  	mutex_init(&journal->j_checkpoint_mutex);
> >  	spin_lock_init(&journal->j_revoke_lock);
> > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> > index a0e41a4..340ee35 100644
> > --- a/fs/jbd2/transaction.c
> > +++ b/fs/jbd2/transaction.c
> > @@ -173,6 +173,17 @@ repeat:
> >  				journal->j_barrier_count == 0);
> >  		goto repeat;
> >  	}
> > +	/* Don't let a new handle start when a journal is frozen.
> > +	 * jbd2_journal_freeze calls jbd2_journal_unlock_updates() only after
> > +	 * the j_flags indicate that the journal is frozen. So if the
> > +	 * j_barrier_count is 0, then check if this was made 0 by the freezing
> > +	 * process
> > +	 */
> > +	if (journal->j_flags & JBD2_FROZEN) {
> > +		read_unlock(&journal->j_state_lock);
> > +		wait_event(journal->j_wait_frozen, (journal->j_flags & JBD2_FROZEN));
> > +		goto repeat;
> > +	}
> >  
> >  	if (!journal->j_running_transaction) {
> >  		read_unlock(&journal->j_state_lock);
> > @@ -492,6 +503,37 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
> >  }
> >  EXPORT_SYMBOL(jbd2_journal_restart);
> >  
> > +int jbd2_journal_freeze(journal_t *journal)
> > +{
> > +	int error = 0;
> > +	/* Now we set up the journal barrier. */
> > +	jbd2_journal_lock_updates(journal);
> > +
> > +	/*
> > +	 * Don't clear the needs_recovery flag if we failed to flush
> > +	 * the journal.
> > +	 */
> > +	error = jbd2_journal_flush(journal);
> > +	if (error >= 0) {
> > +		write_lock(&journal->j_state_lock);
> > +		journal->j_flags |= JBD2_FROZEN;
> > +		write_unlock(&journal->j_state_lock);
> > +	}
> > +	jbd2_journal_unlock_updates(journal);
> > +	return error;
> > +}
> > +EXPORT_SYMBOL(jbd2_journal_freeze);
> > +
> > +void jbd2_journal_thaw(journal_t * journal)
> > +{
> > +	write_lock(&journal->j_state_lock);
> > +	journal->j_flags &= ~JBD2_FROZEN;
> > +	write_unlock(&journal->j_state_lock);
> > +	wake_up(&journal->j_wait_frozen);
> > +}
> > +EXPORT_SYMBOL(jbd2_journal_thaw);
> > +
> > +
> >  /**
> >   * void jbd2_journal_lock_updates () - establish a transaction barrier.
> >   * @journal:  Journal to establish a barrier on.
> > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> > index 2092ea2..bfa0752 100644
> > --- a/include/linux/jbd2.h
> > +++ b/include/linux/jbd2.h
> > @@ -658,6 +658,7 @@ jbd2_time_diff(unsigned long start, unsigned long end)
> >   * @j_wait_checkpoint:  Wait queue to trigger checkpointing
> >   * @j_wait_commit: Wait queue to trigger commit
> >   * @j_wait_updates: Wait queue to wait for updates to complete
> > + * @j_wait_frozen: Wait queue to wait for journal to thaw
> >   * @j_checkpoint_mutex: Mutex for locking against concurrent checkpoints
> >   * @j_head: Journal head - identifies the first unused block in the journal
> >   * @j_tail: Journal tail - identifies the oldest still-used block in the
> > @@ -775,6 +776,9 @@ struct journal_s
> >  	/* Wait queue to wait for updates to complete */
> >  	wait_queue_head_t	j_wait_updates;
> >  
> > +	/* Wait queue to wait for journal to thaw*/
> > +	wait_queue_head_t	j_wait_frozen;
> > +
> >  	/* Semaphore for locking against concurrent checkpoints */
> >  	struct mutex		j_checkpoint_mutex;
> >  
> > @@ -953,6 +957,7 @@ struct journal_s
> >  #define JBD2_ABORT_ON_SYNCDATA_ERR	0x040	/* Abort the journal on file
> >  						 * data write error in ordered
> >  						 * mode */
> > +#define JBD2_FROZEN	0x080 /* Journal thread frozen along with filesystem */
> >  
> >  /*
> >   * Function declarations for the journaling transaction and buffer
> > @@ -1060,6 +1065,8 @@ extern void	 jbd2_journal_invalidatepage(journal_t *,
> >  				struct page *, unsigned long);
> >  extern int	 jbd2_journal_try_to_free_buffers(journal_t *, struct page *, gfp_t);
> >  extern int	 jbd2_journal_stop(handle_t *);
> > +extern int	 jbd2_journal_freeze(journal_t *);
> > +extern void	 jbd2_journal_thaw(journal_t *);
> >  extern int	 jbd2_journal_flush (journal_t *);
> >  extern void	 jbd2_journal_lock_updates (journal_t *);
> >  extern void	 jbd2_journal_unlock_updates (journal_t *);
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2012-01-10 21:31 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-08 18:04 [PATCH v2 0/7] fix s_umount thaw/write and journal deadlock Kamal Mostafa
2011-12-08 18:04 ` [PATCH v2 1/7] Adding support to freeze and unfreeze a journal Kamal Mostafa
2012-01-10 20:20   ` Eric Sandeen
2012-01-10 21:31     ` Jan Kara [this message]
2012-01-10 21:55       ` Surbhi Palande
     [not found]       ` <CAMBkX3eVeKSmEzmYTe6Oe_D6kAMQTL5LYoi1-Axj7CcrM85Pow@mail.gmail.com>
2012-01-11  0:04         ` Jan Kara
2012-01-11  0:13           ` Surbhi Palande
2012-01-11  0:51             ` Jan Kara
2012-01-11  3:08       ` Eric Sandeen
2012-01-10 22:00     ` Surbhi Palande
2011-12-08 18:04 ` [PATCH v2 2/7] Freeze and thaw the journal on ext4 freeze Kamal Mostafa
2012-01-06  0:32   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 3/7] VFS: Fix s_umount thaw/write deadlock Kamal Mostafa
2012-01-06  1:50   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 4/7] VFS: Rename and refactor writeback_inodes_sb_if_idle Kamal Mostafa
2011-12-13  3:34   ` Miao Xie
2011-12-15  7:10     ` Miao Xie
2011-12-16 20:48     ` Kamal Mostafa
2012-01-06  0:33   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 5/7] VFS: Avoid read-write deadlock in try_to_writeback_inodes_sb Kamal Mostafa
2012-01-06  0:35   ` Jan Kara
2012-01-11 20:29     ` Kamal Mostafa
2012-01-12 15:53       ` Mikulas Patocka
2011-12-08 18:04 ` [PATCH v2 6/7] VFS: Document s_frozen state through freeze_super Kamal Mostafa
2012-01-06  0:36   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 7/7] Documentation: Correct s_umount state for freeze_fs/unfreeze_fs Kamal Mostafa
2012-01-06  0:36   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120110213104.GI4516@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=christopher.chaltain@canonical.com \
    --cc=csurbhi@gmail.com \
    --cc=kamal@canonical.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew@wil.cx \
    --cc=mpatocka@redhat.com \
    --cc=peter.petrakis@canonical.com \
    --cc=rdunlap@xenotime.net \
    --cc=sandeen@sandeen.net \
    --cc=surbhi.palande@canonical.com \
    --cc=tytso@mit.edu \
    --cc=val@vaaconsulting.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).