All of lore.kernel.org
 help / color / mirror / Atom feed
* ext4+quota patch series
@ 2009-11-23 18:13 Dmitry Monakhov
  2009-11-23 18:15 ` Eric Sandeen
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:13 UTC (permalink / raw)
  To: linux-ext4

Hello, I've prepared patch-set which fixes some issues
in quota related code. With this patch-set applied i've
got success pass from all quota tests I use.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ext4+quota patch series
  2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
@ 2009-11-23 18:15 ` Eric Sandeen
  2009-11-23 19:18   ` Dmitry Monakhov
  2009-11-23 18:30 ` [PATCH 1/4] ext4: delalloc quota fixes Dmitry Monakhov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Eric Sandeen @ 2009-11-23 18:15 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4

Dmitry Monakhov wrote:
> Hello, I've prepared patch-set which fixes some issues
> in quota related code. With this patch-set applied i've
> got success pass from all quota tests I use.
> --



Can you share those quota tests?  I'd love to put them into the xfstests
suite we've been using for ext4 as well.

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
  2009-11-23 18:15 ` Eric Sandeen
@ 2009-11-23 18:30 ` Dmitry Monakhov
  2009-11-23 22:43   ` Dmitry Monakhov
  2009-11-23 22:58   ` Dmitry Monakhov
  2009-11-23 18:32 ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:30 UTC (permalink / raw)
  To: linux-ext4

This patch fix most visible problems with delalloc+quota issues
- ext4_get_reserved_space() must return bytes instead of blocks.
- Claim allocated meta blocks. Do it as we do for data blocks
  but delay it until proper moment.
- Move space claiming to ext4_da_update_reserve_space()
  Only here we do know what we are dealing with data or metadata

The most useful test case for delalloc+quota is concurrent tasks
with write+truncate vs chown for a same file.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c   |   13 ++++++++-----
 fs/ext4/mballoc.c |    6 ------
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ab22297..84863e6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
 		EXT4_I(inode)->i_reserved_meta_blocks;
 	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
-	return total;
+	return (total << inode->i_blkbits);
 }
 /*
  * Calculate the number of metadata blocks need to reserve
@@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
 static void ext4_da_update_reserve_space(struct inode *inode, int used)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-	int total, mdb, mdb_free;
+	int total, mdb, mdb_free, mdb_claim = 0;
 
 	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	/* recalculate the number of metablocks still need to be reserved */
@@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 
 	if (mdb_free) {
 		/* Account for allocated meta_blocks */
-		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
+		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
+		BUG_ON(mdb_free < mdb_claim);
+		mdb_free -= mdb_claim;
 
 		/* update fs dirty blocks counter */
 		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
@@ -1119,8 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 
 	/* update per-inode reservations */
 	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
-	EXT4_I(inode)->i_reserved_data_blocks -= used;
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
+ 	EXT4_I(inode)->i_reserved_data_blocks -= used;
+	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
+	vfs_dq_claim_block(inode, used + mdb_claim);
 
 	/*
 	 * free those over-booking quota for metadata blocks
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bba1282..d4c52db 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
 		/* release all the reserved blocks if non delalloc */
 		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
-	else {
-		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
-						ac->ac_b_ex.fe_len);
-		/* convert reserved quota blocks to real quota blocks */
-		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
-	}
 
 	if (sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group = ext4_flex_group(sbi,
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/4] ext4: fix race chown vs truncate
  2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
  2009-11-23 18:15 ` Eric Sandeen
  2009-11-23 18:30 ` [PATCH 1/4] ext4: delalloc quota fixes Dmitry Monakhov
@ 2009-11-23 18:32 ` Dmitry Monakhov
  2009-11-23 18:42   ` Dmitry Monakhov
  2009-11-23 18:33 ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
  2009-11-23 18:34 ` [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer Dmitry Monakhov
  4 siblings, 1 reply; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:32 UTC (permalink / raw)
  To: linux-ext4

Currently all functions which call vfs_dq_release_reservation_block()
call it without i_block_reservation_lock. This result in
ext4_reservation vs quota_reservation inconsistency which provoke
incorrect reservation transfer and incorrect quota.
Task 1 (chown)                   Task 2 (truncate)
dquot_transfer
 ->down_write(dqptr_sem)	 ext4_da_release_spac
 ->dquot_get_reserved_space	  ->lock(i_block_reservation_lock)
    ->get_reserved_space	    /* decrement reservation */
      ->ext4_get_reserved_spac	  ->unlock(i_block_reservation_lock)
        lock(i_block_rsv_lock) ---- /* During this time window
	Read incorrect value	     * fs's reservation not equals
				     * to quota's */
				    ->vfs_dq_release_reservation_block()
In fact i_block_reservation_lock is held by ext4_da_reserve_space()
while calling vfs_dq_reserve_block(). This may result in deadlock:
because of different lock ordering:
ext4_da_reserve_space()             dquot_transfer()
lock(i_block_reservation_lock)	     down_write(dqptr_sem)
  down_write(dqptr_sem)		     lock(i_block_reservation_lock)

But this not happens only because both callers must have i_mutex so
serialization happens on i_mutex.

To prevent ext4_reservation vs dquot_reservation inconsistency, we have
to reorganize locking ordering like follows:
i_block_reservation_lock > dqptr_sem
This means what all functions which changes ext4 or quota reservation have
to hold i_block_reservation_lock.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 84863e6..c521c93 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1047,16 +1047,23 @@ cleanup:
 out:
 	return err;
 }
+/*
+ * Quota_transfer callback.
+ * During quota transfer we have to transfer rsv-blocks from one dquot to
+ * another. inode must be protected from concurrent reservation/reclamation.
+ * Locking ordering for all space reservation code:
+ *  i_block_reservation_lock > dqptr_sem
+ * This is differ from i_block,i_lock locking ordering, but this is the
+ * only possible way.
+ * NOTE: Caller must hold i_block_reservation_lock.
+ */
 
 qsize_t ext4_get_reserved_space(struct inode *inode)
 {
 	unsigned long long total;
 
-	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	total = EXT4_I(inode)->i_reserved_data_blocks +
 		EXT4_I(inode)->i_reserved_meta_blocks;
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
-
 	return (total << inode->i_blkbits);
 }
 /*
@@ -1131,6 +1138,8 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 	if (mdb_free)
 		vfs_dq_release_reservation_block(inode, mdb_free);
 
+	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
+
 	/*
 	 * If we have done all the pending block allocations and if
 	 * there aren't any writers on the inode, we can discard the
@@ -1866,8 +1875,8 @@ repeat:
 	}
 
 	if (ext4_claim_free_blocks(sbi, total)) {
-		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		vfs_dq_release_reservation_block(inode, total);
+		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
 			yield();
 			goto repeat;
@@ -1924,9 +1933,9 @@ static void ext4_da_release_space(struct inode *inode, int to_free)
 
 	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
 	EXT4_I(inode)->i_reserved_meta_blocks = mdb;
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
 	vfs_dq_release_reservation_block(inode, release);
+	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 }
 
 static void ext4_da_page_release_reservation(struct page *page,
@@ -5436,7 +5445,11 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			error = PTR_ERR(handle);
 			goto err_out;
 		}
+		/* i_block_reservation must being held in order to avoid races
+		 * with concurent block reservation. */
+		spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 		error = vfs_dq_transfer(inode, attr) ? -EDQUOT : 0;
+		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		if (error) {
 			ext4_journal_stop(handle);
 			return error;
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/4] ext4: quota macros cleanup
  2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
                   ` (2 preceding siblings ...)
  2009-11-23 18:32 ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
@ 2009-11-23 18:33 ` Dmitry Monakhov
  2009-11-23 18:34 ` [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer Dmitry Monakhov
  4 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:33 UTC (permalink / raw)
  To: linux-ext4

Currently all quota block reservation macros contains hard-coded "2"
aka MAXQUOTAS value. This is no good because in some places it is not
obvious to understand what does this digit represent. Let's introduce
new macro with self descriptive name.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/ext4_jbd2.h |    8 ++++++--
 fs/ext4/extents.c   |    2 +-
 fs/ext4/inode.c     |    2 +-
 fs/ext4/migrate.c   |    4 ++--
 fs/ext4/namei.c     |    8 ++++----
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index a286598..404ab6c 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -49,7 +49,7 @@
 
 #define EXT4_DATA_TRANS_BLOCKS(sb)	(EXT4_SINGLEDATA_TRANS_BLOCKS(sb) + \
 					 EXT4_XATTR_TRANS_BLOCKS - 2 + \
-					 2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					 EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /*
  * Define the number of metadata blocks we need to account to modify data.
@@ -57,7 +57,7 @@
  * This include super block, inode block, quota blocks and xattr blocks
  */
 #define EXT4_META_TRANS_BLOCKS(sb)	(EXT4_XATTR_TRANS_BLOCKS + \
-					2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /* Delete operations potentially hit one directory's namespace plus an
  * entire inode, plus arbitrary amounts of bitmap/indirection data.  Be
@@ -92,6 +92,7 @@
  * but inode, sb and group updates are done only once */
 #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_INIT_REWRITE) : 0)
+
 #define EXT4_QUOTA_DEL_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_DEL_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_DEL_REWRITE) : 0)
 #else
@@ -99,6 +100,9 @@
 #define EXT4_QUOTA_INIT_BLOCKS(sb) 0
 #define EXT4_QUOTA_DEL_BLOCKS(sb) 0
 #endif
+#define EXT4_MAXQUOTAS_TRANS_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_TRANS_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_INIT_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_INIT_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_DEL_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_DEL_BLOCKS(sb))
 
 int
 ext4_mark_iloc_dirty(handle_t *handle,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 715264b..f73c3ff 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2167,7 +2167,7 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 			correct_index = 1;
 			credits += (ext_depth(inode)) + 1;
 		}
-		credits += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+		credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 		err = ext4_ext_truncate_extend_restart(handle, inode, credits);
 		if (err)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c521c93..b166e90 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5439,7 +5439,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 
 		/* (user+group)*(old+new) structure, inode write (sb,
 		 * inode block, ? - but truncate inode update has it) */
-		handle = ext4_journal_start(inode, 2*(EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)+
+		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
 					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
 		if (IS_ERR(handle)) {
 			error = PTR_ERR(handle);
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index a93d5b8..8646149 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -238,7 +238,7 @@ static int extend_credit_for_blkdel(handle_t *handle, struct inode *inode)
 	 * So allocate a credit of 3. We may update
 	 * quota (user and group).
 	 */
-	needed = 3 + 2*EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+	needed = 3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 	if (ext4_journal_extend(handle, needed) != 0)
 		retval = ext4_journal_restart(handle, needed);
@@ -477,7 +477,7 @@ int ext4_ext_migrate(struct inode *inode)
 	handle = ext4_journal_start(inode,
 					EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2 * EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)
+					EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)
 					+ 1);
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index fde08c9..17a17e1 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1769,7 +1769,7 @@ static int ext4_create(struct inode *dir, struct dentry *dentry, int mode,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1803,7 +1803,7 @@ static int ext4_mknod(struct inode *dir, struct dentry *dentry,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1840,7 +1840,7 @@ static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -2253,7 +2253,7 @@ static int ext4_symlink(struct inode *dir,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 5 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer.
  2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
                   ` (3 preceding siblings ...)
  2009-11-23 18:33 ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
@ 2009-11-23 18:34 ` Dmitry Monakhov
  4 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:34 UTC (permalink / raw)
  To: linux-ext4

Inside ->setattr() call both ATTR_UID and ATTR_GID may be valid
This means that we may end-up with transferring all quotas. Add
we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
case of QUOTA_INIT_BLOCKS.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b166e90..428336a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5440,7 +5440,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 		/* (user+group)*(old+new) structure, inode write (sb,
 		 * inode block, ? - but truncate inode update has it) */
 		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
-					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
+					EXT4_MAXQUOTAS_DEL_BLOCKS(inode->i_sb))+3);
 		if (IS_ERR(handle)) {
 			error = PTR_ERR(handle);
 			goto err_out;
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] ext4: fix race chown vs truncate
  2009-11-23 18:32 ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
@ 2009-11-23 18:42   ` Dmitry Monakhov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 18:42 UTC (permalink / raw)
  To: linux-ext4

Dmitry Monakhov <dmonakhov@openvz.org> writes:
> To prevent ext4_reservation vs dquot_reservation inconsistency, we have
> to reorganize locking ordering like follows:
> i_block_reservation_lock > dqptr_sem
> This means what all functions which changes ext4 or quota reservation have
> to hold i_block_reservation_lock.
While investigating this issue i've considered other solution

* Introduce i_block analog for generic reserved space management:
We may introduce i_rsv_block field in generic inode, and protected
it by i_lock(similar to i_block).
Introduce inc/dec/set/get methods similar to inode_get_bytes,
inode_sub_bytes.. . 
This value is managed internally by quota code. Perform reservation
management inside dquot_reserve_space, dquot_release_reservation
without interfering with fs internals, as we do for i_blocks.
IMHO this is best way because:

1)We don't have to hold i_block_reservation_lock while quota-op
  which may lead to virtual performance penalty.
2)This brings to well defined VFS interface for reserved space management.

But I expect some problems from AlViro because only ext4 would use it by now.
And off course this may lead to ext4_rsv quot_rsv inconsistency
due to some bugs.
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  fs/ext4/inode.c |   23 ++++++++++++++++++-----
>  1 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 84863e6..c521c93 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1047,16 +1047,23 @@ cleanup:
>  out:
>  	return err;
>  }
> +/*
> + * Quota_transfer callback.
> + * During quota transfer we have to transfer rsv-blocks from one dquot to
> + * another. inode must be protected from concurrent reservation/reclamation.
> + * Locking ordering for all space reservation code:
> + *  i_block_reservation_lock > dqptr_sem
> + * This is differ from i_block,i_lock locking ordering, but this is the
> + * only possible way.
> + * NOTE: Caller must hold i_block_reservation_lock.
> + */
>  
>  qsize_t ext4_get_reserved_space(struct inode *inode)
>  {
>  	unsigned long long total;
>  
> -	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  	total = EXT4_I(inode)->i_reserved_data_blocks +
>  		EXT4_I(inode)->i_reserved_meta_blocks;
> -	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> -
>  	return (total << inode->i_blkbits);
>  }
>  /*
> @@ -1131,6 +1138,8 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  	if (mdb_free)
>  		vfs_dq_release_reservation_block(inode, mdb_free);
>  
> +	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> +
>  	/*
>  	 * If we have done all the pending block allocations and if
>  	 * there aren't any writers on the inode, we can discard the
> @@ -1866,8 +1875,8 @@ repeat:
>  	}
>  
>  	if (ext4_claim_free_blocks(sbi, total)) {
> -		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		vfs_dq_release_reservation_block(inode, total);
> +		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
>  			yield();
>  			goto repeat;
> @@ -1924,9 +1933,9 @@ static void ext4_da_release_space(struct inode *inode, int to_free)
>  
>  	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
>  	EXT4_I(inode)->i_reserved_meta_blocks = mdb;
> -	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  
>  	vfs_dq_release_reservation_block(inode, release);
> +	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  }
>  
>  static void ext4_da_page_release_reservation(struct page *page,
> @@ -5436,7 +5445,11 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  			error = PTR_ERR(handle);
>  			goto err_out;
>  		}
> +		/* i_block_reservation must being held in order to avoid races
> +		 * with concurent block reservation. */
> +		spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  		error = vfs_dq_transfer(inode, attr) ? -EDQUOT : 0;
> +		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		if (error) {
>  			ext4_journal_stop(handle);
>  			return error;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ext4+quota patch series
  2009-11-23 18:15 ` Eric Sandeen
@ 2009-11-23 19:18   ` Dmitry Monakhov
  2009-11-23 19:35     ` Eric Sandeen
  0 siblings, 1 reply; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 19:18 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

[-- Attachment #1: Type: TEXT/PLAIN, Size: 964 bytes --]

Eric Sandeen <sandeen@redhat.com> writes:
>Can you share those quota tests?  I'd love to put them into the xfstests
>suite we've been using for ext4 as well.

write-truncate-chown: test delalloc + quota_transfer
I've written crappy quotactl for quota manipulation
(which i use for ct-tree-quota development).Some times
it more useful. See files attached.

mkfs.ext4 /dev/sdb5 -b4096
mount /dev/sdb5 /mnt -ogrpquota,usrquota
quotacheck -cug /mnt
# sync is necessary, because files may have reserved some blocks
# which later lead to complain from claim_reserved_space
# Probably we have print *huge* warning if we found
# file with reserved blocks inodes traversing on quotaon 
sync;sync;sync
# turn on quota
./quotactl --all --on --device=/dev/sdb5 --path /mnt
# run test
./write-truncate-chown /mnt/ 9999999999&
# get quota report, print warn in case of incorrect quota.
./quotactl --all --get --type 0 --device=/dev/sdb5 || echo "failed"
# checkout dmesg
dmesg


[-- Attachment #2: quotactl.c --]
[-- Type: text/x-csrc, Size: 5219 bytes --]

/*
 * Disk quota reporting program.
 */
#include <sys/types.h>
#include <sys/param.h>
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>
#include <pwd.h>
#include <grp.h>
#include <time.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <error.h>
#include <linux/quota.h>
#include <linux/dqblk_v1.h>
#include <linux/dqblk_v2.h>

enum options {
	ID_FL  =  	0x1,
	HELP_FL = 	0x2,
	GET_FL = 	0x4,
	SET_FL = 	0x8,
	ALL_FL = 	0x10,
	ON_FL =		0x20,
	OFF_FL =	0x40,
	TYPE_FL =	0x80,
};
unsigned flags = 0;
char device[128];
char path[128];
int  id;
int type;
int warn = 0;
 void print_dqblk(int qid, struct if_dqblk *dqb)
 {
	 printf("%6d   %10lld  %10lld %10lld,    %10lld %10lld %10lld\n",

		 qid, dqb->dqb_curspace, dqb->dqb_bsoftlimit, dqb->dqb_bhardlimit,
		 dqb->dqb_curinodes, dqb->dqb_isoftlimit, dqb->dqb_ihardlimit);
	 if (((int64_t)dqb->dqb_curspace) < 0 ||
		 ((int64_t)dqb->dqb_bsoftlimit) < 0 ||
		 ((int64_t)dqb->dqb_bhardlimit) < 0 ||
		 ((int64_t)dqb->dqb_curinodes) < 0 ||
		 ((int64_t)dqb->dqb_isoftlimit) < 0 ||
		 ((int64_t)dqb->dqb_ihardlimit) < 0 ) {
		 printf ("WANR: Negative quota !!!");
		 warn = qid + 1;
	 }
 }
 int onoff_quota(int type, int on) {
	 int beg = type;
	 int end = type;
	 int i, ret;
	 char  *name[] = {"aquota.user", "aquota.group", "aquota.tree"};
	 char p[1024];
	 if (!(flags & TYPE_FL) && !(flags & ALL_FL)) {
		 printf("Err --set with out --all or --type opt\n");
		 exit(1);
	 }
	 if (flags & ALL_FL) {
		 beg = 0;
		 end = 2; /* MAXQUOTAS */
	 }
	 for (i = beg; i  <= end; i++) {
		 snprintf(p, sizeof(p), "%s/%s", path, name[i]);
		 ret = quotactl(QCMD(on ? Q_QUOTAON : Q_QUOTAOFF, i),
			 device, QFMT_VFS_V0, p);
		 if (ret)
			 perror("quotactl");
	 }
	 return ret;
 }



 int show_quota(int start, int end)
 {
	 struct if_dqblk dqb, sum_dqb;
	 int i,found = 0;
	 int ret;
	 memset(&sum_dqb, 0, sizeof(sum_dqb));
	 printf("    ID     curspace       soft        hard      curinodes      soft       hard\n");
	 for (i = start; i < end; i++) {
		 ret = quotactl(QCMD(Q_GETQUOTA, type),device, i, (char*)&dqb);
		 if (ret && errno == ESRCH)
			 continue;
		 if (!dqb.dqb_curspace && !dqb.dqb_bsoftlimit &&
			 !dqb.dqb_bhardlimit && !dqb.dqb_curinodes &&
			 !dqb.dqb_isoftlimit && !dqb.dqb_ihardlimit)
			 continue;
		 if (ret) {
			 perror ("quotactl");
			 return 1;
		 }
		 print_dqblk(i, &dqb);
		 found++;
		 sum_dqb.dqb_curspace += dqb.dqb_curspace;
		 sum_dqb.dqb_curinodes += dqb.dqb_curinodes;
	 }
	 printf("--------------------------------------------------------------------------------\n");
	 print_dqblk(found, &sum_dqb);
	 if (warn)
		 printf("WARN!!  bad quota found id:%d \n", warn - 1);
	 return warn;
 }
 int main(int argc, char **argv)
 {
	 unsigned treeid = -1;
	 gid_t gidset[NGROUPS], *gidsetp;
	 int i, ret;
	 int ngroups = 0;

	 struct if_dqblk dqb;
	 struct option long_opts[] = {
		 { "help", 0, NULL, 'H' },
		 { "path", 1, NULL, 'p'},
		 { "get", 0, NULL, 'G' },
		 { "set", 0, NULL, 'S' },
		 { "on", 0, NULL, 'O'},
		 { "off", 0, NULL, 'o'},
		{ "device", 1, NULL, 'D'},
		{ "bsoft", 1, NULL, 'b'},
		{ "bhard",1, NULL, 'B'},
		{ "curspace",1, NULL, 'c'},
		{ "curinodes",1, NULL, 'C'},
		{ "isoft",1, NULL, 'i'},
		{ "ihard",2, NULL, 'I'},
		{ "type",1, NULL, 'T'},
		{ "all",0, NULL, 'a'},
		{ "id",1, NULL, 'd'},
		{ NULL, 0, NULL, 0 }
	};
	while ((ret = getopt_long(argc, argv, "HGSp:D:T:i:b:B:i:I:c:C:aOo", long_opts, NULL)) != -1) {
		switch (ret) {
		case 'H':
			  flags |= HELP_FL;
			  break;
		case 'G':
			  flags |= GET_FL;
			  break;
		case 'S':
			  flags |= SET_FL;
			  break;
		case 'O':
			  flags |= ON_FL;
			  break;
		case 'o':
			flags |= OFF_FL;
			break;
		case 'p':
			strcpy(path, optarg);
			break;

		case 'a':
			  flags |= ALL_FL;
			  break;
		case 'D' :
			strcpy(device ,optarg);
			break;
		case 'T' :
			flags |= TYPE_FL;
			type = atoi(optarg);
			break;
		case 'd' :
			id = atoi(optarg);
			flags |= ID_FL;
			break;
		case 'b' :
			dqb.dqb_bsoftlimit= atol(optarg);
			dqb.dqb_valid |= QIF_BLIMITS;
			break;
		case 'B' :
			dqb.dqb_bhardlimit= atol(optarg);
			dqb.dqb_valid |= QIF_BLIMITS;
			break;
		case 'i' :
			dqb.dqb_isoftlimit= atol(optarg);
			dqb.dqb_valid |= QIF_ILIMITS;
			break;
		case 'I' :
			dqb.dqb_ihardlimit= atol(optarg);
			dqb.dqb_valid |= QIF_ILIMITS;
			break;
		case 'c' :
			dqb.dqb_curspace= atol(optarg);
			dqb.dqb_valid |= QIF_SPACE;
			break;
		case 'C' :
			dqb.dqb_curinodes= atol(optarg);
			dqb.dqb_valid |= QIF_INODES;
			break;

		default:
			printf("Unknown opt:%c\n", ret);
			exit(1);
		}
	}
	argc -= optind;
	argv += optind;
	if (flags & (ON_FL |OFF_FL))
		return onoff_quota(type, flags & ON_FL);

	if (flags & SET_FL) {
		if ((flags & (ID_FL| TYPE_FL)) != (ID_FL| TYPE_FL)) {
			printf("Err --set with out --id opt\n");
			exit(1);
		}
		if (!(dqb.dqb_valid & QIF_ALL)) {
			printf("Err --set with bhard,bsoft,isoft,ihard "
				 "curspace, curinodes opt\n");
			exit(1);
		}
		ret = quotactl(QCMD(Q_SETQUOTA, type),device, id, (char*)&dqb);
		if (ret) {
			perror("quotactl");
			return 1;
		}

	}
	if (flags & GET_FL)
		if (flags & ALL_FL)
			return show_quota(0, 65535);
		else
			return show_quota(id, id + 1);
	return 0;
}

[-- Attachment #3: write-truncate-chown.c --]
[-- Type: text/x-csrc, Size: 1340 bytes --]

/*
 *  Write-truncate-chown testcase
 */
#include <sys/types.h>
#include <limits.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include "time.h"

long long time_diff(struct timeval *tv1, struct timeval *tv2)
{
	long long diff = 0;
	long long sec = 0;
	diff = tv2->tv_usec - tv1->tv_usec;
	sec += (tv2->tv_sec - tv1->tv_sec);
	sec *= 1000000;
	diff += sec;
	return diff;
}

int main(int argc, char **argv)

{
	int num, i;
	int ret = 0;
	int fd;
	char name[] = "test-write-truncate-chown";
	uid_t uid, gid;
	char buf[1024*1024];
	struct timeval tv[2];
	if (argc  <  2) {
		printf("usage %s <path> <num_iter>\n", argv[0]);
		return 1;
	}
	chdir(argv[1]);

	if (argc == 2)
		num = INT_MAX;
	else
		num = atoi(argv[2]);
	fd = open(name, O_CREAT|O_RDWR, 0777);
	uid = geteuid();
	uid = getegid();
	if (fd < 0)
		goto out;

	gettimeofday(tv, NULL);
	for (i = 0; i < num; i+= 2) {
		// Trigger quota transfer
		ret |= fchown(fd, i % 65538, i % 65538);
		// Trigger pages invalidate
		ftruncate(fd, 0);
		// Trigger quota transfer
		ret |= fchown(fd, (i + 1) % 65538, (i+1) % 65538);
		// Trigger allocation + space reservation
		pwrite(fd, buf, sizeof(buf));
	}
	gettimeofday(tv+1, NULL);
	ret |= unlink (name);
	printf("%lld\n", time_diff(tv, tv+1));
	if (ret) {
out:
		printf("WARN error happend during test\n");
	}
	return ret;
}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ext4+quota patch series
  2009-11-23 19:18   ` Dmitry Monakhov
@ 2009-11-23 19:35     ` Eric Sandeen
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Sandeen @ 2009-11-23 19:35 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4

Dmitry Monakhov wrote:
> Eric Sandeen <sandeen@redhat.com> writes:
>> Can you share those quota tests?  I'd love to put them into the xfstests
>> suite we've been using for ext4 as well.
> 
> write-truncate-chown: test delalloc + quota_transfer
> I've written crappy quotactl for quota manipulation
> (which i use for ct-tree-quota development).Some times
> it more useful. See files attached.

Great, thanks!

-Eric


> mkfs.ext4 /dev/sdb5 -b4096
> mount /dev/sdb5 /mnt -ogrpquota,usrquota
> quotacheck -cug /mnt
> # sync is necessary, because files may have reserved some blocks
> # which later lead to complain from claim_reserved_space
> # Probably we have print *huge* warning if we found
> # file with reserved blocks inodes traversing on quotaon 
> sync;sync;sync
> # turn on quota
> ./quotactl --all --on --device=/dev/sdb5 --path /mnt
> # run test
> ./write-truncate-chown /mnt/ 9999999999&
> # get quota report, print warn in case of incorrect quota.
> ./quotactl --all --get --type 0 --device=/dev/sdb5 || echo "failed"
> # checkout dmesg
> dmesg
> 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-23 18:30 ` [PATCH 1/4] ext4: delalloc quota fixes Dmitry Monakhov
@ 2009-11-23 22:43   ` Dmitry Monakhov
  2009-11-23 22:58   ` Dmitry Monakhov
  1 sibling, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 22:43 UTC (permalink / raw)
  To: linux-ext4

Dmitry Monakhov <dmonakhov@openvz.org> writes:
> @@ -1119,8 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  
>  	/* update per-inode reservations */
>  	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
> -	EXT4_I(inode)->i_reserved_data_blocks -= used;
> -	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
^^^^^^^^^^ OOps... i've just deleted spin_unlock, but it supposed to
           be moved down. And appear again in second patch ;(
           Seem what i've mistyped during patch splitting. Please ignore
           this version, i'll send new version in a minute. 

> + 	EXT4_I(inode)->i_reserved_data_blocks -= used;
> +	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
> +	vfs_dq_claim_block(inode, used + mdb_claim);
>  
>  	/*
>  	 * free those over-booking quota for metadata blocks
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index bba1282..d4c52db 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>  	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>  		/* release all the reserved blocks if non delalloc */
>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
> -	else {
> -		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
> -						ac->ac_b_ex.fe_len);
> -		/* convert reserved quota blocks to real quota blocks */
> -		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
> -	}
>  
>  	if (sbi->s_log_groups_per_flex) {
>  		ext4_group_t flex_group = ext4_flex_group(sbi,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-23 18:30 ` [PATCH 1/4] ext4: delalloc quota fixes Dmitry Monakhov
  2009-11-23 22:43   ` Dmitry Monakhov
@ 2009-11-23 22:58   ` Dmitry Monakhov
  2009-11-23 22:58     ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
                       ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 22:58 UTC (permalink / raw)
  To: linux-ext4; +Cc: Dmitry Monakhov

This patch fix most visible problems with delalloc+quota issues
- ext4_get_reserved_space() must return bytes instead of blocks.
- Claim allocated meta blocks. Do it as we do for data blocks
  but delay it untill proper moment.
- Move space claiming to ext4_da_update_reserve_space()
  Only here we do know what we are dealing with data or metadata

The most useful test case for delalloc+quota is concurent tasks
with write+truncate vs chown for a same file.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c   |   12 ++++++++----
 fs/ext4/mballoc.c |    6 ------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ab22297..e642cdb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
 		EXT4_I(inode)->i_reserved_meta_blocks;
 	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
-	return total;
+	return (total << inode->i_blkbits);
 }
 /*
  * Calculate the number of metadata blocks need to reserve
@@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
 static void ext4_da_update_reserve_space(struct inode *inode, int used)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-	int total, mdb, mdb_free;
+	int total, mdb, mdb_free, mdb_claim = 0;
 
 	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	/* recalculate the number of metablocks still need to be reserved */
@@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 
 	if (mdb_free) {
 		/* Account for allocated meta_blocks */
-		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
+		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
+		BUG_ON(mdb_free < mdb_claim);
+		mdb_free -= mdb_claim;
 
 		/* update fs dirty blocks counter */
 		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
@@ -1119,7 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 
 	/* update per-inode reservations */
 	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
-	EXT4_I(inode)->i_reserved_data_blocks -= used;
+ 	EXT4_I(inode)->i_reserved_data_blocks -= used;
+	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
+	vfs_dq_claim_block(inode, used + mdb_claim);
 	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
 	/*
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bba1282..d4c52db 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
 		/* release all the reserved blocks if non delalloc */
 		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
-	else {
-		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
-						ac->ac_b_ex.fe_len);
-		/* convert reserved quota blocks to real quota blocks */
-		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
-	}
 
 	if (sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group = ext4_flex_group(sbi,
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/4] ext4: fix race chown vs truncate
  2009-11-23 22:58   ` Dmitry Monakhov
@ 2009-11-23 22:58     ` Dmitry Monakhov
  2009-11-23 22:58       ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
  2009-11-24 15:24     ` [PATCH 1/4] ext4: delalloc quota fixes Eric Sandeen
  2009-12-08  0:00     ` Mingming
  2 siblings, 1 reply; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 22:58 UTC (permalink / raw)
  To: linux-ext4; +Cc: Dmitry Monakhov

Currently all functions which call vfs_dq_release_reservation_block()
call it without i_block_reservation_lock. This result in
ext4_reservation vs quota_reservation inconsistency which provoke
incorrect reservation transfer and incorrect quota.
Task 1 (chown)                   Task 2 (truncate)
dquot_transfer
 ->down_write(dqptr_sem)	 ext4_da_release_spac
 ->dquot_get_reserved_space	  ->lock(i_block_reservation_lock)
    ->get_reserved_space	    /* decrement reservation */
      ->ext4_get_reserved_spac	  ->unlock(i_block_reservation_lock)
        lock(i_block_rsv_lock) ---- /* During this time window
	Read incorrect value	     * fs's reservation not equals
				     * to quota's */
				    ->vfs_dq_release_reservation_block()
In fact i_block_reservation_lock is held by ext4_da_reserve_space()
while calling vfs_dq_reserve_block(). This may result in deadlock:
because of different lock ordering:
ext4_da_reserve_space()             dquot_transfer()
lock(i_block_reservation_lock)	     down_write(dqptr_sem)
  down_write(dqptr_sem)		     lock(i_block_reservation_lock)

But this not happens only because both callers must have i_mutex so
serialization happens on i_mutex.

To prevent ext4_reservation vs dquot_reservation inconsistency, we have
to reorganize locking ordering like follows:
i_block_reservation_lock > dqptr_sem
This means what all functions which changes ext4 or quota reservation have
to hold i_block_reservation_lock.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c |   23 +++++++++++++++++------
 1 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e642cdb..979362d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1047,16 +1047,23 @@ cleanup:
 out:
 	return err;
 }
+/*
+ * Quota_transfer callback.
+ * During quota transfer we have to transfer rsv-blocks from one dquot to
+ * another. inode must be protected from concurrent reservation/reclamation.
+ * Locking ordering for all space reservation code:
+ *  i_block_reservation_lock > dqptr_sem
+ * This is differ from i_block,i_lock locking ordering, but this is the
+ * only possible way.
+ * NOTE: Caller must hold i_block_reservation_lock.
+ */
 
 qsize_t ext4_get_reserved_space(struct inode *inode)
 {
 	unsigned long long total;
 
-	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	total = EXT4_I(inode)->i_reserved_data_blocks +
 		EXT4_I(inode)->i_reserved_meta_blocks;
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
-
 	return (total << inode->i_blkbits);
 }
 /*
@@ -1124,13 +1131,13 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
  	EXT4_I(inode)->i_reserved_data_blocks -= used;
 	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
 	vfs_dq_claim_block(inode, used + mdb_claim);
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
 	/*
 	 * free those over-booking quota for metadata blocks
 	 */
 	if (mdb_free)
 		vfs_dq_release_reservation_block(inode, mdb_free);
+	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
 	/*
 	 * If we have done all the pending block allocations and if
@@ -1867,8 +1874,8 @@ repeat:
 	}
 
 	if (ext4_claim_free_blocks(sbi, total)) {
-		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		vfs_dq_release_reservation_block(inode, total);
+		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
 			yield();
 			goto repeat;
@@ -1925,9 +1932,9 @@ static void ext4_da_release_space(struct inode *inode, int to_free)
 
 	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
 	EXT4_I(inode)->i_reserved_meta_blocks = mdb;
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 
 	vfs_dq_release_reservation_block(inode, release);
+	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 }
 
 static void ext4_da_page_release_reservation(struct page *page,
@@ -5437,7 +5444,11 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			error = PTR_ERR(handle);
 			goto err_out;
 		}
+		/* i_block_reservation must being held in order to avoid races
+		 * with concurent block reservation. */
+		spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 		error = vfs_dq_transfer(inode, attr) ? -EDQUOT : 0;
+		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 		if (error) {
 			ext4_journal_stop(handle);
 			return error;
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/4] ext4: quota macros cleanup
  2009-11-23 22:58     ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
@ 2009-11-23 22:58       ` Dmitry Monakhov
  2009-11-23 22:58         ` [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer Dmitry Monakhov
  0 siblings, 1 reply; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 22:58 UTC (permalink / raw)
  To: linux-ext4; +Cc: Dmitry Monakhov

Currently all quota block reservation macros contains hard-coded "2"
aka MAXQUOTAS value. This is no good because in some places it is not
obvious to understand what does this digit represent. Let's introduce
new macro with self descriptive name.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/ext4_jbd2.h |    8 ++++++--
 fs/ext4/extents.c   |    2 +-
 fs/ext4/inode.c     |    2 +-
 fs/ext4/migrate.c   |    4 ++--
 fs/ext4/namei.c     |    8 ++++----
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index a286598..404ab6c 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -49,7 +49,7 @@
 
 #define EXT4_DATA_TRANS_BLOCKS(sb)	(EXT4_SINGLEDATA_TRANS_BLOCKS(sb) + \
 					 EXT4_XATTR_TRANS_BLOCKS - 2 + \
-					 2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					 EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /*
  * Define the number of metadata blocks we need to account to modify data.
@@ -57,7 +57,7 @@
  * This include super block, inode block, quota blocks and xattr blocks
  */
 #define EXT4_META_TRANS_BLOCKS(sb)	(EXT4_XATTR_TRANS_BLOCKS + \
-					2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /* Delete operations potentially hit one directory's namespace plus an
  * entire inode, plus arbitrary amounts of bitmap/indirection data.  Be
@@ -92,6 +92,7 @@
  * but inode, sb and group updates are done only once */
 #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_INIT_REWRITE) : 0)
+
 #define EXT4_QUOTA_DEL_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_DEL_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_DEL_REWRITE) : 0)
 #else
@@ -99,6 +100,9 @@
 #define EXT4_QUOTA_INIT_BLOCKS(sb) 0
 #define EXT4_QUOTA_DEL_BLOCKS(sb) 0
 #endif
+#define EXT4_MAXQUOTAS_TRANS_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_TRANS_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_INIT_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_INIT_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_DEL_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_DEL_BLOCKS(sb))
 
 int
 ext4_mark_iloc_dirty(handle_t *handle,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 715264b..f73c3ff 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2167,7 +2167,7 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 			correct_index = 1;
 			credits += (ext_depth(inode)) + 1;
 		}
-		credits += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+		credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 		err = ext4_ext_truncate_extend_restart(handle, inode, credits);
 		if (err)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 979362d..211722b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5438,7 +5438,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 
 		/* (user+group)*(old+new) structure, inode write (sb,
 		 * inode block, ? - but truncate inode update has it) */
-		handle = ext4_journal_start(inode, 2*(EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)+
+		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
 					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
 		if (IS_ERR(handle)) {
 			error = PTR_ERR(handle);
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index a93d5b8..8646149 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -238,7 +238,7 @@ static int extend_credit_for_blkdel(handle_t *handle, struct inode *inode)
 	 * So allocate a credit of 3. We may update
 	 * quota (user and group).
 	 */
-	needed = 3 + 2*EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+	needed = 3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 	if (ext4_journal_extend(handle, needed) != 0)
 		retval = ext4_journal_restart(handle, needed);
@@ -477,7 +477,7 @@ int ext4_ext_migrate(struct inode *inode)
 	handle = ext4_journal_start(inode,
 					EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2 * EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)
+					EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)
 					+ 1);
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index fde08c9..17a17e1 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1769,7 +1769,7 @@ static int ext4_create(struct inode *dir, struct dentry *dentry, int mode,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1803,7 +1803,7 @@ static int ext4_mknod(struct inode *dir, struct dentry *dentry,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1840,7 +1840,7 @@ static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -2253,7 +2253,7 @@ static int ext4_symlink(struct inode *dir,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 5 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer.
  2009-11-23 22:58       ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
@ 2009-11-23 22:58         ` Dmitry Monakhov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-23 22:58 UTC (permalink / raw)
  To: linux-ext4; +Cc: Dmitry Monakhov

Inside ->setattr() call both ATTR_UID and ATTR_GID may be valid
This means that we may end-up with transferring all quotas. Add
we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
case of QUOTA_INIT_BLOCKS.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/inode.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 211722b..d42e954 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5439,7 +5439,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 		/* (user+group)*(old+new) structure, inode write (sb,
 		 * inode block, ? - but truncate inode update has it) */
 		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
-					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
+					EXT4_MAXQUOTAS_DEL_BLOCKS(inode->i_sb))+3);
 		if (IS_ERR(handle)) {
 			error = PTR_ERR(handle);
 			goto err_out;
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-23 22:58   ` Dmitry Monakhov
  2009-11-23 22:58     ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
@ 2009-11-24 15:24     ` Eric Sandeen
  2009-11-24 19:38       ` Dmitry Monakhov
  2009-12-08  0:00     ` Mingming
  2 siblings, 1 reply; 20+ messages in thread
From: Eric Sandeen @ 2009-11-24 15:24 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4

Dmitry Monakhov wrote:
> This patch fix most visible problems with delalloc+quota issues
> - ext4_get_reserved_space() must return bytes instead of blocks.
> - Claim allocated meta blocks. Do it as we do for data blocks
>   but delay it untill proper moment.
> - Move space claiming to ext4_da_update_reserve_space()
>   Only here we do know what we are dealing with data or metadata
> 
> The most useful test case for delalloc+quota is concurent tasks
> with write+truncate vs chown for a same file.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  fs/ext4/inode.c   |   12 ++++++++----
>  fs/ext4/mballoc.c |    6 ------
>  2 files changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ab22297..e642cdb 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
>  		EXT4_I(inode)->i_reserved_meta_blocks;
>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  
> -	return total;
> +	return (total << inode->i_blkbits);
>  }

Ow, whoops, yes this looks like a bug!  (maybe should be in its own patch?)

>  /*
>   * Calculate the number of metadata blocks need to reserve
> @@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
>  static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> -	int total, mdb, mdb_free;
> +	int total, mdb, mdb_free, mdb_claim = 0;
>  
>  	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  	/* recalculate the number of metablocks still need to be reserved */
> @@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  
>  	if (mdb_free) {
>  		/* Account for allocated meta_blocks */
> -		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
> +		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
> +		BUG_ON(mdb_free < mdb_claim);

Did you see this happening in testing?

> +		mdb_free -= mdb_claim;
>  
>  		/* update fs dirty blocks counter */
>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
> @@ -1119,7 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  
>  	/* update per-inode reservations */
>  	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
> -	EXT4_I(inode)->i_reserved_data_blocks -= used;
> + 	EXT4_I(inode)->i_reserved_data_blocks -= used;

looks like a little whitespace damage here

> +	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
> +	vfs_dq_claim_block(inode, used + mdb_claim);
>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  
>  	/*
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index bba1282..d4c52db 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>  	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>  		/* release all the reserved blocks if non delalloc */
>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
> -	else {
> -		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
> -						ac->ac_b_ex.fe_len);
> -		/* convert reserved quota blocks to real quota blocks */
> -		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
> -	}

Maybe Mingming can review this change better, I don't have all the quota paths
in my head yet ...

-Eric

>  	if (sbi->s_log_groups_per_flex) {
>  		ext4_group_t flex_group = ext4_flex_group(sbi,


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-24 15:24     ` [PATCH 1/4] ext4: delalloc quota fixes Eric Sandeen
@ 2009-11-24 19:38       ` Dmitry Monakhov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-24 19:38 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

Eric Sandeen <sandeen@redhat.com> writes:

> Dmitry Monakhov wrote:
>> This patch fix most visible problems with delalloc+quota issues
>> - ext4_get_reserved_space() must return bytes instead of blocks.
>> - Claim allocated meta blocks. Do it as we do for data blocks
>>   but delay it untill proper moment.
>> - Move space claiming to ext4_da_update_reserve_space()
>>   Only here we do know what we are dealing with data or metadata
>> 
>> The most useful test case for delalloc+quota is concurent tasks
>> with write+truncate vs chown for a same file.
>> 
>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>> ---
>>  fs/ext4/inode.c   |   12 ++++++++----
>>  fs/ext4/mballoc.c |    6 ------
>>  2 files changed, 8 insertions(+), 10 deletions(-)
>> 
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index ab22297..e642cdb 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
>>  		EXT4_I(inode)->i_reserved_meta_blocks;
>>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>>  
>> -	return total;
>> +	return (total << inode->i_blkbits);
>>  }
>
> Ow, whoops, yes this looks like a bug!  (maybe should be in its own patch?)
Ok i'll resubmit patchset soon.
>
>>  /*
>>   * Calculate the number of metadata blocks need to reserve
>> @@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
>>  static void ext4_da_update_reserve_space(struct inode *inode, int used)
>>  {
>>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> -	int total, mdb, mdb_free;
>> +	int total, mdb, mdb_free, mdb_claim = 0;
>>  
>>  	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>>  	/* recalculate the number of metablocks still need to be reserved */
>> @@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>>  
>>  	if (mdb_free) {
>>  		/* Account for allocated meta_blocks */
>> -		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
>> +		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
>> +		BUG_ON(mdb_free < mdb_claim);
>
> Did you see this happening in testing?
>
No i didn't. I've add it just for sanity reason.
>> +		mdb_free -= mdb_claim;
>>  
>>  		/* update fs dirty blocks counter */
>>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
>> @@ -1119,7 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>>  
>>  	/* update per-inode reservations */
>>  	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
>> -	EXT4_I(inode)->i_reserved_data_blocks -= used;
>> + 	EXT4_I(inode)->i_reserved_data_blocks -= used;
>
> looks like a little whitespace damage here
>
>> +	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
>> +	vfs_dq_claim_block(inode, used + mdb_claim);
>>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>>  
>>  	/*
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index bba1282..d4c52db 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>>  	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>>  		/* release all the reserved blocks if non delalloc */
>>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
>> -	else {
>> -		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
>> -						ac->ac_b_ex.fe_len);
>> -		/* convert reserved quota blocks to real quota blocks */
>> -		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
>> -	}
>
> Maybe Mingming can review this change better, I don't have all the quota paths
> in my head yet ...
>
> -Eric
>
>>  	if (sbi->s_log_groups_per_flex) {
>>  		ext4_group_t flex_group = ext4_flex_group(sbi,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] ext4: delalloc quota fixes
  2009-11-23 22:58   ` Dmitry Monakhov
  2009-11-23 22:58     ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
  2009-11-24 15:24     ` [PATCH 1/4] ext4: delalloc quota fixes Eric Sandeen
@ 2009-12-08  0:00     ` Mingming
  2009-12-08  6:34       ` Dmitry Monakhov
  2 siblings, 1 reply; 20+ messages in thread
From: Mingming @ 2009-12-08  0:00 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4

On Tue, 2009-11-24 at 01:58 +0300, Dmitry Monakhov wrote:
> This patch fix most visible problems with delalloc+quota issues
> - ext4_get_reserved_space() must return bytes instead of blocks.
> - Claim allocated meta blocks. Do it as we do for data blocks
>   but delay it untill proper moment.
> - Move space claiming to ext4_da_update_reserve_space()
>   Only here we do know what we are dealing with data or metadata
> 

Thanks for sending the first fix, I am actually surprisely to know the
fix didn't get merged before, as Jan has pointed this out when he review
the original patch and I have get this fixed...somehow the latest patch
wasn't being picked at the end.

here it is.
http://marc.info/?l=linux-ext4&m=123185939602949&w=2


About the second and third issue you are trying to fix here... I think
the current code does what you want already.

The current code does reserve quota for metadata blocks at
ext4_da_update_reserve_space() already and delay the quota claim at the
later time when metadata blocks are really allocated (via
ext4_mb_mark_diskspace_used), extra reserved quota for metadata blocks
will get freed at ext4_da_update_reserve_space().
ext4_mb_mark_diskspace_used() is called for every block allocation
including medatadata allocation, so we won't miss quota claim for
metadata there.

The reason that I keep all quota claim immediately at the block
allocation time via ext4_mb_mark_diskspace_used() is to keep the block
allocation space accounting/dirty/delayed block space accounting/quota
accounting in the same place; Plus, when ext4_da_update_reserve_space()
called, the number of blocks passed is the number of blocks mapped, may
not necessarilly the the number of blocks just new allocated.

cheers,
Mingming

 put the space claiming under 
> The most useful test case for delalloc+quota is concurent tasks
> with write+truncate vs chown for a same file.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  fs/ext4/inode.c   |   12 ++++++++----
>  fs/ext4/mballoc.c |    6 ------
>  2 files changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ab22297..e642cdb 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
>  		EXT4_I(inode)->i_reserved_meta_blocks;
>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> 
> -	return total;
> +	return (total << inode->i_blkbits);
>  }



>  /*
>   * Calculate the number of metadata blocks need to reserve
> @@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
>  static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> -	int total, mdb, mdb_free;
> +	int total, mdb, mdb_free, mdb_claim = 0;
> 
>  	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  	/* recalculate the number of metablocks still need to be reserved */
> @@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
> 
>  	if (mdb_free) {
>  		/* Account for allocated meta_blocks */
> -		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
> +		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
> +		BUG_ON(mdb_free < mdb_claim);
> +		mdb_free -= mdb_claim;
> 
>  		/* update fs dirty blocks counter */
>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
> @@ -1119,7 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
> 
>  	/* update per-inode reservations */
>  	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
> -	EXT4_I(inode)->i_reserved_data_blocks -= used;
> + 	EXT4_I(inode)->i_reserved_data_blocks -= used;
> +	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
> +	vfs_dq_claim_block(inode, used + mdb_claim);
>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> 
>  	/*
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index bba1282..d4c52db 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>  	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>  		/* release all the reserved blocks if non delalloc */
>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
> -	else {
> -		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
> -						ac->ac_b_ex.fe_len);
> -		/* convert reserved quota blocks to real quota blocks */
> -		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
> -	}
> 
>  	if (sbi->s_log_groups_per_flex) {
>  		ext4_group_t flex_group = ext4_flex_group(sbi,



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] ext4: delalloc quota fixes
  2009-12-08  0:00     ` Mingming
@ 2009-12-08  6:34       ` Dmitry Monakhov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Monakhov @ 2009-12-08  6:34 UTC (permalink / raw)
  To: Mingming; +Cc: linux-ext4

Mingming <cmm@us.ibm.com> writes:

> On Tue, 2009-11-24 at 01:58 +0300, Dmitry Monakhov wrote:
>> This patch fix most visible problems with delalloc+quota issues
>> - ext4_get_reserved_space() must return bytes instead of blocks.
>> - Claim allocated meta blocks. Do it as we do for data blocks
>>   but delay it untill proper moment.
>> - Move space claiming to ext4_da_update_reserve_space()
>>   Only here we do know what we are dealing with data or metadata
>> 
>
> Thanks for sending the first fix, I am actually surprisely to know the
> fix didn't get merged before, as Jan has pointed this out when he review
> the original patch and I have get this fixed...somehow the latest patch
> wasn't being picked at the end.
>
> here it is.
> http://marc.info/?l=linux-ext4&m=123185939602949&w=2
>
>
> About the second and third issue you are trying to fix here... I think
> the current code does what you want already.
>
> The current code does reserve quota for metadata blocks at
> ext4_da_update_reserve_space() already and delay the quota claim at the
> later time when metadata blocks are really allocated (via
> ext4_mb_mark_diskspace_used), extra reserved quota for metadata blocks
> will get freed at ext4_da_update_reserve_space().
> ext4_mb_mark_diskspace_used() is called for every block allocation
> including medatadata allocation, so we won't miss quota claim for
> metadata there.
>
> The reason that I keep all quota claim immediately at the block
> allocation time via ext4_mb_mark_diskspace_used() is to keep the block
> allocation space accounting/dirty/delayed block space accounting/quota
> accounting in the same place; Plus, when ext4_da_update_reserve_space()
> called, the number of blocks passed is the number of blocks mapped, may
> not necessarilly the the number of blocks just new allocated.
You have reviewed wrong patch version. Please read the latest one
http://article.gmane.org/gmane.comp.file-systems.ext4/16629 in order
to get my idea. But in fact as Aneesh have spotted, my I've missed
important issue aka #14739. Currently I'm working on new patch set
version. This patch series will fix "chown vs truncate issue" and
#14739 by one shot.
>
> cheers,
> Mingming
>
>  put the space claiming under 
>> The most useful test case for delalloc+quota is concurent tasks
>> with write+truncate vs chown for a same file.
>> 
>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>> ---
>>  fs/ext4/inode.c   |   12 ++++++++----
>>  fs/ext4/mballoc.c |    6 ------
>>  2 files changed, 8 insertions(+), 10 deletions(-)
>> 
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index ab22297..e642cdb 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -1057,7 +1057,7 @@ qsize_t ext4_get_reserved_space(struct inode *inode)
>>  		EXT4_I(inode)->i_reserved_meta_blocks;
>>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>> 
>> -	return total;
>> +	return (total << inode->i_blkbits);
>>  }
>
>
>
>>  /*
>>   * Calculate the number of metadata blocks need to reserve
>> @@ -1096,7 +1096,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, int blocks)
>>  static void ext4_da_update_reserve_space(struct inode *inode, int used)
>>  {
>>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> -	int total, mdb, mdb_free;
>> +	int total, mdb, mdb_free, mdb_claim = 0;
>> 
>>  	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>>  	/* recalculate the number of metablocks still need to be reserved */
>> @@ -1109,7 +1109,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>> 
>>  	if (mdb_free) {
>>  		/* Account for allocated meta_blocks */
>> -		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
>> +		mdb_claim = EXT4_I(inode)->i_allocated_meta_blocks;
>> +		BUG_ON(mdb_free < mdb_claim);
>> +		mdb_free -= mdb_claim;
>> 
>>  		/* update fs dirty blocks counter */
>>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, mdb_free);
>> @@ -1119,7 +1121,9 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>> 
>>  	/* update per-inode reservations */
>>  	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
>> -	EXT4_I(inode)->i_reserved_data_blocks -= used;
>> + 	EXT4_I(inode)->i_reserved_data_blocks -= used;
>> +	percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + mdb_claim);
>> +	vfs_dq_claim_block(inode, used + mdb_claim);
>>  	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>> 
>>  	/*
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index bba1282..d4c52db 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2750,12 +2750,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>>  	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>>  		/* release all the reserved blocks if non delalloc */
>>  		percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
>> -	else {
>> -		percpu_counter_sub(&sbi->s_dirtyblocks_counter,
>> -						ac->ac_b_ex.fe_len);
>> -		/* convert reserved quota blocks to real quota blocks */
>> -		vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
>> -	}
>> 
>>  	if (sbi->s_log_groups_per_flex) {
>>  		ext4_group_t flex_group = ext4_flex_group(sbi,
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] ext4: quota macros cleanup
  2009-11-25  6:57   ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
@ 2009-12-08  0:59     ` Mingming
  0 siblings, 0 replies; 20+ messages in thread
From: Mingming @ 2009-12-08  0:59 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, Jan Kara

On Wed, 2009-11-25 at 09:57 +0300, Dmitry Monakhov wrote:
> Currently all quota block reservation macros contains hard-coded "2"
> aka MAXQUOTAS value. This is no good because in some places it is not
> obvious to understand what does this digit represent. Let's introduce
> new macro with self descriptive name.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
looks sane, MAXQUOTAS is either user or group quota.

Acked-by: Mingming Cao <cmm@us.ibm.com>

> ---
>  fs/ext4/ext4_jbd2.h |    8 ++++++--
>  fs/ext4/extents.c   |    2 +-
>  fs/ext4/inode.c     |    2 +-
>  fs/ext4/migrate.c   |    4 ++--
>  fs/ext4/namei.c     |    8 ++++----
>  5 files changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index a286598..404ab6c 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -49,7 +49,7 @@
> 
>  #define EXT4_DATA_TRANS_BLOCKS(sb)	(EXT4_SINGLEDATA_TRANS_BLOCKS(sb) + \
>  					 EXT4_XATTR_TRANS_BLOCKS - 2 + \
> -					 2*EXT4_QUOTA_TRANS_BLOCKS(sb))
> +					 EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
> 
>  /*
>   * Define the number of metadata blocks we need to account to modify data.
> @@ -57,7 +57,7 @@
>   * This include super block, inode block, quota blocks and xattr blocks
>   */
>  #define EXT4_META_TRANS_BLOCKS(sb)	(EXT4_XATTR_TRANS_BLOCKS + \
> -					2*EXT4_QUOTA_TRANS_BLOCKS(sb))
> +					EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
> 
>  /* Delete operations potentially hit one directory's namespace plus an
>   * entire inode, plus arbitrary amounts of bitmap/indirection data.  Be
> @@ -92,6 +92,7 @@
>   * but inode, sb and group updates are done only once */
>  #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
>  		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_INIT_REWRITE) : 0)
> +
>  #define EXT4_QUOTA_DEL_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_DEL_ALLOC*\
>  		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_DEL_REWRITE) : 0)
>  #else
> @@ -99,6 +100,9 @@
>  #define EXT4_QUOTA_INIT_BLOCKS(sb) 0
>  #define EXT4_QUOTA_DEL_BLOCKS(sb) 0
>  #endif
> +#define EXT4_MAXQUOTAS_TRANS_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_TRANS_BLOCKS(sb))
> +#define EXT4_MAXQUOTAS_INIT_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_INIT_BLOCKS(sb))
> +#define EXT4_MAXQUOTAS_DEL_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_DEL_BLOCKS(sb))
> 
>  int
>  ext4_mark_iloc_dirty(handle_t *handle,
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 715264b..f73c3ff 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2167,7 +2167,7 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>  			correct_index = 1;
>  			credits += (ext_depth(inode)) + 1;
>  		}
> -		credits += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
> +		credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
> 
>  		err = ext4_ext_truncate_extend_restart(handle, inode, credits);
>  		if (err)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 979362d..211722b 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5438,7 +5438,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
> 
>  		/* (user+group)*(old+new) structure, inode write (sb,
>  		 * inode block, ? - but truncate inode update has it) */
> -		handle = ext4_journal_start(inode, 2*(EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)+
> +		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
>  					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
>  		if (IS_ERR(handle)) {
>  			error = PTR_ERR(handle);
> diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> index a93d5b8..8646149 100644
> --- a/fs/ext4/migrate.c
> +++ b/fs/ext4/migrate.c
> @@ -238,7 +238,7 @@ static int extend_credit_for_blkdel(handle_t *handle, struct inode *inode)
>  	 * So allocate a credit of 3. We may update
>  	 * quota (user and group).
>  	 */
> -	needed = 3 + 2*EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
> +	needed = 3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
> 
>  	if (ext4_journal_extend(handle, needed) != 0)
>  		retval = ext4_journal_restart(handle, needed);
> @@ -477,7 +477,7 @@ int ext4_ext_migrate(struct inode *inode)
>  	handle = ext4_journal_start(inode,
>  					EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
>  					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
> -					2 * EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)
> +					EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)
>  					+ 1);
>  	if (IS_ERR(handle)) {
>  		retval = PTR_ERR(handle);
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index fde08c9..17a17e1 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1769,7 +1769,7 @@ static int ext4_create(struct inode *dir, struct dentry *dentry, int mode,
>  retry:
>  	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
>  					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
> -					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
> +					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
>  	if (IS_ERR(handle))
>  		return PTR_ERR(handle);
> 
> @@ -1803,7 +1803,7 @@ static int ext4_mknod(struct inode *dir, struct dentry *dentry,
>  retry:
>  	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
>  					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
> -					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
> +					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
>  	if (IS_ERR(handle))
>  		return PTR_ERR(handle);
> 
> @@ -1840,7 +1840,7 @@ static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
>  retry:
>  	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
>  					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
> -					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
> +					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
>  	if (IS_ERR(handle))
>  		return PTR_ERR(handle);
> 
> @@ -2253,7 +2253,7 @@ static int ext4_symlink(struct inode *dir,
>  retry:
>  	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
>  					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 5 +
> -					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
> +					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
>  	if (IS_ERR(handle))
>  		return PTR_ERR(handle);
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/4] ext4: quota macros cleanup
  2009-11-25  6:57 ` [PATCH 2/4] ext4: fix reserved space transferring on chown() [V2] Dmitry Monakhov
@ 2009-11-25  6:57   ` Dmitry Monakhov
  2009-12-08  0:59     ` Mingming
  0 siblings, 1 reply; 20+ messages in thread
From: Dmitry Monakhov @ 2009-11-25  6:57 UTC (permalink / raw)
  To: linux-ext4; +Cc: Dmitry Monakhov, Jan Kara

Currently all quota block reservation macros contains hard-coded "2"
aka MAXQUOTAS value. This is no good because in some places it is not
obvious to understand what does this digit represent. Let's introduce
new macro with self descriptive name.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/ext4_jbd2.h |    8 ++++++--
 fs/ext4/extents.c   |    2 +-
 fs/ext4/inode.c     |    2 +-
 fs/ext4/migrate.c   |    4 ++--
 fs/ext4/namei.c     |    8 ++++----
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index a286598..404ab6c 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -49,7 +49,7 @@
 
 #define EXT4_DATA_TRANS_BLOCKS(sb)	(EXT4_SINGLEDATA_TRANS_BLOCKS(sb) + \
 					 EXT4_XATTR_TRANS_BLOCKS - 2 + \
-					 2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					 EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /*
  * Define the number of metadata blocks we need to account to modify data.
@@ -57,7 +57,7 @@
  * This include super block, inode block, quota blocks and xattr blocks
  */
 #define EXT4_META_TRANS_BLOCKS(sb)	(EXT4_XATTR_TRANS_BLOCKS + \
-					2*EXT4_QUOTA_TRANS_BLOCKS(sb))
+					EXT4_MAXQUOTAS_TRANS_BLOCKS(sb))
 
 /* Delete operations potentially hit one directory's namespace plus an
  * entire inode, plus arbitrary amounts of bitmap/indirection data.  Be
@@ -92,6 +92,7 @@
  * but inode, sb and group updates are done only once */
 #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_INIT_REWRITE) : 0)
+
 #define EXT4_QUOTA_DEL_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_DEL_ALLOC*\
 		(EXT4_SINGLEDATA_TRANS_BLOCKS(sb)-3)+3+DQUOT_DEL_REWRITE) : 0)
 #else
@@ -99,6 +100,9 @@
 #define EXT4_QUOTA_INIT_BLOCKS(sb) 0
 #define EXT4_QUOTA_DEL_BLOCKS(sb) 0
 #endif
+#define EXT4_MAXQUOTAS_TRANS_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_TRANS_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_INIT_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_INIT_BLOCKS(sb))
+#define EXT4_MAXQUOTAS_DEL_BLOCKS(sb) (MAXQUOTAS*EXT4_QUOTA_DEL_BLOCKS(sb))
 
 int
 ext4_mark_iloc_dirty(handle_t *handle,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 715264b..f73c3ff 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2167,7 +2167,7 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 			correct_index = 1;
 			credits += (ext_depth(inode)) + 1;
 		}
-		credits += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+		credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 		err = ext4_ext_truncate_extend_restart(handle, inode, credits);
 		if (err)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 979362d..211722b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5438,7 +5438,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 
 		/* (user+group)*(old+new) structure, inode write (sb,
 		 * inode block, ? - but truncate inode update has it) */
-		handle = ext4_journal_start(inode, 2*(EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)+
+		handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+
 					EXT4_QUOTA_DEL_BLOCKS(inode->i_sb))+3);
 		if (IS_ERR(handle)) {
 			error = PTR_ERR(handle);
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index a93d5b8..8646149 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -238,7 +238,7 @@ static int extend_credit_for_blkdel(handle_t *handle, struct inode *inode)
 	 * So allocate a credit of 3. We may update
 	 * quota (user and group).
 	 */
-	needed = 3 + 2*EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
+	needed = 3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb);
 
 	if (ext4_journal_extend(handle, needed) != 0)
 		retval = ext4_journal_restart(handle, needed);
@@ -477,7 +477,7 @@ int ext4_ext_migrate(struct inode *inode)
 	handle = ext4_journal_start(inode,
 					EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2 * EXT4_QUOTA_INIT_BLOCKS(inode->i_sb)
+					EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)
 					+ 1);
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index fde08c9..17a17e1 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1769,7 +1769,7 @@ static int ext4_create(struct inode *dir, struct dentry *dentry, int mode,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1803,7 +1803,7 @@ static int ext4_mknod(struct inode *dir, struct dentry *dentry,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -1840,7 +1840,7 @@ static int ext4_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 3 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
@@ -2253,7 +2253,7 @@ static int ext4_symlink(struct inode *dir,
 retry:
 	handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
 					EXT4_INDEX_EXTRA_TRANS_BLOCKS + 5 +
-					2*EXT4_QUOTA_INIT_BLOCKS(dir->i_sb));
+					EXT4_MAXQUOTAS_INIT_BLOCKS(dir->i_sb));
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2009-12-08  6:34 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-23 18:13 ext4+quota patch series Dmitry Monakhov
2009-11-23 18:15 ` Eric Sandeen
2009-11-23 19:18   ` Dmitry Monakhov
2009-11-23 19:35     ` Eric Sandeen
2009-11-23 18:30 ` [PATCH 1/4] ext4: delalloc quota fixes Dmitry Monakhov
2009-11-23 22:43   ` Dmitry Monakhov
2009-11-23 22:58   ` Dmitry Monakhov
2009-11-23 22:58     ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
2009-11-23 22:58       ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
2009-11-23 22:58         ` [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer Dmitry Monakhov
2009-11-24 15:24     ` [PATCH 1/4] ext4: delalloc quota fixes Eric Sandeen
2009-11-24 19:38       ` Dmitry Monakhov
2009-12-08  0:00     ` Mingming
2009-12-08  6:34       ` Dmitry Monakhov
2009-11-23 18:32 ` [PATCH 2/4] ext4: fix race chown vs truncate Dmitry Monakhov
2009-11-23 18:42   ` Dmitry Monakhov
2009-11-23 18:33 ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
2009-11-23 18:34 ` [PATCH 4/4] ext4: fix incorrect block reservation on quota transfer Dmitry Monakhov
2009-11-25  6:57 [PATCH 1/4] ext4: ext4_get_reserved_space() must return bytes instead of blocks Dmitry Monakhov
2009-11-25  6:57 ` [PATCH 2/4] ext4: fix reserved space transferring on chown() [V2] Dmitry Monakhov
2009-11-25  6:57   ` [PATCH 3/4] ext4: quota macros cleanup Dmitry Monakhov
2009-12-08  0:59     ` Mingming

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.