linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/27 v1] Quota scalability patches
@ 2017-08-16 15:41 Jan Kara
  2017-08-16 15:41 ` [PATCH 01/27] quota: Convert dqio_mutex to rwsem Jan Kara
                   ` (26 more replies)
  0 siblings, 27 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

  Hello,

this is a series of patches that address scalability issues in quota subsystem.
For my testing I've used a synthetic benchmark where 50 different processes are
creating files 10000 files each in separate directories (either for different
or the same user). I've been testing on a machine with 48 CPUs with 64GB of
RAM over either SATA drive or ramdisk. Different patches have shown
improvements in different configs (see details in the changelog). Also
Wang Shilong gave some testing to my patches and in his setup he saw about
100% improvement for file creation and about 50% improvement for file unlink.

The patches have passed xfstests on ext4.

First 14 patches convert users of a global dqio_mutex to rw semaphore and also
avoid using the semaphore altogether in cases where we don't need it - in
particular we now use dquot->dq_lock for protecting writeout of dquot (which
happens very frequently on every usage change). Patches 15-18 are smaller
fixes / cleanups. Patches 19-22 allow filesystem to not use dirty dquot list
(as e.g. ext4 does not need it), reducing contention on global dq_list_lock
significantly. Patches 23-27 remove use of global dq_data_lock for quota
usage changes (by using new dquot->dq_dqb_lock).

Full series can be also fetched from my tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git quota_scaling

Unless someone objects, I'll queue the patches in my tree for the next merge
window.

								Honza

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 01/27] quota: Convert dqio_mutex to rwsem
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:23   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire() Jan Kara
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Convert dqio_mutex to rwsem and call it dqio_sem. No functional changes
yet.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c         |  4 ++--
 fs/ocfs2/quota_global.c | 20 ++++++++++----------
 fs/ocfs2/quota_local.c  | 10 +++++-----
 fs/quota/dquot.c        | 28 ++++++++++++++--------------
 fs/quota/quota_tree.c   |  2 +-
 fs/super.c              |  2 +-
 include/linux/quota.h   |  2 +-
 7 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d61a70e2193a..8e0c27387ab7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5268,8 +5268,8 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
  * Process 1                         Process 2
  * ext4_create()                     quota_sync()
  *   jbd2_journal_start()                  write_dquot()
- *   dquot_initialize()                         down(dqio_mutex)
- *     down(dqio_mutex)                    jbd2_journal_start()
+ *   dquot_initialize()                         down(dqio_sem)
+ *     down(dqio_sem)                           jbd2_journal_start()
  *
  */
 
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index cec495a921e3..4134d557a8e5 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -33,7 +33,7 @@
  * Locking of quotas with OCFS2 is rather complex. Here are rules that
  * should be obeyed by all the functions:
  * - any write of quota structure (either to local or global file) is protected
- *   by dqio_mutex or dquot->dq_lock.
+ *   by dqio_sem or dquot->dq_lock.
  * - any modification of global quota file holds inode cluster lock, i_mutex,
  *   and ip_alloc_sem of the global quota file (achieved by
  *   ocfs2_lock_global_qf). It also has to hold qinfo_lock.
@@ -42,9 +42,9 @@
  *
  * A rough sketch of locking dependencies (lf = local file, gf = global file):
  * Normal filesystem operation:
- *   start_trans -> dqio_mutex -> write to lf
+ *   start_trans -> dqio_sem -> write to lf
  * Syncing of local and global file:
- *   ocfs2_lock_global_qf -> start_trans -> dqio_mutex -> qinfo_lock ->
+ *   ocfs2_lock_global_qf -> start_trans -> dqio_sem -> qinfo_lock ->
  *     write to gf
  *						       -> write to lf
  * Acquire dquot for the first time:
@@ -60,7 +60,7 @@
  * Recovery:
  *   inode cluster lock of recovered lf
  *     -> read bitmaps -> ip_alloc_sem of lf
- *     -> ocfs2_lock_global_qf -> start_trans -> dqio_mutex -> qinfo_lock ->
+ *     -> ocfs2_lock_global_qf -> start_trans -> dqio_sem -> qinfo_lock ->
  *        write to gf
  */
 
@@ -611,7 +611,7 @@ static int ocfs2_sync_dquot_helper(struct dquot *dquot, unsigned long type)
 		mlog_errno(status);
 		goto out_ilock;
 	}
-	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+	down_write(&sb_dqopt(sb)->dqio_sem);
 	status = ocfs2_sync_dquot(dquot);
 	if (status < 0)
 		mlog_errno(status);
@@ -619,7 +619,7 @@ static int ocfs2_sync_dquot_helper(struct dquot *dquot, unsigned long type)
 	status = ocfs2_local_write_dquot(dquot);
 	if (status < 0)
 		mlog_errno(status);
-	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+	up_write(&sb_dqopt(sb)->dqio_sem);
 	ocfs2_commit_trans(osb, handle);
 out_ilock:
 	ocfs2_unlock_global_qf(oinfo, 1);
@@ -666,9 +666,9 @@ static int ocfs2_write_dquot(struct dquot *dquot)
 		mlog_errno(status);
 		goto out;
 	}
-	mutex_lock(&sb_dqopt(dquot->dq_sb)->dqio_mutex);
+	down_write(&sb_dqopt(dquot->dq_sb)->dqio_sem);
 	status = ocfs2_local_write_dquot(dquot);
-	mutex_unlock(&sb_dqopt(dquot->dq_sb)->dqio_mutex);
+	up_write(&sb_dqopt(dquot->dq_sb)->dqio_sem);
 	ocfs2_commit_trans(osb, handle);
 out:
 	return status;
@@ -939,7 +939,7 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
 		mlog_errno(status);
 		goto out_ilock;
 	}
-	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+	down_write(&sb_dqopt(sb)->dqio_sem);
 	status = ocfs2_sync_dquot(dquot);
 	if (status < 0) {
 		mlog_errno(status);
@@ -948,7 +948,7 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
 	/* Now write updated local dquot structure */
 	status = ocfs2_local_write_dquot(dquot);
 out_dlock:
-	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+	up_write(&sb_dqopt(sb)->dqio_sem);
 	ocfs2_commit_trans(osb, handle);
 out_ilock:
 	ocfs2_unlock_global_qf(oinfo, 1);
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 32c5a40c1257..1311eff1c050 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -520,7 +520,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
 				mlog_errno(status);
 				goto out_drop_lock;
 			}
-			mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+			down_write(&sb_dqopt(sb)->dqio_sem);
 			spin_lock(&dq_data_lock);
 			/* Add usage from quota entry into quota changes
 			 * of our node. Auxiliary variables are important
@@ -553,7 +553,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
 			unlock_buffer(qbh);
 			ocfs2_journal_dirty(handle, qbh);
 out_commit:
-			mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+			up_write(&sb_dqopt(sb)->dqio_sem);
 			ocfs2_commit_trans(OCFS2_SB(sb), handle);
 out_drop_lock:
 			ocfs2_unlock_global_qf(oinfo, 1);
@@ -693,7 +693,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 
 	/* We don't need the lock and we have to acquire quota file locks
 	 * which will later depend on this lock */
-	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+	up_write(&sb_dqopt(sb)->dqio_sem);
 	info->dqi_max_spc_limit = 0x7fffffffffffffffLL;
 	info->dqi_max_ino_limit = 0x7fffffffffffffffLL;
 	oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
@@ -772,7 +772,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		goto out_err;
 	}
 
-	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+	down_write(&sb_dqopt(sb)->dqio_sem);
 	return 0;
 out_err:
 	if (oinfo) {
@@ -786,7 +786,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		kfree(oinfo);
 	}
 	brelse(bh);
-	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+	down_write(&sb_dqopt(sb)->dqio_sem);
 	return -1;
 }
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 53a17496c5c5..29d447598642 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -120,7 +120,7 @@
  * spinlock to internal buffers before writing.
  *
  * Lock ordering (including related VFS locks) is the following:
- *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_mutex
+ *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_sem
  */
 
 static __cacheline_aligned_in_smp DEFINE_SPINLOCK(dq_list_lock);
@@ -406,7 +406,7 @@ int dquot_acquire(struct dquot *dquot)
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
 	mutex_lock(&dquot->dq_lock);
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
 	if (ret < 0)
@@ -436,7 +436,7 @@ int dquot_acquire(struct dquot *dquot)
 	smp_mb__before_atomic();
 	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
 out_iolock:
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 	mutex_unlock(&dquot->dq_lock);
 	return ret;
 }
@@ -450,7 +450,7 @@ int dquot_commit(struct dquot *dquot)
 	int ret = 0;
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	spin_lock(&dq_list_lock);
 	if (!clear_dquot_dirty(dquot)) {
 		spin_unlock(&dq_list_lock);
@@ -464,7 +464,7 @@ int dquot_commit(struct dquot *dquot)
 	else
 		ret = -EIO;
 out_sem:
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 	return ret;
 }
 EXPORT_SYMBOL(dquot_commit);
@@ -481,7 +481,7 @@ int dquot_release(struct dquot *dquot)
 	/* Check whether we are not racing with some other dqget() */
 	if (atomic_read(&dquot->dq_count) > 1)
 		goto out_dqlock;
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	if (dqopt->ops[dquot->dq_id.type]->release_dqblk) {
 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
 		/* Write the info */
@@ -493,7 +493,7 @@ int dquot_release(struct dquot *dquot)
 			ret = ret2;
 	}
 	clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 out_dqlock:
 	mutex_unlock(&dquot->dq_lock);
 	return ret;
@@ -2060,9 +2060,9 @@ int dquot_commit_info(struct super_block *sb, int type)
 	int ret;
 	struct quota_info *dqopt = sb_dqopt(sb);
 
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	ret = dqopt->ops[type]->write_file_info(sb, type);
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 	return ret;
 }
 EXPORT_SYMBOL(dquot_commit_info);
@@ -2076,9 +2076,9 @@ int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
 		return -ESRCH;
 	if (!dqopt->ops[qid->type]->get_next_id)
 		return -ENOSYS;
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 	return err;
 }
 EXPORT_SYMBOL(dquot_get_next_id);
@@ -2328,15 +2328,15 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 	dqopt->info[type].dqi_format = fmt;
 	dqopt->info[type].dqi_fmt_id = format_id;
 	INIT_LIST_HEAD(&dqopt->info[type].dqi_dirty_list);
-	mutex_lock(&dqopt->dqio_mutex);
+	down_write(&dqopt->dqio_sem);
 	error = dqopt->ops[type]->read_file_info(sb, type);
 	if (error < 0) {
-		mutex_unlock(&dqopt->dqio_mutex);
+		up_write(&dqopt->dqio_sem);
 		goto out_file_init;
 	}
 	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
-	mutex_unlock(&dqopt->dqio_mutex);
+	up_write(&dqopt->dqio_sem);
 	spin_lock(&dq_state_lock);
 	dqopt->flags |= dquot_state_flag(flags, type);
 	spin_unlock(&dq_state_lock);
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index 0738972e8d3f..54f85eb2609c 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -379,7 +379,7 @@ int qtree_write_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
 	if (!ddquot)
 		return -ENOMEM;
 
-	/* dq_off is guarded by dqio_mutex */
+	/* dq_off is guarded by dqio_sem */
 	if (!dquot->dq_off) {
 		ret = dq_insert_tree(info, dquot);
 		if (ret < 0) {
diff --git a/fs/super.c b/fs/super.c
index 6bc3352adcf3..221cfa1f4e92 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -242,7 +242,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	atomic_set(&s->s_active, 1);
 	mutex_init(&s->s_vfs_rename_mutex);
 	lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
-	mutex_init(&s->s_dquot.dqio_mutex);
+	init_rwsem(&s->s_dquot.dqio_sem);
 	s->s_maxbytes = MAX_NON_LFS;
 	s->s_op = &default_op;
 	s->s_time_gran = 1000000000;
diff --git a/include/linux/quota.h b/include/linux/quota.h
index bfd077ca6ac3..3a6df7461642 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -521,7 +521,7 @@ static inline void quota_send_warning(struct kqid qid, dev_t dev,
 
 struct quota_info {
 	unsigned int flags;			/* Flags for diskquotas on this device */
-	struct mutex dqio_mutex;		/* lock device while I/O in progress */
+	struct rw_semaphore dqio_sem;		/* Lock quota file while I/O in progress */
 	struct inode *files[MAXQUOTAS];		/* inodes of quotafiles */
 	struct mem_dqinfo info[MAXQUOTAS];	/* Information for each quota type */
 	const struct quota_format_ops *ops[MAXQUOTAS];	/* Operations for each type */
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
  2017-08-16 15:41 ` [PATCH 01/27] quota: Convert dqio_mutex to rwsem Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:35   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id() Jan Kara
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

We need dqio_sem held just for reading when calling ->read_dqblk() in
dquot_acquire(). Also dqio_sem is not needed when manipulating dquot
flags (those are protected by dq_lock) so acquire and release it closer
to the place where it is needed. This reduces lock hold time and will
make locking changes easier.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 29d447598642..21358f31923d 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -406,9 +406,11 @@ int dquot_acquire(struct dquot *dquot)
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
 	mutex_lock(&dquot->dq_lock);
-	down_write(&dqopt->dqio_sem);
-	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
+	if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
+		down_read(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
+		up_read(&dqopt->dqio_sem);
+	}
 	if (ret < 0)
 		goto out_iolock;
 	/* Make sure flags update is visible after dquot has been filled */
@@ -416,12 +418,14 @@ int dquot_acquire(struct dquot *dquot)
 	set_bit(DQ_READ_B, &dquot->dq_flags);
 	/* Instantiate dquot if needed */
 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
+		down_write(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
 		/* Write the info if needed */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 					dquot->dq_sb, dquot->dq_id.type);
 		}
+		up_write(&dqopt->dqio_sem);
 		if (ret < 0)
 			goto out_iolock;
 		if (ret2 < 0) {
@@ -436,7 +440,6 @@ int dquot_acquire(struct dquot *dquot)
 	smp_mb__before_atomic();
 	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
 out_iolock:
-	up_write(&dqopt->dqio_sem);
 	mutex_unlock(&dquot->dq_lock);
 	return ret;
 }
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
  2017-08-16 15:41 ` [PATCH 01/27] quota: Convert dqio_mutex to rwsem Jan Kara
  2017-08-16 15:41 ` [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:37   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode() Jan Kara
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

dquot_get_next_id() needs dqio_sem only for reading to protect against
racing with modification of quota file structure.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 21358f31923d..8d5ccad3bf3e 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2079,9 +2079,9 @@ int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
 		return -ESRCH;
 	if (!dqopt->ops[qid->type]->get_next_id)
 		return -ENOSYS;
-	down_write(&dqopt->dqio_sem);
+	down_read(&dqopt->dqio_sem);
 	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
-	up_write(&dqopt->dqio_sem);
+	up_read(&dqopt->dqio_sem);
 	return err;
 }
 EXPORT_SYMBOL(dquot_get_next_id);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (2 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:38   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 05/27] quota: Protect dquot writeout with dq_lock Jan Kara
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

vfs_load_quota_inode() needs dqio_sem only for reading. In fact dqio_sem
is not needed there at all since the function can be called only during
quota on when quota file cannot be modified but let's leave the
protection there since it is logical and the path is in no way
performance critical.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/quota_local.c | 6 +++---
 fs/quota/dquot.c       | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 1311eff1c050..1829f6a45d46 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -693,7 +693,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 
 	/* We don't need the lock and we have to acquire quota file locks
 	 * which will later depend on this lock */
-	up_write(&sb_dqopt(sb)->dqio_sem);
+	up_read(&sb_dqopt(sb)->dqio_sem);
 	info->dqi_max_spc_limit = 0x7fffffffffffffffLL;
 	info->dqi_max_ino_limit = 0x7fffffffffffffffLL;
 	oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
@@ -772,7 +772,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		goto out_err;
 	}
 
-	down_write(&sb_dqopt(sb)->dqio_sem);
+	down_read(&sb_dqopt(sb)->dqio_sem);
 	return 0;
 out_err:
 	if (oinfo) {
@@ -786,7 +786,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		kfree(oinfo);
 	}
 	brelse(bh);
-	down_write(&sb_dqopt(sb)->dqio_sem);
+	down_read(&sb_dqopt(sb)->dqio_sem);
 	return -1;
 }
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 8d5ccad3bf3e..3852a3c79ac9 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2331,15 +2331,15 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 	dqopt->info[type].dqi_format = fmt;
 	dqopt->info[type].dqi_fmt_id = format_id;
 	INIT_LIST_HEAD(&dqopt->info[type].dqi_dirty_list);
-	down_write(&dqopt->dqio_sem);
+	down_read(&dqopt->dqio_sem);
 	error = dqopt->ops[type]->read_file_info(sb, type);
 	if (error < 0) {
-		up_write(&dqopt->dqio_sem);
+		up_read(&dqopt->dqio_sem);
 		goto out_file_init;
 	}
 	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
-	up_write(&dqopt->dqio_sem);
+	up_read(&dqopt->dqio_sem);
 	spin_lock(&dq_state_lock);
 	dqopt->flags |= dquot_state_flag(flags, type);
 	spin_unlock(&dq_state_lock);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 05/27] quota: Protect dquot writeout with dq_lock
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (3 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:44   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk() Jan Kara
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Currently dquot writeout is only protected by dqio_sem held for writing.
As we transition to a finer grained locking we will use dquot->dq_lock
instead. So acquire it in dquot_commit() and move dqio_sem just around
->commit_dqblk() call as it is still needed to serialize quota file
changes.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 3852a3c79ac9..99693c6d5dae 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -110,14 +110,11 @@
  * sure they cannot race with quotaon which first sets S_NOQUOTA flag and
  * then drops all pointers to dquots from an inode.
  *
- * Each dquot has its dq_lock mutex. Locked dquots might not be referenced
- * from inodes (dquot_alloc_space() and such don't check the dq_lock).
- * Currently dquot is locked only when it is being read to memory (or space for
- * it is being allocated) on the first dqget() and when it is being released on
- * the last dqput(). The allocation and release oparations are serialized by
- * the dq_lock and by checking the use count in dquot_release().  Write
- * operations on dquots don't hold dq_lock as they copy data under dq_data_lock
- * spinlock to internal buffers before writing.
+ * Each dquot has its dq_lock mutex.  Dquot is locked when it is being read to
+ * memory (or space for it is being allocated) on the first dqget(), when it is
+ * being written out, and when it is being released on the last dqput(). The
+ * allocation and release operations are serialized by the dq_lock and by
+ * checking the use count in dquot_release().
  *
  * Lock ordering (including related VFS locks) is the following:
  *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_sem
@@ -453,21 +450,23 @@ int dquot_commit(struct dquot *dquot)
 	int ret = 0;
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
-	down_write(&dqopt->dqio_sem);
+	mutex_lock(&dquot->dq_lock);
 	spin_lock(&dq_list_lock);
 	if (!clear_dquot_dirty(dquot)) {
 		spin_unlock(&dq_list_lock);
-		goto out_sem;
+		goto out_lock;
 	}
 	spin_unlock(&dq_list_lock);
 	/* Inactive dquot can be only if there was error during read/init
 	 * => we have better not writing it */
-	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
+	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
+		down_write(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
-	else
+		up_write(&dqopt->dqio_sem);
+	} else
 		ret = -EIO;
-out_sem:
-	up_write(&dqopt->dqio_sem);
+out_lock:
+	mutex_unlock(&dquot->dq_lock);
 	return ret;
 }
 EXPORT_SYMBOL(dquot_commit);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (4 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 05/27] quota: Protect dquot writeout with dq_lock Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:46   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 07/27] quota: Remove locking for reading from the old quota format Jan Kara
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->read_dqblk() callback. It will
allow quota formats to decide whether they need it or not.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c    |  5 +----
 fs/quota/quota_v1.c |  5 ++++-
 fs/quota/quota_v2.c | 10 +++++++++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 99693c6d5dae..70b9a6afe5a8 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -403,11 +403,8 @@ int dquot_acquire(struct dquot *dquot)
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
 	mutex_lock(&dquot->dq_lock);
-	if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
-		down_read(&dqopt->dqio_sem);
+	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
-		up_read(&dqopt->dqio_sem);
-	}
 	if (ret < 0)
 		goto out_iolock;
 	/* Make sure flags update is visible after dquot has been filled */
diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index 8fe79beced5c..d534c4043237 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -56,10 +56,12 @@ static int v1_read_dqblk(struct dquot *dquot)
 {
 	int type = dquot->dq_id.type;
 	struct v1_disk_dqblk dqblk;
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
-	if (!sb_dqopt(dquot->dq_sb)->files[type])
+	if (!dqopt->files[type])
 		return -EINVAL;
 
+	down_read(&dqopt->dqio_sem);
 	/* Set structure to 0s in case read fails/is after end of file */
 	memset(&dqblk, 0, sizeof(struct v1_disk_dqblk));
 	dquot->dq_sb->s_op->quota_read(dquot->dq_sb, type, (char *)&dqblk,
@@ -73,6 +75,7 @@ static int v1_read_dqblk(struct dquot *dquot)
 	    dquot->dq_dqb.dqb_isoftlimit == 0)
 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
 	dqstats_inc(DQST_READS);
+	up_read(&dqopt->dqio_sem);
 
 	return 0;
 }
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index ca71bf881ad1..b2cd34f6b3da 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -285,7 +285,15 @@ static int v2r1_is_id(void *dp, struct dquot *dquot)
 
 static int v2_read_dquot(struct dquot *dquot)
 {
-	return qtree_read_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+	int ret;
+
+	down_read(&dqopt->dqio_sem);
+	ret = qtree_read_dquot(
+			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
+			dquot);
+	up_read(&dqopt->dqio_sem);
+	return ret;
 }
 
 static int v2_write_dquot(struct dquot *dquot)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 07/27] quota: Remove locking for reading from the old quota format
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (5 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:47   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk() Jan Kara
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

The old quota format has fixed offset in quota file based on ID so
there's no locking needed against concurrent modifications of the file
(locking against concurrent IO on the same dquot is still provided by
dq_lock).

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/quota_v1.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index d534c4043237..12d69cda57cc 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -61,7 +61,6 @@ static int v1_read_dqblk(struct dquot *dquot)
 	if (!dqopt->files[type])
 		return -EINVAL;
 
-	down_read(&dqopt->dqio_sem);
 	/* Set structure to 0s in case read fails/is after end of file */
 	memset(&dqblk, 0, sizeof(struct v1_disk_dqblk));
 	dquot->dq_sb->s_op->quota_read(dquot->dq_sb, type, (char *)&dqblk,
@@ -75,7 +74,6 @@ static int v1_read_dqblk(struct dquot *dquot)
 	    dquot->dq_dqb.dqb_isoftlimit == 0)
 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
 	dqstats_inc(DQST_READS);
-	up_read(&dqopt->dqio_sem);
 
 	return 0;
 }
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (6 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 07/27] quota: Remove locking for reading from the old quota format Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:48   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format Jan Kara
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->write_dqblk() callback. It will
allow quota formats to decide whether they need it or not.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c    | 10 ++++------
 fs/quota/quota_v1.c |  3 +++
 fs/quota/quota_v2.c | 10 +++++++++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 70b9a6afe5a8..562f5978488f 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -412,14 +412,14 @@ int dquot_acquire(struct dquot *dquot)
 	set_bit(DQ_READ_B, &dquot->dq_flags);
 	/* Instantiate dquot if needed */
 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
-		down_write(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
 		/* Write the info if needed */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
+			down_write(&dqopt->dqio_sem);
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 					dquot->dq_sb, dquot->dq_id.type);
+			up_write(&dqopt->dqio_sem);
 		}
-		up_write(&dqopt->dqio_sem);
 		if (ret < 0)
 			goto out_iolock;
 		if (ret2 < 0) {
@@ -456,11 +456,9 @@ int dquot_commit(struct dquot *dquot)
 	spin_unlock(&dq_list_lock);
 	/* Inactive dquot can be only if there was error during read/init
 	 * => we have better not writing it */
-	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
-		down_write(&dqopt->dqio_sem);
+	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
-		up_write(&dqopt->dqio_sem);
-	} else
+	else
 		ret = -EIO;
 out_lock:
 	mutex_unlock(&dquot->dq_lock);
diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index 12d69cda57cc..94cceb76b9a3 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -83,7 +83,9 @@ static int v1_commit_dqblk(struct dquot *dquot)
 	short type = dquot->dq_id.type;
 	ssize_t ret;
 	struct v1_disk_dqblk dqblk;
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
+	down_write(&dqopt->dqio_sem);
 	v1_mem2disk_dqblk(&dqblk, &dquot->dq_dqb);
 	if (((type == USRQUOTA) && uid_eq(dquot->dq_id.uid, GLOBAL_ROOT_UID)) ||
 	    ((type == GRPQUOTA) && gid_eq(dquot->dq_id.gid, GLOBAL_ROOT_GID))) {
@@ -97,6 +99,7 @@ static int v1_commit_dqblk(struct dquot *dquot)
 		ret = dquot->dq_sb->s_op->quota_write(dquot->dq_sb, type,
 			(char *)&dqblk, sizeof(struct v1_disk_dqblk),
 			v1_dqoff(from_kqid(&init_user_ns, dquot->dq_id)));
+	up_write(&dqopt->dqio_sem);
 	if (ret != sizeof(struct v1_disk_dqblk)) {
 		quota_error(dquot->dq_sb, "dquota write failed");
 		if (ret >= 0)
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index b2cd34f6b3da..482733abe4ac 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -298,7 +298,15 @@ static int v2_read_dquot(struct dquot *dquot)
 
 static int v2_write_dquot(struct dquot *dquot)
 {
-	return qtree_write_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+	int ret;
+
+	down_write(&dqopt->dqio_sem);
+	ret = qtree_write_dquot(
+			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
+			dquot);
+	up_write(&dqopt->dqio_sem);
+	return ret;
 }
 
 static int v2_release_dquot(struct dquot *dquot)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (7 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:49   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 10/27] quota: Remove locking for writing to the old quota format Jan Kara
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

When dquot has space already allocated in a quota file, we just
overwrite that place when writing dquot. So we don't need any protection
against other modifications of quota file as these keep dquot in place.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/quota_v2.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index 482733abe4ac..7a05b80f3b8c 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -300,12 +300,23 @@ static int v2_write_dquot(struct dquot *dquot)
 {
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 	int ret;
-
-	down_write(&dqopt->dqio_sem);
+	bool alloc = false;
+
+	/*
+	 * If space for dquot is already allocated, we don't need any
+	 * protection as we'll only overwrite the place of dquot. We are
+	 * still protected by concurrent writes of the same dquot by
+	 * dquot->dq_lock.
+	 */
+	if (!dquot->dq_off) {
+		alloc = true;
+		down_write(&dqopt->dqio_sem);
+	}
 	ret = qtree_write_dquot(
 			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
 			dquot);
-	up_write(&dqopt->dqio_sem);
+	if (alloc)
+		up_write(&dqopt->dqio_sem);
 	return ret;
 }
 
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/27] quota: Remove locking for writing to the old quota format
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (8 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:50   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk() Jan Kara
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

The old quota quota format has fixed offset in quota file based on ID so
there's no locking needed against concurrent modifications of the file
(locking against concurrent IO on the same dquot is still provided by
dq_lock).

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/quota_v1.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index 94cceb76b9a3..12d69cda57cc 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -83,9 +83,7 @@ static int v1_commit_dqblk(struct dquot *dquot)
 	short type = dquot->dq_id.type;
 	ssize_t ret;
 	struct v1_disk_dqblk dqblk;
-	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
-	down_write(&dqopt->dqio_sem);
 	v1_mem2disk_dqblk(&dqblk, &dquot->dq_dqb);
 	if (((type == USRQUOTA) && uid_eq(dquot->dq_id.uid, GLOBAL_ROOT_UID)) ||
 	    ((type == GRPQUOTA) && gid_eq(dquot->dq_id.gid, GLOBAL_ROOT_GID))) {
@@ -99,7 +97,6 @@ static int v1_commit_dqblk(struct dquot *dquot)
 		ret = dquot->dq_sb->s_op->quota_write(dquot->dq_sb, type,
 			(char *)&dqblk, sizeof(struct v1_disk_dqblk),
 			v1_dqoff(from_kqid(&init_user_ns, dquot->dq_id)));
-	up_write(&dqopt->dqio_sem);
 	if (ret != sizeof(struct v1_disk_dqblk)) {
 		quota_error(dquot->dq_sb, "dquota write failed");
 		if (ret >= 0)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (9 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 10/27] quota: Remove locking for writing to the old quota format Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:56   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id() Jan Kara
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->release_dqblk() callback. It
will allow quota formats to decide whether they need it or not.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c    | 4 ++--
 fs/quota/quota_v2.c | 9 ++++++++-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 562f5978488f..3b3c7f094ff8 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -478,19 +478,19 @@ int dquot_release(struct dquot *dquot)
 	/* Check whether we are not racing with some other dqget() */
 	if (atomic_read(&dquot->dq_count) > 1)
 		goto out_dqlock;
-	down_write(&dqopt->dqio_sem);
 	if (dqopt->ops[dquot->dq_id.type]->release_dqblk) {
 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
 		/* Write the info */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
+			down_write(&dqopt->dqio_sem);
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 						dquot->dq_sb, dquot->dq_id.type);
+			up_write(&dqopt->dqio_sem);
 		}
 		if (ret >= 0)
 			ret = ret2;
 	}
 	clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
-	up_write(&dqopt->dqio_sem);
 out_dqlock:
 	mutex_unlock(&dquot->dq_lock);
 	return ret;
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index 7a05b80f3b8c..8f54b159e423 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -322,7 +322,14 @@ static int v2_write_dquot(struct dquot *dquot)
 
 static int v2_release_dquot(struct dquot *dquot)
 {
-	return qtree_release_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+	int ret;
+
+	down_write(&dqopt->dqio_sem);
+	ret = qtree_release_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
+	up_write(&dqopt->dqio_sem);
+
+	return ret;
 }
 
 static int v2_free_file_info(struct super_block *sb, int type)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (10 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 17:08   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info() Jan Kara
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->get_next_id() callback. Mostly
for consistency with other operations.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c    | 6 +-----
 fs/quota/quota_v2.c | 8 +++++++-
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 3b3c7f094ff8..332f7026edad 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2067,16 +2067,12 @@ EXPORT_SYMBOL(dquot_commit_info);
 int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
 {
 	struct quota_info *dqopt = sb_dqopt(sb);
-	int err;
 
 	if (!sb_has_quota_active(sb, qid->type))
 		return -ESRCH;
 	if (!dqopt->ops[qid->type]->get_next_id)
 		return -ENOSYS;
-	down_read(&dqopt->dqio_sem);
-	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
-	up_read(&dqopt->dqio_sem);
-	return err;
+	return dqopt->ops[qid->type]->get_next_id(sb, qid);
 }
 EXPORT_SYMBOL(dquot_get_next_id);
 
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index 8f54b159e423..f82638e43c07 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -340,7 +340,13 @@ static int v2_free_file_info(struct super_block *sb, int type)
 
 static int v2_get_next_id(struct super_block *sb, struct kqid *qid)
 {
-	return qtree_get_next_id(sb_dqinfo(sb, qid->type)->dqi_priv, qid);
+	struct quota_info *dqopt = sb_dqopt(sb);
+	int ret;
+
+	down_read(&dqopt->dqio_sem);
+	ret = qtree_get_next_id(sb_dqinfo(sb, qid->type)->dqi_priv, qid);
+	up_read(&dqopt->dqio_sem);
+	return ret;
 }
 
 static const struct quota_format_ops v2_format_ops = {
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (11 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 17:33   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info() Jan Kara
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->write_file_info() callback.
Mostly for consistency with other operations.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/quota_global.c |  9 +++++++--
 fs/quota/dquot.c        | 10 +---------
 fs/quota/quota_v1.c     |  2 ++
 fs/quota/quota_v2.c     |  5 ++++-
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 4134d557a8e5..6f7c5bd3cbc6 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -443,13 +443,18 @@ static int __ocfs2_global_write_info(struct super_block *sb, int type)
 int ocfs2_global_write_info(struct super_block *sb, int type)
 {
 	int err;
-	struct ocfs2_mem_dqinfo *info = sb_dqinfo(sb, type)->dqi_priv;
+	struct quota_info *dqopt = sb_dqopt(sb);
+	struct ocfs2_mem_dqinfo *info = dqopt->info[type].dqi_priv;
 
+	down_write(&dqopt->dqio_sem);
 	err = ocfs2_qinfo_lock(info, 1);
-	if (err < 0)
+	if (err < 0) {
+		up_write(&dqopt->dqio_sem);
 		return err;
+	}
 	err = __ocfs2_global_write_info(sb, type);
 	ocfs2_qinfo_unlock(info, 1);
+	up_write(&dqopt->dqio_sem);
 	return err;
 }
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 332f7026edad..1e1ff97098ec 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -415,10 +415,8 @@ int dquot_acquire(struct dquot *dquot)
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
 		/* Write the info if needed */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
-			down_write(&dqopt->dqio_sem);
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 					dquot->dq_sb, dquot->dq_id.type);
-			up_write(&dqopt->dqio_sem);
 		}
 		if (ret < 0)
 			goto out_iolock;
@@ -482,10 +480,8 @@ int dquot_release(struct dquot *dquot)
 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
 		/* Write the info */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
-			down_write(&dqopt->dqio_sem);
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 						dquot->dq_sb, dquot->dq_id.type);
-			up_write(&dqopt->dqio_sem);
 		}
 		if (ret >= 0)
 			ret = ret2;
@@ -2054,13 +2050,9 @@ EXPORT_SYMBOL(dquot_transfer);
  */
 int dquot_commit_info(struct super_block *sb, int type)
 {
-	int ret;
 	struct quota_info *dqopt = sb_dqopt(sb);
 
-	down_write(&dqopt->dqio_sem);
-	ret = dqopt->ops[type]->write_file_info(sb, type);
-	up_write(&dqopt->dqio_sem);
-	return ret;
+	return dqopt->ops[type]->write_file_info(sb, type);
 }
 EXPORT_SYMBOL(dquot_commit_info);
 
diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index 12d69cda57cc..fe68bf544b29 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -186,6 +186,7 @@ static int v1_write_file_info(struct super_block *sb, int type)
 	struct v1_disk_dqblk dqblk;
 	int ret;
 
+	down_write(&dqopt->dqio_sem);
 	dqopt->info[type].dqi_flags &= ~DQF_INFO_DIRTY;
 	ret = sb->s_op->quota_read(sb, type, (char *)&dqblk,
 				sizeof(struct v1_disk_dqblk), v1_dqoff(0));
@@ -203,6 +204,7 @@ static int v1_write_file_info(struct super_block *sb, int type)
 	else if (ret > 0)
 		ret = -EIO;
 out:
+	up_write(&dqopt->dqio_sem);
 	return ret;
 }
 
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index f82638e43c07..5e47012d2f57 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -154,10 +154,12 @@ static int v2_read_file_info(struct super_block *sb, int type)
 static int v2_write_file_info(struct super_block *sb, int type)
 {
 	struct v2_disk_dqinfo dinfo;
-	struct mem_dqinfo *info = sb_dqinfo(sb, type);
+	struct quota_info *dqopt = sb_dqopt(sb);
+	struct mem_dqinfo *info = &dqopt->info[type];
 	struct qtree_mem_dqinfo *qinfo = info->dqi_priv;
 	ssize_t size;
 
+	down_write(&dqopt->dqio_sem);
 	spin_lock(&dq_data_lock);
 	info->dqi_flags &= ~DQF_INFO_DIRTY;
 	dinfo.dqi_bgrace = cpu_to_le32(info->dqi_bgrace);
@@ -170,6 +172,7 @@ static int v2_write_file_info(struct super_block *sb, int type)
 	dinfo.dqi_free_entry = cpu_to_le32(qinfo->dqi_free_entry);
 	size = sb->s_op->quota_write(sb, type, (char *)&dinfo,
 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
+	up_write(&dqopt->dqio_sem);
 	if (size != sizeof(struct v2_disk_dqinfo)) {
 		quota_error(sb, "Can't write info structure");
 		return -1;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (12 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 17:57   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 15/27] quota: Fix error codes in v2_read_file_info() Jan Kara
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Push down acquisition of dqio_sem into ->read_file_info() callback. This
is for consistency with other operations and it also allows us to get
rid of an ugliness in OCFS2.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/quota_local.c |  5 -----
 fs/quota/dquot.c       |  6 +-----
 fs/quota/quota_v1.c    |  2 ++
 fs/quota/quota_v2.c    | 28 ++++++++++++++++++++--------
 4 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 1829f6a45d46..2eeedcc2e9da 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -691,9 +691,6 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 	struct ocfs2_quota_recovery *rec;
 	int locked = 0;
 
-	/* We don't need the lock and we have to acquire quota file locks
-	 * which will later depend on this lock */
-	up_read(&sb_dqopt(sb)->dqio_sem);
 	info->dqi_max_spc_limit = 0x7fffffffffffffffLL;
 	info->dqi_max_ino_limit = 0x7fffffffffffffffLL;
 	oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
@@ -772,7 +769,6 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		goto out_err;
 	}
 
-	down_read(&sb_dqopt(sb)->dqio_sem);
 	return 0;
 out_err:
 	if (oinfo) {
@@ -786,7 +782,6 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
 		kfree(oinfo);
 	}
 	brelse(bh);
-	down_read(&sb_dqopt(sb)->dqio_sem);
 	return -1;
 }
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 1e1ff97098ec..5e77c4da69a6 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2313,15 +2313,11 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 	dqopt->info[type].dqi_format = fmt;
 	dqopt->info[type].dqi_fmt_id = format_id;
 	INIT_LIST_HEAD(&dqopt->info[type].dqi_dirty_list);
-	down_read(&dqopt->dqio_sem);
 	error = dqopt->ops[type]->read_file_info(sb, type);
-	if (error < 0) {
-		up_read(&dqopt->dqio_sem);
+	if (error < 0)
 		goto out_file_init;
-	}
 	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
-	up_read(&dqopt->dqio_sem);
 	spin_lock(&dq_state_lock);
 	dqopt->flags |= dquot_state_flag(flags, type);
 	spin_unlock(&dq_state_lock);
diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
index fe68bf544b29..b2d8e04e567a 100644
--- a/fs/quota/quota_v1.c
+++ b/fs/quota/quota_v1.c
@@ -161,6 +161,7 @@ static int v1_read_file_info(struct super_block *sb, int type)
 	struct v1_disk_dqblk dqblk;
 	int ret;
 
+	down_read(&dqopt->dqio_sem);
 	ret = sb->s_op->quota_read(sb, type, (char *)&dqblk,
 				sizeof(struct v1_disk_dqblk), v1_dqoff(0));
 	if (ret != sizeof(struct v1_disk_dqblk)) {
@@ -177,6 +178,7 @@ static int v1_read_file_info(struct super_block *sb, int type)
 	dqopt->info[type].dqi_bgrace =
 			dqblk.dqb_btime ? dqblk.dqb_btime : MAX_DQ_TIME;
 out:
+	up_read(&dqopt->dqio_sem);
 	return ret;
 }
 
diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index 5e47012d2f57..373d7cfea5b0 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -90,29 +90,38 @@ static int v2_read_file_info(struct super_block *sb, int type)
 {
 	struct v2_disk_dqinfo dinfo;
 	struct v2_disk_dqheader dqhead;
-	struct mem_dqinfo *info = sb_dqinfo(sb, type);
+	struct quota_info *dqopt = sb_dqopt(sb);
+	struct mem_dqinfo *info = &dqopt->info[type];
 	struct qtree_mem_dqinfo *qinfo;
 	ssize_t size;
 	unsigned int version;
+	int ret;
 
-	if (!v2_read_header(sb, type, &dqhead))
-		return -1;
+	down_read(&dqopt->dqio_sem);
+	if (!v2_read_header(sb, type, &dqhead)) {
+		ret = -1;
+		goto out;
+	}
 	version = le32_to_cpu(dqhead.dqh_version);
 	if ((info->dqi_fmt_id == QFMT_VFS_V0 && version != 0) ||
-	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1))
-		return -1;
+	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1)) {
+		ret = -1;
+		goto out;
+	}
 
 	size = sb->s_op->quota_read(sb, type, (char *)&dinfo,
 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
 	if (size != sizeof(struct v2_disk_dqinfo)) {
 		quota_error(sb, "Can't read info structure");
-		return -1;
+		ret = -1;
+		goto out;
 	}
 	info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
 	if (!info->dqi_priv) {
 		printk(KERN_WARNING
 		       "Not enough memory for quota information structure.\n");
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto out;
 	}
 	qinfo = info->dqi_priv;
 	if (version == 0) {
@@ -147,7 +156,10 @@ static int v2_read_file_info(struct super_block *sb, int type)
 		qinfo->dqi_entry_size = sizeof(struct v2r1_disk_dqblk);
 		qinfo->dqi_ops = &v2r1_qtree_ops;
 	}
-	return 0;
+	ret = 0;
+out:
+	up_read(&dqopt->dqio_sem);
+	return ret;
 }
 
 /* Write information header to quota file */
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 15/27] quota: Fix error codes in v2_read_file_info()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (13 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 18:00   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 16/27] quota: Fix possible corruption of dqi_flags Jan Kara
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

v2_read_file_info() returned -1 instead of proper error codes on error.
Luckily this is not easily visible from userspace as we have called
->check_quota_file shortly before and thus already verified the quota
file is sane. Still set the error codes to proper values.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/quota_v2.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
index 373d7cfea5b0..21a8bdf7cfec 100644
--- a/fs/quota/quota_v2.c
+++ b/fs/quota/quota_v2.c
@@ -99,13 +99,13 @@ static int v2_read_file_info(struct super_block *sb, int type)
 
 	down_read(&dqopt->dqio_sem);
 	if (!v2_read_header(sb, type, &dqhead)) {
-		ret = -1;
+		ret = -EIO;
 		goto out;
 	}
 	version = le32_to_cpu(dqhead.dqh_version);
 	if ((info->dqi_fmt_id == QFMT_VFS_V0 && version != 0) ||
 	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1)) {
-		ret = -1;
+		ret = -EINVAL;
 		goto out;
 	}
 
@@ -113,7 +113,7 @@ static int v2_read_file_info(struct super_block *sb, int type)
 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
 	if (size != sizeof(struct v2_disk_dqinfo)) {
 		quota_error(sb, "Can't read info structure");
-		ret = -1;
+		ret = -EIO;
 		goto out;
 	}
 	info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 16/27] quota: Fix possible corruption of dqi_flags
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (14 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 15/27] quota: Fix error codes in v2_read_file_info() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 18:14   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 17/27] quota: Drop return value of mark_all_dquot_dirty() Jan Kara
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

dqi_flags modifications are protected by dq_data_lock. However the
modifications in vfs_load_quota_inode() and in mark_info_dirty() were
not which could lead to corruption of dqi_flags. Since modifications to
dqi_flags are rare, this is hard to observe in practice but in theory it
could happen. Fix the problem by always using dq_data_lock for
protection.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 5e77c4da69a6..e1a155e8db15 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -389,7 +389,9 @@ static inline int clear_dquot_dirty(struct dquot *dquot)
 
 void mark_info_dirty(struct super_block *sb, int type)
 {
-	set_bit(DQF_INFO_DIRTY_B, &sb_dqopt(sb)->info[type].dqi_flags);
+	spin_lock(&dq_data_lock);
+	sb_dqopt(sb)->info[type].dqi_flags |= DQF_INFO_DIRTY;
+	spin_unlock(&dq_data_lock);
 }
 EXPORT_SYMBOL(mark_info_dirty);
 
@@ -2316,8 +2318,11 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 	error = dqopt->ops[type]->read_file_info(sb, type);
 	if (error < 0)
 		goto out_file_init;
-	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
+	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE) {
+		spin_lock(&dq_data_lock);
 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
+		spin_unlock(&dq_data_lock);
+	}
 	spin_lock(&dq_state_lock);
 	dqopt->flags |= dquot_state_flag(flags, type);
 	spin_unlock(&dq_state_lock);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 17/27] quota: Drop return value of mark_all_dquot_dirty()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (15 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 16/27] quota: Fix possible corruption of dqi_flags Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 18/27] quota: Do not dirty bad dquots Jan Kara
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Nobody uses the return value and later it would just complicate things.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index e1a155e8db15..234535ed8124 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -355,19 +355,13 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
 EXPORT_SYMBOL(dquot_mark_dquot_dirty);
 
 /* Dirtify all the dquots - this can block when journalling */
-static inline int mark_all_dquot_dirty(struct dquot * const *dquot)
+static void mark_all_dquot_dirty(struct dquot * const *dquot)
 {
-	int ret, err, cnt;
+	int cnt;
 
-	ret = err = 0;
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
 		if (dquot[cnt])
-			/* Even in case of error we have to continue */
-			ret = mark_dquot_dirty(dquot[cnt]);
-		if (!err)
-			err = ret;
-	}
-	return err;
+			mark_dquot_dirty(dquot[cnt]);
 }
 
 static inline void dqput_all(struct dquot **dquot)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 18/27] quota: Do not dirty bad dquots
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (16 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 17/27] quota: Drop return value of mark_all_dquot_dirty() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 19/27] quota: Move locking into clear_dquot_dirty() Jan Kara
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Currently we mark dirty even dquots that are not active (i.e.,
initialization or reading failed for them). Thus later we have to check
whether dirty dquot is really active and just clear the dirty bit if
not. Avoid this complication by just never marking non-active dquot as
dirty.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 234535ed8124..67f154231072 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -339,6 +339,9 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
 {
 	int ret = 1;
 
+	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
+		return 0;
+
 	/* If quota is dirty already, we don't have to acquire dq_list_lock */
 	if (test_bit(DQ_MOD_B, &dquot->dq_flags))
 		return 1;
@@ -618,11 +621,9 @@ int dquot_writeback_dquots(struct super_block *sb, int type)
 		while (!list_empty(dirty)) {
 			dquot = list_first_entry(dirty, struct dquot,
 						 dq_dirty);
-			/* Dirty and inactive can be only bad dquot... */
-			if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
-				clear_dquot_dirty(dquot);
-				continue;
-			}
+
+			WARN_ON(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags));
+
 			/* Now we have active dquot from which someone is
  			 * holding reference so we can safely just increase
 			 * use count */
@@ -753,7 +754,7 @@ void dqput(struct dquot *dquot)
 		return;
 	}
 	/* Need to release dquot? */
-	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && dquot_dirty(dquot)) {
+	if (dquot_dirty(dquot)) {
 		spin_unlock(&dq_list_lock);
 		/* Commit dquot before releasing */
 		ret = dquot->dq_sb->dq_op->write_dquot(dquot);
@@ -771,8 +772,6 @@ void dqput(struct dquot *dquot)
 		}
 		goto we_slept;
 	}
-	/* Clear flag in case dquot was inactive (something bad happened) */
-	clear_dquot_dirty(dquot);
 	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
 		spin_unlock(&dq_list_lock);
 		dquot->dq_sb->dq_op->release_dquot(dquot);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 19/27] quota: Move locking into clear_dquot_dirty()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (17 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 18/27] quota: Do not dirty bad dquots Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 18:29   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 20/27] quota: Remove dq_wait_unused from dquot Jan Kara
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Move locking of dq_list_lock into clear_dquot_dirty(). It makes the
function more self-contained and will simplify our life later.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 67f154231072..4a407f30922d 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -375,12 +375,15 @@ static inline void dqput_all(struct dquot **dquot)
 		dqput(dquot[cnt]);
 }
 
-/* This function needs dq_list_lock */
 static inline int clear_dquot_dirty(struct dquot *dquot)
 {
-	if (!test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags))
+	spin_lock(&dq_list_lock);
+	if (!test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags)) {
+		spin_unlock(&dq_list_lock);
 		return 0;
+	}
 	list_del_init(&dquot->dq_dirty);
+	spin_unlock(&dq_list_lock);
 	return 1;
 }
 
@@ -445,12 +448,8 @@ int dquot_commit(struct dquot *dquot)
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
 	mutex_lock(&dquot->dq_lock);
-	spin_lock(&dq_list_lock);
-	if (!clear_dquot_dirty(dquot)) {
-		spin_unlock(&dq_list_lock);
+	if (!clear_dquot_dirty(dquot))
 		goto out_lock;
-	}
-	spin_unlock(&dq_list_lock);
 	/* Inactive dquot can be only if there was error during read/init
 	 * => we have better not writing it */
 	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
@@ -766,9 +765,7 @@ void dqput(struct dquot *dquot)
 			 * We clear dirty bit anyway, so that we avoid
 			 * infinite loop here
 			 */
-			spin_lock(&dq_list_lock);
 			clear_dquot_dirty(dquot);
-			spin_unlock(&dq_list_lock);
 		}
 		goto we_slept;
 	}
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 20/27] quota: Remove dq_wait_unused from dquot
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (18 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 19/27] quota: Move locking into clear_dquot_dirty() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 21/27] quota: Allow disabling tracking of dirty dquots in a list Jan Kara
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Currently every dquot carries a wait_queue_head_t used only when we are
turning quotas off to wait for last users to drop dquot references.
Since such rare case is not performance sensitive in any means, just use
a global waitqueue for this and save space in struct dquot. Also convert
the logic to use wait_event() instead of open-coding it.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c      | 17 +++++++----------
 include/linux/quota.h |  1 -
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 4a407f30922d..57de84ab0ce7 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -126,6 +126,8 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(dq_data_lock);
 EXPORT_SYMBOL(dq_data_lock);
 DEFINE_STATIC_SRCU(dquot_srcu);
 
+static DECLARE_WAIT_QUEUE_HEAD(dquot_ref_wq);
+
 void __quota_error(struct super_block *sb, const char *func,
 		   const char *fmt, ...)
 {
@@ -521,22 +523,18 @@ static void invalidate_dquots(struct super_block *sb, int type)
 			continue;
 		/* Wait for dquot users */
 		if (atomic_read(&dquot->dq_count)) {
-			DEFINE_WAIT(wait);
-
 			dqgrab(dquot);
-			prepare_to_wait(&dquot->dq_wait_unused, &wait,
-					TASK_UNINTERRUPTIBLE);
 			spin_unlock(&dq_list_lock);
-			/* Once dqput() wakes us up, we know it's time to free
+			/*
+			 * Once dqput() wakes us up, we know it's time to free
 			 * the dquot.
 			 * IMPORTANT: we rely on the fact that there is always
 			 * at most one process waiting for dquot to free.
 			 * Otherwise dq_count would be > 1 and we would never
 			 * wake up.
 			 */
-			if (atomic_read(&dquot->dq_count) > 1)
-				schedule();
-			finish_wait(&dquot->dq_wait_unused, &wait);
+			wait_event(dquot_ref_wq,
+				   atomic_read(&dquot->dq_count) == 1);
 			dqput(dquot);
 			/* At this moment dquot() need not exist (it could be
 			 * reclaimed by prune_dqcache(). Hence we must
@@ -748,7 +746,7 @@ void dqput(struct dquot *dquot)
 		/* Releasing dquot during quotaoff phase? */
 		if (!sb_has_quota_active(dquot->dq_sb, dquot->dq_id.type) &&
 		    atomic_read(&dquot->dq_count) == 1)
-			wake_up(&dquot->dq_wait_unused);
+			wake_up(&dquot_ref_wq);
 		spin_unlock(&dq_list_lock);
 		return;
 	}
@@ -803,7 +801,6 @@ static struct dquot *get_empty_dquot(struct super_block *sb, int type)
 	INIT_LIST_HEAD(&dquot->dq_inuse);
 	INIT_HLIST_NODE(&dquot->dq_hash);
 	INIT_LIST_HEAD(&dquot->dq_dirty);
-	init_waitqueue_head(&dquot->dq_wait_unused);
 	dquot->dq_sb = sb;
 	dquot->dq_id = make_kqid_invalid(type);
 	atomic_set(&dquot->dq_count, 1);
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 3a6df7461642..ad6809f099ac 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -299,7 +299,6 @@ struct dquot {
 	struct list_head dq_dirty;	/* List of dirty dquots */
 	struct mutex dq_lock;		/* dquot IO lock */
 	atomic_t dq_count;		/* Use count */
-	wait_queue_head_t dq_wait_unused;	/* Wait queue for dquot to become unused */
 	struct super_block *dq_sb;	/* superblock this applies to */
 	struct kqid dq_id;		/* ID this applies to (uid, gid, projid) */
 	loff_t dq_off;			/* Offset of dquot on disk */
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 21/27] quota: Allow disabling tracking of dirty dquots in a list
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (19 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 20/27] quota: Remove dq_wait_unused from dquot Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 22/27] ext4: Disable dirty list tracking of dquots when journalling quotas Jan Kara
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Filesystems that are journalling quotas generally don't need tracking of
dirty dquots in a list since forcing a transaction commit flushes all
quotas anyway. Allow filesystem to say it doesn't want dquots to be
tracked as it reduces contention on the dq_list_lock.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c      | 6 ++++++
 include/linux/quota.h | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 57de84ab0ce7..0b3fbf75a5c9 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -344,6 +344,9 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
 		return 0;
 
+	if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NOLIST_DIRTY)
+		return test_and_set_bit(DQ_MOD_B, &dquot->dq_flags);
+
 	/* If quota is dirty already, we don't have to acquire dq_list_lock */
 	if (test_bit(DQ_MOD_B, &dquot->dq_flags))
 		return 1;
@@ -379,6 +382,9 @@ static inline void dqput_all(struct dquot **dquot)
 
 static inline int clear_dquot_dirty(struct dquot *dquot)
 {
+	if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NOLIST_DIRTY)
+		return test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags);
+
 	spin_lock(&dq_list_lock);
 	if (!test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags)) {
 		spin_unlock(&dq_list_lock);
diff --git a/include/linux/quota.h b/include/linux/quota.h
index ad6809f099ac..eccc1cb6274e 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -490,6 +490,9 @@ enum {
 						 */
 #define DQUOT_NEGATIVE_USAGE	(1 << (DQUOT_STATE_LAST + 1))
 					       /* Allow negative quota usage */
+/* Do not track dirty dquots in a list */
+#define DQUOT_NOLIST_DIRTY	(1 << (DQUOT_STATE_LAST + 2))
+
 static inline unsigned int dquot_state_flag(unsigned int flags, int type)
 {
 	return flags << type;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 22/27] ext4: Disable dirty list tracking of dquots when journalling quotas
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (20 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 21/27] quota: Allow disabling tracking of dirty dquots in a list Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 23/27] quota: Inline functions into their callsites Jan Kara
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

When journalling quotas, we writeback all dquots immediately after
changing them as part of current transation. Thus there's no need to
write anything in dquot_writeback_dquots() and so we can avoid updating
list of dirty dquots to reduce dq_list_lock contention.

This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 15% when user quota is
turned on.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8e0c27387ab7..eac9f8c50631 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5409,6 +5409,13 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
 			ext4_msg(sb, KERN_WARNING,
 				"Quota file not on filesystem root. "
 				"Journaled quota will not work");
+		sb_dqopt(sb)->flags |= DQUOT_NOLIST_DIRTY;
+	} else {
+		/*
+		 * Clear the flag just in case mount options changed since
+		 * last time.
+		 */
+		sb_dqopt(sb)->flags &= ~DQUOT_NOLIST_DIRTY;
 	}
 
 	/*
@@ -5505,7 +5512,7 @@ static int ext4_enable_quotas(struct super_block *sb)
 		test_opt(sb, PRJQUOTA),
 	};
 
-	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE;
+	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE | DQUOT_NOLIST_DIRTY;
 	for (type = 0; type < EXT4_MAXQUOTAS; type++) {
 		if (qf_inums[type]) {
 			err = ext4_quota_enable(sb, type, QFMT_VFS_V1,
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 23/27] quota: Inline functions into their callsites
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (21 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 22/27] ext4: Disable dirty list tracking of dquots when journalling quotas Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 24/27] quota: Inline inode_{incr,decr}_space() into callsites Jan Kara
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

inode_add_rsv_space() and inode_sub_rsv_space() had only one callsite.
Inline them there directly. inode_claim_rsv_space() and
inode_reclaim_rsv_space() had two callsites so inline them there as
well. This will simplify further locking changes.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c         | 72 +++++++++++++++++++-----------------------------
 include/linux/quotaops.h |  5 ----
 2 files changed, 28 insertions(+), 49 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 0b3fbf75a5c9..d9b9c046f848 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1577,40 +1577,6 @@ static qsize_t *inode_reserved_space(struct inode * inode)
 	return inode->i_sb->dq_op->get_reserved_space(inode);
 }
 
-void inode_add_rsv_space(struct inode *inode, qsize_t number)
-{
-	spin_lock(&inode->i_lock);
-	*inode_reserved_space(inode) += number;
-	spin_unlock(&inode->i_lock);
-}
-EXPORT_SYMBOL(inode_add_rsv_space);
-
-void inode_claim_rsv_space(struct inode *inode, qsize_t number)
-{
-	spin_lock(&inode->i_lock);
-	*inode_reserved_space(inode) -= number;
-	__inode_add_bytes(inode, number);
-	spin_unlock(&inode->i_lock);
-}
-EXPORT_SYMBOL(inode_claim_rsv_space);
-
-void inode_reclaim_rsv_space(struct inode *inode, qsize_t number)
-{
-	spin_lock(&inode->i_lock);
-	*inode_reserved_space(inode) += number;
-	__inode_sub_bytes(inode, number);
-	spin_unlock(&inode->i_lock);
-}
-EXPORT_SYMBOL(inode_reclaim_rsv_space);
-
-void inode_sub_rsv_space(struct inode *inode, qsize_t number)
-{
-	spin_lock(&inode->i_lock);
-	*inode_reserved_space(inode) -= number;
-	spin_unlock(&inode->i_lock);
-}
-EXPORT_SYMBOL(inode_sub_rsv_space);
-
 static qsize_t inode_get_rsv_space(struct inode *inode)
 {
 	qsize_t ret;
@@ -1626,18 +1592,24 @@ static qsize_t inode_get_rsv_space(struct inode *inode)
 static void inode_incr_space(struct inode *inode, qsize_t number,
 				int reserve)
 {
-	if (reserve)
-		inode_add_rsv_space(inode, number);
-	else
+	if (reserve) {
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) += number;
+		spin_unlock(&inode->i_lock);
+	} else {
 		inode_add_bytes(inode, number);
+	}
 }
 
 static void inode_decr_space(struct inode *inode, qsize_t number, int reserve)
 {
-	if (reserve)
-		inode_sub_rsv_space(inode, number);
-	else
+	if (reserve) {
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) -= number;
+		spin_unlock(&inode->i_lock);
+	} else {
 		inode_sub_bytes(inode, number);
+	}
 }
 
 /*
@@ -1753,7 +1725,10 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
 	int cnt, index;
 
 	if (!dquot_active(inode)) {
-		inode_claim_rsv_space(inode, number);
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) -= number;
+		__inode_add_bytes(inode, number);
+		spin_unlock(&inode->i_lock);
 		return 0;
 	}
 
@@ -1766,7 +1741,10 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
 			dquot_claim_reserved_space(dquots[cnt], number);
 	}
 	/* Update inode bytes */
-	inode_claim_rsv_space(inode, number);
+	spin_lock(&inode->i_lock);
+	*inode_reserved_space(inode) -= number;
+	__inode_add_bytes(inode, number);
+	spin_unlock(&inode->i_lock);
 	spin_unlock(&dq_data_lock);
 	mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
@@ -1783,7 +1761,10 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
 	int cnt, index;
 
 	if (!dquot_active(inode)) {
-		inode_reclaim_rsv_space(inode, number);
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) += number;
+		__inode_sub_bytes(inode, number);
+		spin_unlock(&inode->i_lock);
 		return;
 	}
 
@@ -1796,7 +1777,10 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
 			dquot_reclaim_reserved_space(dquots[cnt], number);
 	}
 	/* Update inode bytes */
-	inode_reclaim_rsv_space(inode, number);
+	spin_lock(&inode->i_lock);
+	*inode_reserved_space(inode) += number;
+	__inode_sub_bytes(inode, number);
+	spin_unlock(&inode->i_lock);
 	spin_unlock(&dq_data_lock);
 	mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index dda22f45fc1b..0ce6fc49962e 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -38,11 +38,6 @@ void __quota_error(struct super_block *sb, const char *func,
 /*
  * declaration of quota_function calls in kernel.
  */
-void inode_add_rsv_space(struct inode *inode, qsize_t number);
-void inode_claim_rsv_space(struct inode *inode, qsize_t number);
-void inode_sub_rsv_space(struct inode *inode, qsize_t number);
-void inode_reclaim_rsv_space(struct inode *inode, qsize_t number);
-
 int dquot_initialize(struct inode *inode);
 bool dquot_initialize_needed(struct inode *inode);
 void dquot_drop(struct inode *inode);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 24/27] quota: Inline inode_{incr,decr}_space() into callsites
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (22 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 23/27] quota: Inline functions into their callsites Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 15:41 ` [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite Jan Kara
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

inode_incr_space() and inode_decr_space() have only two callsites.
Inline them there as that will make locking changes simpler.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 55 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 28 insertions(+), 27 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index d9b9c046f848..038f70c1ebff 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1589,29 +1589,6 @@ static qsize_t inode_get_rsv_space(struct inode *inode)
 	return ret;
 }
 
-static void inode_incr_space(struct inode *inode, qsize_t number,
-				int reserve)
-{
-	if (reserve) {
-		spin_lock(&inode->i_lock);
-		*inode_reserved_space(inode) += number;
-		spin_unlock(&inode->i_lock);
-	} else {
-		inode_add_bytes(inode, number);
-	}
-}
-
-static void inode_decr_space(struct inode *inode, qsize_t number, int reserve)
-{
-	if (reserve) {
-		spin_lock(&inode->i_lock);
-		*inode_reserved_space(inode) -= number;
-		spin_unlock(&inode->i_lock);
-	} else {
-		inode_sub_bytes(inode, number);
-	}
-}
-
 /*
  * This functions updates i_blocks+i_bytes fields and quota information
  * (together with appropriate checks).
@@ -1633,7 +1610,13 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
 	struct dquot **dquots;
 
 	if (!dquot_active(inode)) {
-		inode_incr_space(inode, number, reserve);
+		if (reserve) {
+			spin_lock(&inode->i_lock);
+			*inode_reserved_space(inode) += number;
+			spin_unlock(&inode->i_lock);
+		} else {
+			inode_add_bytes(inode, number);
+		}
 		goto out;
 	}
 
@@ -1661,7 +1644,13 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
 		else
 			dquot_incr_space(dquots[cnt], number);
 	}
-	inode_incr_space(inode, number, reserve);
+	if (reserve) {
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) += number;
+		spin_unlock(&inode->i_lock);
+	} else {
+		inode_add_bytes(inode, number);
+	}
 	spin_unlock(&dq_data_lock);
 
 	if (reserve)
@@ -1799,7 +1788,13 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags)
 	int reserve = flags & DQUOT_SPACE_RESERVE, index;
 
 	if (!dquot_active(inode)) {
-		inode_decr_space(inode, number, reserve);
+		if (reserve) {
+			spin_lock(&inode->i_lock);
+			*inode_reserved_space(inode) -= number;
+			spin_unlock(&inode->i_lock);
+		} else {
+			inode_sub_bytes(inode, number);
+		}
 		return;
 	}
 
@@ -1820,7 +1815,13 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags)
 		else
 			dquot_decr_space(dquots[cnt], number);
 	}
-	inode_decr_space(inode, number, reserve);
+	if (reserve) {
+		spin_lock(&inode->i_lock);
+		*inode_reserved_space(inode) -= number;
+		spin_unlock(&inode->i_lock);
+	} else {
+		inode_sub_bytes(inode, number);
+	}
 	spin_unlock(&dq_data_lock);
 
 	if (reserve)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (23 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 24/27] quota: Inline inode_{incr,decr}_space() into callsites Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 19:52   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 26/27] fs: Provide __inode_get_bytes() Jan Kara
  2017-08-16 15:41 ` [PATCH 27/27] quota: Reduce contention on dq_data_lock Jan Kara
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

dquot_claim_reserved_space() and dquot_reclaim_reserved_space() have
only a single callsite. Inline them there.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 43 ++++++++++++++++++-------------------------
 1 file changed, 18 insertions(+), 25 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 038f70c1ebff..9f4cf78096e4 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1082,27 +1082,6 @@ static inline void dquot_resv_space(struct dquot *dquot, qsize_t number)
 	dquot->dq_dqb.dqb_rsvspace += number;
 }
 
-/*
- * Claim reserved quota space
- */
-static void dquot_claim_reserved_space(struct dquot *dquot, qsize_t number)
-{
-	if (dquot->dq_dqb.dqb_rsvspace < number) {
-		WARN_ON_ONCE(1);
-		number = dquot->dq_dqb.dqb_rsvspace;
-	}
-	dquot->dq_dqb.dqb_curspace += number;
-	dquot->dq_dqb.dqb_rsvspace -= number;
-}
-
-static void dquot_reclaim_reserved_space(struct dquot *dquot, qsize_t number)
-{
-	if (WARN_ON_ONCE(dquot->dq_dqb.dqb_curspace < number))
-		number = dquot->dq_dqb.dqb_curspace;
-	dquot->dq_dqb.dqb_rsvspace += number;
-	dquot->dq_dqb.dqb_curspace -= number;
-}
-
 static inline
 void dquot_free_reserved_space(struct dquot *dquot, qsize_t number)
 {
@@ -1726,8 +1705,16 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
 	spin_lock(&dq_data_lock);
 	/* Claim reserved quotas to allocated quotas */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (dquots[cnt])
-			dquot_claim_reserved_space(dquots[cnt], number);
+		if (dquots[cnt]) {
+			struct dquot *dquot = dquots[cnt];
+
+			if (dquot->dq_dqb.dqb_rsvspace < number) {
+				WARN_ON_ONCE(1);
+				number = dquot->dq_dqb.dqb_rsvspace;
+			}
+			dquot->dq_dqb.dqb_curspace += number;
+			dquot->dq_dqb.dqb_rsvspace -= number;
+		}
 	}
 	/* Update inode bytes */
 	spin_lock(&inode->i_lock);
@@ -1762,8 +1749,14 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
 	spin_lock(&dq_data_lock);
 	/* Claim reserved quotas to allocated quotas */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (dquots[cnt])
-			dquot_reclaim_reserved_space(dquots[cnt], number);
+		if (dquots[cnt]) {
+			struct dquot *dquot = dquots[cnt];
+
+			if (WARN_ON_ONCE(dquot->dq_dqb.dqb_curspace < number))
+				number = dquot->dq_dqb.dqb_curspace;
+			dquot->dq_dqb.dqb_rsvspace += number;
+			dquot->dq_dqb.dqb_curspace -= number;
+		}
 	}
 	/* Update inode bytes */
 	spin_lock(&inode->i_lock);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 26/27] fs: Provide __inode_get_bytes()
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (24 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 16:12   ` Andreas Dilger
  2017-08-16 15:41 ` [PATCH 27/27] quota: Reduce contention on dq_data_lock Jan Kara
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

Provide helper __inode_get_bytes() which assumes i_lock is already
acquired. Quota code will need this to be able to use i_lock to protect
consistency of quota accounting information and inode usage.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/stat.c          | 2 +-
 include/linux/fs.h | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/stat.c b/fs/stat.c
index c35610845ab1..8a6aa8caf891 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -710,7 +710,7 @@ loff_t inode_get_bytes(struct inode *inode)
 	loff_t ret;
 
 	spin_lock(&inode->i_lock);
-	ret = (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
+	ret = __inode_get_bytes(inode);
 	spin_unlock(&inode->i_lock);
 	return ret;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..d6e9ab7f184f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2998,6 +2998,10 @@ void __inode_add_bytes(struct inode *inode, loff_t bytes);
 void inode_add_bytes(struct inode *inode, loff_t bytes);
 void __inode_sub_bytes(struct inode *inode, loff_t bytes);
 void inode_sub_bytes(struct inode *inode, loff_t bytes);
+static inline loff_t __inode_get_bytes(struct inode *inode)
+{
+	return (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
+}
 loff_t inode_get_bytes(struct inode *inode);
 void inode_set_bytes(struct inode *inode, loff_t bytes);
 const char *simple_get_link(struct dentry *, struct inode *,
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 27/27] quota: Reduce contention on dq_data_lock
  2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
                   ` (25 preceding siblings ...)
  2017-08-16 15:41 ` [PATCH 26/27] fs: Provide __inode_get_bytes() Jan Kara
@ 2017-08-16 15:41 ` Jan Kara
  2017-08-16 20:17   ` Andreas Dilger
  26 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2017-08-16 15:41 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Jan Kara

dq_data_lock is currently used to protect all modifications of quota
accounting information, consistency of quota accounting on the inode,
and dquot pointers from inode. As a result contention on the lock can be
pretty heavy.

Reduce the contention on the lock by protecting quota accounting
information by a new dquot->dq_dqb_lock and consistency of quota
accounting with inode usage by inode->i_lock.

This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 6% when user quota is
turned on. When those 50 processes belong to 50 different users, the
improvement is about 9%.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c         |   4 +-
 fs/ocfs2/quota_global.c |   8 +-
 fs/ocfs2/quota_local.c  |   8 +-
 fs/quota/dquot.c        | 288 ++++++++++++++++++++++++++++--------------------
 fs/quota/quota_tree.c   |   8 +-
 include/linux/quota.h   |   1 +
 6 files changed, 186 insertions(+), 131 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eac9f8c50631..843264a71b08 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5194,7 +5194,7 @@ static int ext4_statfs_project(struct super_block *sb,
 	dquot = dqget(sb, qid);
 	if (IS_ERR(dquot))
 		return PTR_ERR(dquot);
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 
 	limit = (dquot->dq_dqb.dqb_bsoftlimit ?
 		 dquot->dq_dqb.dqb_bsoftlimit :
@@ -5217,7 +5217,7 @@ static int ext4_statfs_project(struct super_block *sb,
 			 (buf->f_files - dquot->dq_dqb.dqb_curinodes) : 0;
 	}
 
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	dqput(dquot);
 	return 0;
 }
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 6f7c5bd3cbc6..449a1a69e81f 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -505,7 +505,7 @@ int __ocfs2_sync_dquot(struct dquot *dquot, int freeing)
 	/* Update space and inode usage. Get also other information from
 	 * global quota file so that we don't overwrite any changes there.
 	 * We are */
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	spacechange = dquot->dq_dqb.dqb_curspace -
 					OCFS2_DQUOT(dquot)->dq_origspace;
 	inodechange = dquot->dq_dqb.dqb_curinodes -
@@ -561,7 +561,7 @@ int __ocfs2_sync_dquot(struct dquot *dquot, int freeing)
 	__clear_bit(DQ_LASTSET_B + QIF_ITIME_B, &dquot->dq_flags);
 	OCFS2_DQUOT(dquot)->dq_origspace = dquot->dq_dqb.dqb_curspace;
 	OCFS2_DQUOT(dquot)->dq_originodes = dquot->dq_dqb.dqb_curinodes;
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	err = ocfs2_qinfo_lock(info, freeing);
 	if (err < 0) {
 		mlog(ML_ERROR, "Failed to lock quota info, losing quota write"
@@ -925,10 +925,10 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
 
 	/* In case user set some limits, sync dquot immediately to global
 	 * quota file so that information propagates quicker */
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	if (dquot->dq_flags & mask)
 		sync = 1;
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	/* This is a slight hack but we can't afford getting global quota
 	 * lock if we already have a transaction started. */
 	if (!sync || journal_current_handle()) {
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 2eeedcc2e9da..aa700fd10610 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -521,7 +521,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
 				goto out_drop_lock;
 			}
 			down_write(&sb_dqopt(sb)->dqio_sem);
-			spin_lock(&dq_data_lock);
+			spin_lock(&dquot->dq_dqb_lock);
 			/* Add usage from quota entry into quota changes
 			 * of our node. Auxiliary variables are important
 			 * due to signedness */
@@ -529,7 +529,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
 			inodechange = le64_to_cpu(dqblk->dqb_inodemod);
 			dquot->dq_dqb.dqb_curspace += spacechange;
 			dquot->dq_dqb.dqb_curinodes += inodechange;
-			spin_unlock(&dq_data_lock);
+			spin_unlock(&dquot->dq_dqb_lock);
 			/* We want to drop reference held by the crashed
 			 * node. Since we have our own reference we know
 			 * global structure actually won't be freed. */
@@ -877,12 +877,12 @@ static void olq_set_dquot(struct buffer_head *bh, void *private)
 
 	dqblk->dqb_id = cpu_to_le64(from_kqid(&init_user_ns,
 					      od->dq_dquot.dq_id));
-	spin_lock(&dq_data_lock);
+	spin_lock(&od->dq_dquot.dq_dqb_lock);
 	dqblk->dqb_spacemod = cpu_to_le64(od->dq_dquot.dq_dqb.dqb_curspace -
 					  od->dq_origspace);
 	dqblk->dqb_inodemod = cpu_to_le64(od->dq_dquot.dq_dqb.dqb_curinodes -
 					  od->dq_originodes);
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&od->dq_dquot.dq_dqb_lock);
 	trace_olq_set_dquot(
 		(unsigned long long)le64_to_cpu(dqblk->dqb_spacemod),
 		(unsigned long long)le64_to_cpu(dqblk->dqb_inodemod),
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 9f4cf78096e4..b89df9bec519 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -82,16 +82,19 @@
 #include <linux/uaccess.h>
 
 /*
- * There are three quota SMP locks. dq_list_lock protects all lists with quotas
- * and quota formats.
- * dq_data_lock protects data from dq_dqb and also mem_dqinfo structures and
- * also guards consistency of dquot->dq_dqb with inode->i_blocks, i_bytes.
- * i_blocks and i_bytes updates itself are guarded by i_lock acquired directly
- * in inode_add_bytes() and inode_sub_bytes(). dq_state_lock protects
- * modifications of quota state (on quotaon and quotaoff) and readers who care
- * about latest values take it as well.
+ * There are five quota SMP locks:
+ * * dq_list_lock protects all lists with quotas and quota formats.
+ * * dquot->dq_dqb_lock protects data from dq_dqb
+ * * inode->i_lock protects inode->i_blocks, i_bytes and also guards
+ *   consistency of dquot->dq_dqb with inode->i_blocks, i_bytes so that
+ *   dquot_transfer() can stabilize amount it transfers
+ * * dq_data_lock protects mem_dqinfo structures and modifications of dquot
+ *   pointers in the inode
+ * * dq_state_lock protects modifications of quota state (on quotaon and
+ *   quotaoff) and readers who care about latest values take it as well.
  *
- * The spinlock ordering is hence: dq_data_lock > dq_list_lock > i_lock,
+ * The spinlock ordering is hence:
+ *   dq_data_lock > dq_list_lock > i_lock > dquot->dq_dqb_lock,
  *   dq_list_lock > dq_state_lock
  *
  * Note that some things (eg. sb pointer, type, id) doesn't change during
@@ -246,6 +249,7 @@ struct dqstats dqstats;
 EXPORT_SYMBOL(dqstats);
 
 static qsize_t inode_get_rsv_space(struct inode *inode);
+static qsize_t __inode_get_rsv_space(struct inode *inode);
 static int __dquot_initialize(struct inode *inode, int type);
 
 static inline unsigned int
@@ -810,6 +814,7 @@ static struct dquot *get_empty_dquot(struct super_block *sb, int type)
 	dquot->dq_sb = sb;
 	dquot->dq_id = make_kqid_invalid(type);
 	atomic_set(&dquot->dq_count, 1);
+	spin_lock_init(&dquot->dq_dqb_lock);
 
 	return dquot;
 }
@@ -1067,21 +1072,6 @@ static void drop_dquot_ref(struct super_block *sb, int type)
 	}
 }
 
-static inline void dquot_incr_inodes(struct dquot *dquot, qsize_t number)
-{
-	dquot->dq_dqb.dqb_curinodes += number;
-}
-
-static inline void dquot_incr_space(struct dquot *dquot, qsize_t number)
-{
-	dquot->dq_dqb.dqb_curspace += number;
-}
-
-static inline void dquot_resv_space(struct dquot *dquot, qsize_t number)
-{
-	dquot->dq_dqb.dqb_rsvspace += number;
-}
-
 static inline
 void dquot_free_reserved_space(struct dquot *dquot, qsize_t number)
 {
@@ -1240,21 +1230,24 @@ static int ignore_hardlimit(struct dquot *dquot)
 		!(info->dqi_flags & DQF_ROOT_SQUASH));
 }
 
-/* needs dq_data_lock */
-static int check_idq(struct dquot *dquot, qsize_t inodes,
-		     struct dquot_warn *warn)
+static int dquot_add_inodes(struct dquot *dquot, qsize_t inodes,
+			    struct dquot_warn *warn)
 {
-	qsize_t newinodes = dquot->dq_dqb.dqb_curinodes + inodes;
+	qsize_t newinodes;
+	int ret = 0;
 
+	spin_lock(&dquot->dq_dqb_lock);
+	newinodes = dquot->dq_dqb.dqb_curinodes + inodes;
 	if (!sb_has_quota_limits_enabled(dquot->dq_sb, dquot->dq_id.type) ||
 	    test_bit(DQ_FAKE_B, &dquot->dq_flags))
-		return 0;
+		goto add;
 
 	if (dquot->dq_dqb.dqb_ihardlimit &&
 	    newinodes > dquot->dq_dqb.dqb_ihardlimit &&
             !ignore_hardlimit(dquot)) {
 		prepare_warning(warn, dquot, QUOTA_NL_IHARDWARN);
-		return -EDQUOT;
+		ret = -EDQUOT;
+		goto out;
 	}
 
 	if (dquot->dq_dqb.dqb_isoftlimit &&
@@ -1263,7 +1256,8 @@ static int check_idq(struct dquot *dquot, qsize_t inodes,
 	    ktime_get_real_seconds() >= dquot->dq_dqb.dqb_itime &&
             !ignore_hardlimit(dquot)) {
 		prepare_warning(warn, dquot, QUOTA_NL_ISOFTLONGWARN);
-		return -EDQUOT;
+		ret = -EDQUOT;
+		goto out;
 	}
 
 	if (dquot->dq_dqb.dqb_isoftlimit &&
@@ -1273,30 +1267,40 @@ static int check_idq(struct dquot *dquot, qsize_t inodes,
 		dquot->dq_dqb.dqb_itime = ktime_get_real_seconds() +
 		    sb_dqopt(dquot->dq_sb)->info[dquot->dq_id.type].dqi_igrace;
 	}
+add:
+	dquot->dq_dqb.dqb_curinodes = newinodes;
 
-	return 0;
+out:
+	spin_unlock(&dquot->dq_dqb_lock);
+	return ret;
 }
 
-/* needs dq_data_lock */
-static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc,
-		     struct dquot_warn *warn)
+static int dquot_add_space(struct dquot *dquot, qsize_t space,
+			   qsize_t rsv_space, unsigned int flags,
+			   struct dquot_warn *warn)
 {
 	qsize_t tspace;
 	struct super_block *sb = dquot->dq_sb;
+	int ret = 0;
 
+	spin_lock(&dquot->dq_dqb_lock);
 	if (!sb_has_quota_limits_enabled(sb, dquot->dq_id.type) ||
 	    test_bit(DQ_FAKE_B, &dquot->dq_flags))
-		return 0;
+		goto add;
 
 	tspace = dquot->dq_dqb.dqb_curspace + dquot->dq_dqb.dqb_rsvspace
-		+ space;
+		+ space + rsv_space;
+
+	if (flags & DQUOT_SPACE_NOFAIL)
+		goto add;
 
 	if (dquot->dq_dqb.dqb_bhardlimit &&
 	    tspace > dquot->dq_dqb.dqb_bhardlimit &&
             !ignore_hardlimit(dquot)) {
-		if (!prealloc)
+		if (flags & DQUOT_SPACE_WARN)
 			prepare_warning(warn, dquot, QUOTA_NL_BHARDWARN);
-		return -EDQUOT;
+		ret = -EDQUOT;
+		goto out;
 	}
 
 	if (dquot->dq_dqb.dqb_bsoftlimit &&
@@ -1304,28 +1308,34 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc,
 	    dquot->dq_dqb.dqb_btime &&
 	    ktime_get_real_seconds() >= dquot->dq_dqb.dqb_btime &&
             !ignore_hardlimit(dquot)) {
-		if (!prealloc)
+		if (flags & DQUOT_SPACE_WARN)
 			prepare_warning(warn, dquot, QUOTA_NL_BSOFTLONGWARN);
-		return -EDQUOT;
+		ret = -EDQUOT;
+		goto out;
 	}
 
 	if (dquot->dq_dqb.dqb_bsoftlimit &&
 	    tspace > dquot->dq_dqb.dqb_bsoftlimit &&
 	    dquot->dq_dqb.dqb_btime == 0) {
-		if (!prealloc) {
+		if (flags & DQUOT_SPACE_WARN) {
 			prepare_warning(warn, dquot, QUOTA_NL_BSOFTWARN);
 			dquot->dq_dqb.dqb_btime = ktime_get_real_seconds() +
 			    sb_dqopt(sb)->info[dquot->dq_id.type].dqi_bgrace;
-		}
-		else
+		} else {
 			/*
 			 * We don't allow preallocation to exceed softlimit so exceeding will
 			 * be always printed
 			 */
-			return -EDQUOT;
+			ret = -EDQUOT;
+			goto out;
+		}
 	}
-
-	return 0;
+add:
+	dquot->dq_dqb.dqb_rsvspace += rsv_space;
+	dquot->dq_dqb.dqb_curspace += space;
+out:
+	spin_unlock(&dquot->dq_dqb_lock);
+	return ret;
 }
 
 static int info_idq_free(struct dquot *dquot, qsize_t inodes)
@@ -1460,8 +1470,15 @@ static int __dquot_initialize(struct inode *inode, int type)
 			 * did a write before quota was turned on
 			 */
 			rsv = inode_get_rsv_space(inode);
-			if (unlikely(rsv))
-				dquot_resv_space(dquots[cnt], rsv);
+			if (unlikely(rsv)) {
+				spin_lock(&inode->i_lock);
+				/* Get reservation again under proper lock */
+				rsv = __inode_get_rsv_space(inode);
+				spin_lock(&dquots[cnt]->dq_dqb_lock);
+				dquots[cnt]->dq_dqb.dqb_rsvspace += rsv;
+				spin_unlock(&dquots[cnt]->dq_dqb_lock);
+				spin_unlock(&inode->i_lock);
+			}
 		}
 	}
 out_lock:
@@ -1556,6 +1573,13 @@ static qsize_t *inode_reserved_space(struct inode * inode)
 	return inode->i_sb->dq_op->get_reserved_space(inode);
 }
 
+static qsize_t __inode_get_rsv_space(struct inode *inode)
+{
+	if (!inode->i_sb->dq_op->get_reserved_space)
+		return 0;
+	return *inode_reserved_space(inode);
+}
+
 static qsize_t inode_get_rsv_space(struct inode *inode)
 {
 	qsize_t ret;
@@ -1563,7 +1587,7 @@ static qsize_t inode_get_rsv_space(struct inode *inode)
 	if (!inode->i_sb->dq_op->get_reserved_space)
 		return 0;
 	spin_lock(&inode->i_lock);
-	ret = *inode_reserved_space(inode);
+	ret = __inode_get_rsv_space(inode);
 	spin_unlock(&inode->i_lock);
 	return ret;
 }
@@ -1604,33 +1628,41 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (!dquots[cnt])
 			continue;
-		ret = check_bdq(dquots[cnt], number,
-				!(flags & DQUOT_SPACE_WARN), &warn[cnt]);
-		if (ret && !(flags & DQUOT_SPACE_NOFAIL)) {
-			spin_unlock(&dq_data_lock);
+		if (flags & DQUOT_SPACE_RESERVE) {
+			ret = dquot_add_space(dquots[cnt], 0, number, flags,
+					      &warn[cnt]);
+		} else {
+			ret = dquot_add_space(dquots[cnt], number, 0, flags,
+					      &warn[cnt]);
+		}
+		if (ret) {
+			/* Back out changes we already did */
+			for (cnt--; cnt >= 0; cnt--) {
+				if (!dquots[cnt])
+					continue;
+				spin_lock(&dquots[cnt]->dq_dqb_lock);
+				if (flags & DQUOT_SPACE_RESERVE) {
+					dquots[cnt]->dq_dqb.dqb_rsvspace -=
+									number;
+				} else {
+					dquots[cnt]->dq_dqb.dqb_curspace -=
+									number;
+				}
+				spin_unlock(&dquots[cnt]->dq_dqb_lock);
+			}
+			spin_unlock(&inode->i_lock);
 			goto out_flush_warn;
 		}
 	}
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (!dquots[cnt])
-			continue;
-		if (reserve)
-			dquot_resv_space(dquots[cnt], number);
-		else
-			dquot_incr_space(dquots[cnt], number);
-	}
-	if (reserve) {
-		spin_lock(&inode->i_lock);
+	if (reserve)
 		*inode_reserved_space(inode) += number;
-		spin_unlock(&inode->i_lock);
-	} else {
-		inode_add_bytes(inode, number);
-	}
-	spin_unlock(&dq_data_lock);
+	else
+		__inode_add_bytes(inode, number);
+	spin_unlock(&inode->i_lock);
 
 	if (reserve)
 		goto out_flush_warn;
@@ -1659,23 +1691,26 @@ int dquot_alloc_inode(struct inode *inode)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (!dquots[cnt])
 			continue;
-		ret = check_idq(dquots[cnt], 1, &warn[cnt]);
-		if (ret)
+		ret = dquot_add_inodes(dquots[cnt], 1, &warn[cnt]);
+		if (ret) {
+			for (cnt--; cnt >= 0; cnt--) {
+				if (!dquots[cnt])
+					continue;
+				/* Back out changes we already did */
+				spin_lock(&dquots[cnt]->dq_dqb_lock);
+				dquots[cnt]->dq_dqb.dqb_curinodes--;
+				spin_unlock(&dquots[cnt]->dq_dqb_lock);
+			}
 			goto warn_put_all;
-	}
-
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (!dquots[cnt])
-			continue;
-		dquot_incr_inodes(dquots[cnt], 1);
+		}
 	}
 
 warn_put_all:
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&inode->i_lock);
 	if (ret == 0)
 		mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
@@ -1702,26 +1737,27 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	/* Claim reserved quotas to allocated quotas */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (dquots[cnt]) {
 			struct dquot *dquot = dquots[cnt];
 
+			spin_lock(&dquot->dq_dqb_lock);
 			if (dquot->dq_dqb.dqb_rsvspace < number) {
+				printk("type=%d want: %lu, have %lu, space %lu\n", cnt, (unsigned long)number, (unsigned long)dquot->dq_dqb.dqb_rsvspace, (unsigned long)dquot->dq_dqb.dqb_curspace);
 				WARN_ON_ONCE(1);
 				number = dquot->dq_dqb.dqb_rsvspace;
 			}
 			dquot->dq_dqb.dqb_curspace += number;
 			dquot->dq_dqb.dqb_rsvspace -= number;
+			spin_unlock(&dquot->dq_dqb_lock);
 		}
 	}
 	/* Update inode bytes */
-	spin_lock(&inode->i_lock);
 	*inode_reserved_space(inode) -= number;
 	__inode_add_bytes(inode, number);
 	spin_unlock(&inode->i_lock);
-	spin_unlock(&dq_data_lock);
 	mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
 	return 0;
@@ -1746,24 +1782,24 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	/* Claim reserved quotas to allocated quotas */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (dquots[cnt]) {
 			struct dquot *dquot = dquots[cnt];
 
+			spin_lock(&dquot->dq_dqb_lock);
 			if (WARN_ON_ONCE(dquot->dq_dqb.dqb_curspace < number))
 				number = dquot->dq_dqb.dqb_curspace;
 			dquot->dq_dqb.dqb_rsvspace += number;
 			dquot->dq_dqb.dqb_curspace -= number;
+			spin_unlock(&dquot->dq_dqb_lock);
 		}
 	}
 	/* Update inode bytes */
-	spin_lock(&inode->i_lock);
 	*inode_reserved_space(inode) += number;
 	__inode_sub_bytes(inode, number);
 	spin_unlock(&inode->i_lock);
-	spin_unlock(&dq_data_lock);
 	mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
 	return;
@@ -1793,13 +1829,14 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		int wtype;
 
 		warn[cnt].w_type = QUOTA_NL_NOWARN;
 		if (!dquots[cnt])
 			continue;
+		spin_lock(&dquots[cnt]->dq_dqb_lock);
 		wtype = info_bdq_free(dquots[cnt], number);
 		if (wtype != QUOTA_NL_NOWARN)
 			prepare_warning(&warn[cnt], dquots[cnt], wtype);
@@ -1807,15 +1844,13 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags)
 			dquot_free_reserved_space(dquots[cnt], number);
 		else
 			dquot_decr_space(dquots[cnt], number);
+		spin_unlock(&dquots[cnt]->dq_dqb_lock);
 	}
-	if (reserve) {
-		spin_lock(&inode->i_lock);
+	if (reserve)
 		*inode_reserved_space(inode) -= number;
-		spin_unlock(&inode->i_lock);
-	} else {
-		inode_sub_bytes(inode, number);
-	}
-	spin_unlock(&dq_data_lock);
+	else
+		__inode_sub_bytes(inode, number);
+	spin_unlock(&inode->i_lock);
 
 	if (reserve)
 		goto out_unlock;
@@ -1841,19 +1876,21 @@ void dquot_free_inode(struct inode *inode)
 
 	dquots = i_dquot(inode);
 	index = srcu_read_lock(&dquot_srcu);
-	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		int wtype;
 
 		warn[cnt].w_type = QUOTA_NL_NOWARN;
 		if (!dquots[cnt])
 			continue;
+		spin_lock(&dquots[cnt]->dq_dqb_lock);
 		wtype = info_idq_free(dquots[cnt], 1);
 		if (wtype != QUOTA_NL_NOWARN)
 			prepare_warning(&warn[cnt], dquots[cnt], wtype);
 		dquot_decr_inodes(dquots[cnt], 1);
+		spin_unlock(&dquots[cnt]->dq_dqb_lock);
 	}
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&inode->i_lock);
 	mark_all_dquot_dirty(dquots);
 	srcu_read_unlock(&dquot_srcu, index);
 	flush_warnings(warn);
@@ -1874,7 +1911,7 @@ EXPORT_SYMBOL(dquot_free_inode);
  */
 int __dquot_transfer(struct inode *inode, struct dquot **transfer_to)
 {
-	qsize_t space, cur_space;
+	qsize_t cur_space;
 	qsize_t rsv_space = 0;
 	qsize_t inode_usage = 1;
 	struct dquot *transfer_from[MAXQUOTAS] = {};
@@ -1901,14 +1938,18 @@ int __dquot_transfer(struct inode *inode, struct dquot **transfer_to)
 	}
 
 	spin_lock(&dq_data_lock);
+	spin_lock(&inode->i_lock);
 	if (IS_NOQUOTA(inode)) {	/* File without quota accounting? */
+		spin_unlock(&inode->i_lock);
 		spin_unlock(&dq_data_lock);
 		return 0;
 	}
-	cur_space = inode_get_bytes(inode);
-	rsv_space = inode_get_rsv_space(inode);
-	space = cur_space + rsv_space;
-	/* Build the transfer_from list and check the limits */
+	cur_space = __inode_get_bytes(inode);
+	rsv_space = __inode_get_rsv_space(inode);
+	/*
+	 * Build the transfer_from list, check limits, and update usage in
+	 * the target structures.
+	 */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		/*
 		 * Skip changes for same uid or gid or for turned off quota-type.
@@ -1920,28 +1961,33 @@ int __dquot_transfer(struct inode *inode, struct dquot **transfer_to)
 			continue;
 		is_valid[cnt] = 1;
 		transfer_from[cnt] = i_dquot(inode)[cnt];
-		ret = check_idq(transfer_to[cnt], inode_usage, &warn_to[cnt]);
+		ret = dquot_add_inodes(transfer_to[cnt], inode_usage,
+				       &warn_to[cnt]);
 		if (ret)
 			goto over_quota;
-		ret = check_bdq(transfer_to[cnt], space, 0, &warn_to[cnt]);
-		if (ret)
+		ret = dquot_add_space(transfer_to[cnt], cur_space, rsv_space, 0,
+				      &warn_to[cnt]);
+		if (ret) {
+			dquot_decr_inodes(transfer_to[cnt], inode_usage);
 			goto over_quota;
+		}
 	}
 
-	/*
-	 * Finally perform the needed transfer from transfer_from to transfer_to
-	 */
+	/* Decrease usage for source structures and update quota pointers */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (!is_valid[cnt])
 			continue;
 		/* Due to IO error we might not have transfer_from[] structure */
 		if (transfer_from[cnt]) {
 			int wtype;
+
+			spin_lock(&transfer_from[cnt]->dq_dqb_lock);
 			wtype = info_idq_free(transfer_from[cnt], inode_usage);
 			if (wtype != QUOTA_NL_NOWARN)
 				prepare_warning(&warn_from_inodes[cnt],
 						transfer_from[cnt], wtype);
-			wtype = info_bdq_free(transfer_from[cnt], space);
+			wtype = info_bdq_free(transfer_from[cnt],
+					      cur_space + rsv_space);
 			if (wtype != QUOTA_NL_NOWARN)
 				prepare_warning(&warn_from_space[cnt],
 						transfer_from[cnt], wtype);
@@ -1949,14 +1995,11 @@ int __dquot_transfer(struct inode *inode, struct dquot **transfer_to)
 			dquot_decr_space(transfer_from[cnt], cur_space);
 			dquot_free_reserved_space(transfer_from[cnt],
 						  rsv_space);
+			spin_unlock(&transfer_from[cnt]->dq_dqb_lock);
 		}
-
-		dquot_incr_inodes(transfer_to[cnt], inode_usage);
-		dquot_incr_space(transfer_to[cnt], cur_space);
-		dquot_resv_space(transfer_to[cnt], rsv_space);
-
 		i_dquot(inode)[cnt] = transfer_to[cnt];
 	}
+	spin_unlock(&inode->i_lock);
 	spin_unlock(&dq_data_lock);
 
 	mark_all_dquot_dirty(transfer_from);
@@ -1970,6 +2013,17 @@ int __dquot_transfer(struct inode *inode, struct dquot **transfer_to)
 			transfer_to[cnt] = transfer_from[cnt];
 	return 0;
 over_quota:
+	/* Back out changes we already did */
+	for (cnt--; cnt >= 0; cnt--) {
+		if (!is_valid[cnt])
+			continue;
+		spin_lock(&transfer_to[cnt]->dq_dqb_lock);
+		dquot_decr_inodes(transfer_to[cnt], inode_usage);
+		dquot_decr_space(transfer_to[cnt], cur_space);
+		dquot_free_reserved_space(transfer_to[cnt], rsv_space);
+		spin_unlock(&transfer_to[cnt]->dq_dqb_lock);
+	}
+	spin_unlock(&inode->i_lock);
 	spin_unlock(&dq_data_lock);
 	flush_warnings(warn_to);
 	return ret;
@@ -2520,7 +2574,7 @@ static void do_get_dqblk(struct dquot *dquot, struct qc_dqblk *di)
 	struct mem_dqblk *dm = &dquot->dq_dqb;
 
 	memset(di, 0, sizeof(*di));
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	di->d_spc_hardlimit = dm->dqb_bhardlimit;
 	di->d_spc_softlimit = dm->dqb_bsoftlimit;
 	di->d_ino_hardlimit = dm->dqb_ihardlimit;
@@ -2529,7 +2583,7 @@ static void do_get_dqblk(struct dquot *dquot, struct qc_dqblk *di)
 	di->d_ino_count = dm->dqb_curinodes;
 	di->d_spc_timer = dm->dqb_btime;
 	di->d_ino_timer = dm->dqb_itime;
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 }
 
 int dquot_get_dqblk(struct super_block *sb, struct kqid qid,
@@ -2593,7 +2647,7 @@ static int do_set_dqblk(struct dquot *dquot, struct qc_dqblk *di)
 	     (di->d_ino_hardlimit > dqi->dqi_max_ino_limit)))
 		return -ERANGE;
 
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	if (di->d_fieldmask & QC_SPACE) {
 		dm->dqb_curspace = di->d_space - dm->dqb_rsvspace;
 		check_blim = 1;
@@ -2659,7 +2713,7 @@ static int do_set_dqblk(struct dquot *dquot, struct qc_dqblk *di)
 		clear_bit(DQ_FAKE_B, &dquot->dq_flags);
 	else
 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	mark_dquot_dirty(dquot);
 
 	return 0;
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index 54f85eb2609c..bb3f59bcfcf5 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -389,9 +389,9 @@ int qtree_write_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
 			return ret;
 		}
 	}
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	info->dqi_ops->mem2disk_dqblk(ddquot, dquot);
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	ret = sb->s_op->quota_write(sb, type, ddquot, info->dqi_entry_size,
 				    dquot->dq_off);
 	if (ret != info->dqi_entry_size) {
@@ -649,14 +649,14 @@ int qtree_read_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
 		kfree(ddquot);
 		goto out;
 	}
-	spin_lock(&dq_data_lock);
+	spin_lock(&dquot->dq_dqb_lock);
 	info->dqi_ops->disk2mem_dqblk(dquot, ddquot);
 	if (!dquot->dq_dqb.dqb_bhardlimit &&
 	    !dquot->dq_dqb.dqb_bsoftlimit &&
 	    !dquot->dq_dqb.dqb_ihardlimit &&
 	    !dquot->dq_dqb.dqb_isoftlimit)
 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
-	spin_unlock(&dq_data_lock);
+	spin_unlock(&dquot->dq_dqb_lock);
 	kfree(ddquot);
 out:
 	dqstats_inc(DQST_READS);
diff --git a/include/linux/quota.h b/include/linux/quota.h
index eccc1cb6274e..4b5180efff21 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -298,6 +298,7 @@ struct dquot {
 	struct list_head dq_free;	/* Free list element */
 	struct list_head dq_dirty;	/* List of dirty dquots */
 	struct mutex dq_lock;		/* dquot IO lock */
+	spinlock_t dq_dqb_lock;		/* Lock protecting dq_dqb changes */
 	atomic_t dq_count;		/* Use count */
 	struct super_block *dq_sb;	/* superblock this applies to */
 	struct kqid dq_id;		/* ID this applies to (uid, gid, projid) */
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 26/27] fs: Provide __inode_get_bytes()
  2017-08-16 15:41 ` [PATCH 26/27] fs: Provide __inode_get_bytes() Jan Kara
@ 2017-08-16 16:12   ` Andreas Dilger
  2017-08-17 20:04     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:12 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1947 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Provide helper __inode_get_bytes() which assumes i_lock is already
> acquired. Quota code will need this to be able to use i_lock to protect
> consistency of quota accounting information and inode usage.

Minor nit - it's better to give functions like this a useful name, like
inode_get_bytes_locked(), since "__" doesn't really convey any semantic info,
and also messes up function sorting order (e.g. tag completion on inode_get_bytes
will not show __inode_get_bytes() but would show inode_get_bytes_locked()).

Cheers, Andreas

> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/stat.c          | 2 +-
> include/linux/fs.h | 4 ++++
> 2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/stat.c b/fs/stat.c
> index c35610845ab1..8a6aa8caf891 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -710,7 +710,7 @@ loff_t inode_get_bytes(struct inode *inode)
> 	loff_t ret;
> 
> 	spin_lock(&inode->i_lock);
> -	ret = (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
> +	ret = __inode_get_bytes(inode);
> 	spin_unlock(&inode->i_lock);
> 	return ret;
> }
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 6e1fd5d21248..d6e9ab7f184f 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2998,6 +2998,10 @@ void __inode_add_bytes(struct inode *inode, loff_t bytes);
> void inode_add_bytes(struct inode *inode, loff_t bytes);
> void __inode_sub_bytes(struct inode *inode, loff_t bytes);
> void inode_sub_bytes(struct inode *inode, loff_t bytes);
> +static inline loff_t __inode_get_bytes(struct inode *inode)
> +{
> +	return (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
> +}
> loff_t inode_get_bytes(struct inode *inode);
> void inode_set_bytes(struct inode *inode, loff_t bytes);
> const char *simple_get_link(struct dentry *, struct inode *,
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/27] quota: Convert dqio_mutex to rwsem
  2017-08-16 15:41 ` [PATCH 01/27] quota: Convert dqio_mutex to rwsem Jan Kara
@ 2017-08-16 16:23   ` Andreas Dilger
  2017-08-17 16:47     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:23 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 12631 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Convert dqio_mutex to rwsem and call it dqio_sem. No functional changes
> yet.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/super.c         |  4 ++--
> fs/ocfs2/quota_global.c | 20 ++++++++++----------
> fs/ocfs2/quota_local.c  | 10 +++++-----
> fs/quota/dquot.c        | 28 ++++++++++++++--------------
> fs/quota/quota_tree.c   |  2 +-
> fs/super.c              |  2 +-
> include/linux/quota.h   |  2 +-
> 7 files changed, 34 insertions(+), 34 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index d61a70e2193a..8e0c27387ab7 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5268,8 +5268,8 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
>   * Process 1                         Process 2
>   * ext4_create()                     quota_sync()
>   *   jbd2_journal_start()                  write_dquot()
> - *   dquot_initialize()                         down(dqio_mutex)
> - *     down(dqio_mutex)                    jbd2_journal_start()
> + *   dquot_initialize()                         down(dqio_sem)
> + *     down(dqio_sem)                           jbd2_journal_start()

Not a big deal, since this is a comment, but it should probably be changed
to down_write(dqio_sem) on both lines, as this deadlock wouldn't happen if
it was down_read(dqio_sem), and it is more consistent with the new usage.

That said, this comment doesn't appear to be relevant anymore.  At least I
couldn't find where in those callpaths dqio_mutex/dqio_sem is used anymore.

Cheers, Andreas

>   *
>   */
> 
> diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
> index cec495a921e3..4134d557a8e5 100644
> --- a/fs/ocfs2/quota_global.c
> +++ b/fs/ocfs2/quota_global.c
> @@ -33,7 +33,7 @@
>  * Locking of quotas with OCFS2 is rather complex. Here are rules that
>  * should be obeyed by all the functions:
>  * - any write of quota structure (either to local or global file) is protected
> - *   by dqio_mutex or dquot->dq_lock.
> + *   by dqio_sem or dquot->dq_lock.
>  * - any modification of global quota file holds inode cluster lock, i_mutex,
>  *   and ip_alloc_sem of the global quota file (achieved by
>  *   ocfs2_lock_global_qf). It also has to hold qinfo_lock.
> @@ -42,9 +42,9 @@
>  *
>  * A rough sketch of locking dependencies (lf = local file, gf = global file):
>  * Normal filesystem operation:
> - *   start_trans -> dqio_mutex -> write to lf
> + *   start_trans -> dqio_sem -> write to lf
>  * Syncing of local and global file:
> - *   ocfs2_lock_global_qf -> start_trans -> dqio_mutex -> qinfo_lock ->
> + *   ocfs2_lock_global_qf -> start_trans -> dqio_sem -> qinfo_lock ->
>  *     write to gf
>  *						       -> write to lf
>  * Acquire dquot for the first time:
> @@ -60,7 +60,7 @@
>  * Recovery:
>  *   inode cluster lock of recovered lf
>  *     -> read bitmaps -> ip_alloc_sem of lf
> - *     -> ocfs2_lock_global_qf -> start_trans -> dqio_mutex -> qinfo_lock ->
> + *     -> ocfs2_lock_global_qf -> start_trans -> dqio_sem -> qinfo_lock ->
>  *        write to gf
>  */
> 
> @@ -611,7 +611,7 @@ static int ocfs2_sync_dquot_helper(struct dquot *dquot, unsigned long type)
> 		mlog_errno(status);
> 		goto out_ilock;
> 	}
> -	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
> +	down_write(&sb_dqopt(sb)->dqio_sem);
> 	status = ocfs2_sync_dquot(dquot);
> 	if (status < 0)
> 		mlog_errno(status);
> @@ -619,7 +619,7 @@ static int ocfs2_sync_dquot_helper(struct dquot *dquot, unsigned long type)
> 	status = ocfs2_local_write_dquot(dquot);
> 	if (status < 0)
> 		mlog_errno(status);
> -	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
> +	up_write(&sb_dqopt(sb)->dqio_sem);
> 	ocfs2_commit_trans(osb, handle);
> out_ilock:
> 	ocfs2_unlock_global_qf(oinfo, 1);
> @@ -666,9 +666,9 @@ static int ocfs2_write_dquot(struct dquot *dquot)
> 		mlog_errno(status);
> 		goto out;
> 	}
> -	mutex_lock(&sb_dqopt(dquot->dq_sb)->dqio_mutex);
> +	down_write(&sb_dqopt(dquot->dq_sb)->dqio_sem);
> 	status = ocfs2_local_write_dquot(dquot);
> -	mutex_unlock(&sb_dqopt(dquot->dq_sb)->dqio_mutex);
> +	up_write(&sb_dqopt(dquot->dq_sb)->dqio_sem);
> 	ocfs2_commit_trans(osb, handle);
> out:
> 	return status;
> @@ -939,7 +939,7 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
> 		mlog_errno(status);
> 		goto out_ilock;
> 	}
> -	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
> +	down_write(&sb_dqopt(sb)->dqio_sem);
> 	status = ocfs2_sync_dquot(dquot);
> 	if (status < 0) {
> 		mlog_errno(status);
> @@ -948,7 +948,7 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
> 	/* Now write updated local dquot structure */
> 	status = ocfs2_local_write_dquot(dquot);
> out_dlock:
> -	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
> +	up_write(&sb_dqopt(sb)->dqio_sem);
> 	ocfs2_commit_trans(osb, handle);
> out_ilock:
> 	ocfs2_unlock_global_qf(oinfo, 1);
> diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
> index 32c5a40c1257..1311eff1c050 100644
> --- a/fs/ocfs2/quota_local.c
> +++ b/fs/ocfs2/quota_local.c
> @@ -520,7 +520,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
> 				mlog_errno(status);
> 				goto out_drop_lock;
> 			}
> -			mutex_lock(&sb_dqopt(sb)->dqio_mutex);
> +			down_write(&sb_dqopt(sb)->dqio_sem);
> 			spin_lock(&dq_data_lock);
> 			/* Add usage from quota entry into quota changes
> 			 * of our node. Auxiliary variables are important
> @@ -553,7 +553,7 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,
> 			unlock_buffer(qbh);
> 			ocfs2_journal_dirty(handle, qbh);
> out_commit:
> -			mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
> +			up_write(&sb_dqopt(sb)->dqio_sem);
> 			ocfs2_commit_trans(OCFS2_SB(sb), handle);
> out_drop_lock:
> 			ocfs2_unlock_global_qf(oinfo, 1);
> @@ -693,7 +693,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 
> 	/* We don't need the lock and we have to acquire quota file locks
> 	 * which will later depend on this lock */
> -	mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
> +	up_write(&sb_dqopt(sb)->dqio_sem);
> 	info->dqi_max_spc_limit = 0x7fffffffffffffffLL;
> 	info->dqi_max_ino_limit = 0x7fffffffffffffffLL;
> 	oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
> @@ -772,7 +772,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 		goto out_err;
> 	}
> 
> -	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
> +	down_write(&sb_dqopt(sb)->dqio_sem);
> 	return 0;
> out_err:
> 	if (oinfo) {
> @@ -786,7 +786,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 		kfree(oinfo);
> 	}
> 	brelse(bh);
> -	mutex_lock(&sb_dqopt(sb)->dqio_mutex);
> +	down_write(&sb_dqopt(sb)->dqio_sem);
> 	return -1;
> }
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 53a17496c5c5..29d447598642 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -120,7 +120,7 @@
>  * spinlock to internal buffers before writing.
>  *
>  * Lock ordering (including related VFS locks) is the following:
> - *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_mutex
> + *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_sem
>  */
> 
> static __cacheline_aligned_in_smp DEFINE_SPINLOCK(dq_list_lock);
> @@ -406,7 +406,7 @@ int dquot_acquire(struct dquot *dquot)
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> 	mutex_lock(&dquot->dq_lock);
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
> 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
> 	if (ret < 0)
> @@ -436,7 +436,7 @@ int dquot_acquire(struct dquot *dquot)
> 	smp_mb__before_atomic();
> 	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
> out_iolock:
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> 	mutex_unlock(&dquot->dq_lock);
> 	return ret;
> }
> @@ -450,7 +450,7 @@ int dquot_commit(struct dquot *dquot)
> 	int ret = 0;
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	spin_lock(&dq_list_lock);
> 	if (!clear_dquot_dirty(dquot)) {
> 		spin_unlock(&dq_list_lock);
> @@ -464,7 +464,7 @@ int dquot_commit(struct dquot *dquot)
> 	else
> 		ret = -EIO;
> out_sem:
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> 	return ret;
> }
> EXPORT_SYMBOL(dquot_commit);
> @@ -481,7 +481,7 @@ int dquot_release(struct dquot *dquot)
> 	/* Check whether we are not racing with some other dqget() */
> 	if (atomic_read(&dquot->dq_count) > 1)
> 		goto out_dqlock;
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	if (dqopt->ops[dquot->dq_id.type]->release_dqblk) {
> 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
> 		/* Write the info */
> @@ -493,7 +493,7 @@ int dquot_release(struct dquot *dquot)
> 			ret = ret2;
> 	}
> 	clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> out_dqlock:
> 	mutex_unlock(&dquot->dq_lock);
> 	return ret;
> @@ -2060,9 +2060,9 @@ int dquot_commit_info(struct super_block *sb, int type)
> 	int ret;
> 	struct quota_info *dqopt = sb_dqopt(sb);
> 
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	ret = dqopt->ops[type]->write_file_info(sb, type);
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> 	return ret;
> }
> EXPORT_SYMBOL(dquot_commit_info);
> @@ -2076,9 +2076,9 @@ int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
> 		return -ESRCH;
> 	if (!dqopt->ops[qid->type]->get_next_id)
> 		return -ENOSYS;
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> 	return err;
> }
> EXPORT_SYMBOL(dquot_get_next_id);
> @@ -2328,15 +2328,15 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
> 	dqopt->info[type].dqi_format = fmt;
> 	dqopt->info[type].dqi_fmt_id = format_id;
> 	INIT_LIST_HEAD(&dqopt->info[type].dqi_dirty_list);
> -	mutex_lock(&dqopt->dqio_mutex);
> +	down_write(&dqopt->dqio_sem);
> 	error = dqopt->ops[type]->read_file_info(sb, type);
> 	if (error < 0) {
> -		mutex_unlock(&dqopt->dqio_mutex);
> +		up_write(&dqopt->dqio_sem);
> 		goto out_file_init;
> 	}
> 	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
> 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
> -	mutex_unlock(&dqopt->dqio_mutex);
> +	up_write(&dqopt->dqio_sem);
> 	spin_lock(&dq_state_lock);
> 	dqopt->flags |= dquot_state_flag(flags, type);
> 	spin_unlock(&dq_state_lock);
> diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
> index 0738972e8d3f..54f85eb2609c 100644
> --- a/fs/quota/quota_tree.c
> +++ b/fs/quota/quota_tree.c
> @@ -379,7 +379,7 @@ int qtree_write_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
> 	if (!ddquot)
> 		return -ENOMEM;
> 
> -	/* dq_off is guarded by dqio_mutex */
> +	/* dq_off is guarded by dqio_sem */
> 	if (!dquot->dq_off) {
> 		ret = dq_insert_tree(info, dquot);
> 		if (ret < 0) {
> diff --git a/fs/super.c b/fs/super.c
> index 6bc3352adcf3..221cfa1f4e92 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -242,7 +242,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> 	atomic_set(&s->s_active, 1);
> 	mutex_init(&s->s_vfs_rename_mutex);
> 	lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
> -	mutex_init(&s->s_dquot.dqio_mutex);
> +	init_rwsem(&s->s_dquot.dqio_sem);
> 	s->s_maxbytes = MAX_NON_LFS;
> 	s->s_op = &default_op;
> 	s->s_time_gran = 1000000000;
> diff --git a/include/linux/quota.h b/include/linux/quota.h
> index bfd077ca6ac3..3a6df7461642 100644
> --- a/include/linux/quota.h
> +++ b/include/linux/quota.h
> @@ -521,7 +521,7 @@ static inline void quota_send_warning(struct kqid qid, dev_t dev,
> 
> struct quota_info {
> 	unsigned int flags;			/* Flags for diskquotas on this device */
> -	struct mutex dqio_mutex;		/* lock device while I/O in progress */
> +	struct rw_semaphore dqio_sem;		/* Lock quota file while I/O in progress */
> 	struct inode *files[MAXQUOTAS];		/* inodes of quotafiles */
> 	struct mem_dqinfo info[MAXQUOTAS];	/* Information for each quota type */
> 	const struct quota_format_ops *ops[MAXQUOTAS];	/* Operations for each type */
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire()
  2017-08-16 15:41 ` [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire() Jan Kara
@ 2017-08-16 16:35   ` Andreas Dilger
  2017-08-17 16:58     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:35 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2592 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> We need dqio_sem held just for reading when calling ->read_dqblk() in
> dquot_acquire(). Also dqio_sem is not needed when manipulating dquot
> flags (those are protected by dq_lock)

Nothing against the patch itself, but this comment is a bit confusing.

It would be good to list this dq_lock dependency at the dq_flags declaration.
Looking through the code that accesses dq_flags, I see dquot_mark_dquot_dirty(),
clear_dquot_dirty(), dquot_commit(), dqput() depend on dq_list_lock, but
I don't see many of the users besides dquot_acquire() and dquot_release()
getting dq_lock, only using the atomic bit operations on dq_flags.

Cheers, Andreas

> so acquire and release it closer to the place where it is needed. This
> reduces lock hold time and will make locking changes easier.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/quota/dquot.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 29d447598642..21358f31923d 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -406,9 +406,11 @@ int dquot_acquire(struct dquot *dquot)
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> 	mutex_lock(&dquot->dq_lock);
> -	down_write(&dqopt->dqio_sem);
> -	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
> +	if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
> +		down_read(&dqopt->dqio_sem);
> 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
> +		up_read(&dqopt->dqio_sem);
> +	}
> 	if (ret < 0)
> 		goto out_iolock;
> 	/* Make sure flags update is visible after dquot has been filled */
> @@ -416,12 +418,14 @@ int dquot_acquire(struct dquot *dquot)
> 	set_bit(DQ_READ_B, &dquot->dq_flags);
> 	/* Instantiate dquot if needed */
> 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
> +		down_write(&dqopt->dqio_sem);
> 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
> 		/* Write the info if needed */
> 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
> 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
> 					dquot->dq_sb, dquot->dq_id.type);
> 		}
> +		up_write(&dqopt->dqio_sem);
> 		if (ret < 0)
> 			goto out_iolock;
> 		if (ret2 < 0) {
> @@ -436,7 +440,6 @@ int dquot_acquire(struct dquot *dquot)
> 	smp_mb__before_atomic();
> 	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
> out_iolock:
> -	up_write(&dqopt->dqio_sem);
> 	mutex_unlock(&dquot->dq_lock);
> 	return ret;
> }
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id()
  2017-08-16 15:41 ` [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id() Jan Kara
@ 2017-08-16 16:37   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:37 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1002 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> dquot_get_next_id() needs dqio_sem only for reading to protect against
> racing with modification of quota file structure.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/dquot.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 21358f31923d..8d5ccad3bf3e 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -2079,9 +2079,9 @@ int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
> 		return -ESRCH;
> 	if (!dqopt->ops[qid->type]->get_next_id)
> 		return -ENOSYS;
> -	down_write(&dqopt->dqio_sem);
> +	down_read(&dqopt->dqio_sem);
> 	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
> -	up_write(&dqopt->dqio_sem);
> +	up_read(&dqopt->dqio_sem);
> 	return err;
> }
> EXPORT_SYMBOL(dquot_get_next_id);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode()
  2017-08-16 15:41 ` [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode() Jan Kara
@ 2017-08-16 16:38   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:38 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2685 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> vfs_load_quota_inode() needs dqio_sem only for reading. In fact dqio_sem
> is not needed there at all since the function can be called only during
> quota on when quota file cannot be modified but let's leave the
> protection there since it is logical and the path is in no way
> performance critical.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/ocfs2/quota_local.c | 6 +++---
> fs/quota/dquot.c       | 6 +++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
> index 1311eff1c050..1829f6a45d46 100644
> --- a/fs/ocfs2/quota_local.c
> +++ b/fs/ocfs2/quota_local.c
> @@ -693,7 +693,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 
> 	/* We don't need the lock and we have to acquire quota file locks
> 	 * which will later depend on this lock */
> -	up_write(&sb_dqopt(sb)->dqio_sem);
> +	up_read(&sb_dqopt(sb)->dqio_sem);
> 	info->dqi_max_spc_limit = 0x7fffffffffffffffLL;
> 	info->dqi_max_ino_limit = 0x7fffffffffffffffLL;
> 	oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
> @@ -772,7 +772,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 		goto out_err;
> 	}
> 
> -	down_write(&sb_dqopt(sb)->dqio_sem);
> +	down_read(&sb_dqopt(sb)->dqio_sem);
> 	return 0;
> out_err:
> 	if (oinfo) {
> @@ -786,7 +786,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
> 		kfree(oinfo);
> 	}
> 	brelse(bh);
> -	down_write(&sb_dqopt(sb)->dqio_sem);
> +	down_read(&sb_dqopt(sb)->dqio_sem);
> 	return -1;
> }
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 8d5ccad3bf3e..3852a3c79ac9 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -2331,15 +2331,15 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
> 	dqopt->info[type].dqi_format = fmt;
> 	dqopt->info[type].dqi_fmt_id = format_id;
> 	INIT_LIST_HEAD(&dqopt->info[type].dqi_dirty_list);
> -	down_write(&dqopt->dqio_sem);
> +	down_read(&dqopt->dqio_sem);
> 	error = dqopt->ops[type]->read_file_info(sb, type);
> 	if (error < 0) {
> -		up_write(&dqopt->dqio_sem);
> +		up_read(&dqopt->dqio_sem);
> 		goto out_file_init;
> 	}
> 	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
> 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
> -	up_write(&dqopt->dqio_sem);
> +	up_read(&dqopt->dqio_sem);
> 	spin_lock(&dq_state_lock);
> 	dqopt->flags |= dquot_state_flag(flags, type);
> 	spin_unlock(&dq_state_lock);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 05/27] quota: Protect dquot writeout with dq_lock
  2017-08-16 15:41 ` [PATCH 05/27] quota: Protect dquot writeout with dq_lock Jan Kara
@ 2017-08-16 16:44   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 3135 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Currently dquot writeout is only protected by dqio_sem held for writing.
> As we transition to a finer grained locking we will use dquot->dq_lock
> instead. So acquire it in dquot_commit() and move dqio_sem just around
> ->commit_dqblk() call as it is still needed to serialize quota file
> changes.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/quota/dquot.c | 27 +++++++++++++--------------
> 1 file changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 3852a3c79ac9..99693c6d5dae 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -110,14 +110,11 @@
>  * sure they cannot race with quotaon which first sets S_NOQUOTA flag and
>  * then drops all pointers to dquots from an inode.
>  *
> - * Each dquot has its dq_lock mutex. Locked dquots might not be referenced
> - * from inodes (dquot_alloc_space() and such don't check the dq_lock).
> - * Currently dquot is locked only when it is being read to memory (or space for
> - * it is being allocated) on the first dqget() and when it is being released on
> - * the last dqput(). The allocation and release oparations are serialized by
> - * the dq_lock and by checking the use count in dquot_release().  Write
> - * operations on dquots don't hold dq_lock as they copy data under dq_data_lock
> - * spinlock to internal buffers before writing.
> + * Each dquot has its dq_lock mutex.  Dquot is locked when it is being read to
> + * memory (or space for it is being allocated) on the first dqget(), when it is
> + * being written out, and when it is being released on the last dqput(). The
> + * allocation and release operations are serialized by the dq_lock and by
> + * checking the use count in dquot_release().
>  *
>  * Lock ordering (including related VFS locks) is the following:
>  *   s_umount > i_mutex > journal_lock > dquot->dq_lock > dqio_sem
> @@ -453,21 +450,23 @@ int dquot_commit(struct dquot *dquot)
> 	int ret = 0;
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> -	down_write(&dqopt->dqio_sem);
> +	mutex_lock(&dquot->dq_lock);
> 	spin_lock(&dq_list_lock);
> 	if (!clear_dquot_dirty(dquot)) {
> 		spin_unlock(&dq_list_lock);
> -		goto out_sem;
> +		goto out_lock;
> 	}
> 	spin_unlock(&dq_list_lock);
> 	/* Inactive dquot can be only if there was error during read/init
> 	 * => we have better not writing it */
> -	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
> +	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
> +		down_write(&dqopt->dqio_sem);
> 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
> -	else
> +		up_write(&dqopt->dqio_sem);
> +	} else
> 		ret = -EIO;

(style) typically braces are put around both branches of an if/else if either
one needs it.

Otherwise, looks fine to me.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> -out_sem:
> -	up_write(&dqopt->dqio_sem);
> +out_lock:
> +	mutex_unlock(&dquot->dq_lock);
> 	return ret;
> }
> EXPORT_SYMBOL(dquot_commit);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk()
  2017-08-16 15:41 ` [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk() Jan Kara
@ 2017-08-16 16:46   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:46 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2805 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->read_dqblk() callback. It will
> allow quota formats to decide whether they need it or not.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/dquot.c    |  5 +----
> fs/quota/quota_v1.c |  5 ++++-
> fs/quota/quota_v2.c | 10 +++++++++-
> 3 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 99693c6d5dae..70b9a6afe5a8 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -403,11 +403,8 @@ int dquot_acquire(struct dquot *dquot)
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> 	mutex_lock(&dquot->dq_lock);
> -	if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
> -		down_read(&dqopt->dqio_sem);
> +	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
> 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
> -		up_read(&dqopt->dqio_sem);
> -	}
> 	if (ret < 0)
> 		goto out_iolock;
> 	/* Make sure flags update is visible after dquot has been filled */
> diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
> index 8fe79beced5c..d534c4043237 100644
> --- a/fs/quota/quota_v1.c
> +++ b/fs/quota/quota_v1.c
> @@ -56,10 +56,12 @@ static int v1_read_dqblk(struct dquot *dquot)
> {
> 	int type = dquot->dq_id.type;
> 	struct v1_disk_dqblk dqblk;
> +	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> -	if (!sb_dqopt(dquot->dq_sb)->files[type])
> +	if (!dqopt->files[type])
> 		return -EINVAL;
> 
> +	down_read(&dqopt->dqio_sem);
> 	/* Set structure to 0s in case read fails/is after end of file */
> 	memset(&dqblk, 0, sizeof(struct v1_disk_dqblk));
> 	dquot->dq_sb->s_op->quota_read(dquot->dq_sb, type, (char *)&dqblk,
> @@ -73,6 +75,7 @@ static int v1_read_dqblk(struct dquot *dquot)
> 	    dquot->dq_dqb.dqb_isoftlimit == 0)
> 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
> 	dqstats_inc(DQST_READS);
> +	up_read(&dqopt->dqio_sem);
> 
> 	return 0;
> }
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index ca71bf881ad1..b2cd34f6b3da 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -285,7 +285,15 @@ static int v2r1_is_id(void *dp, struct dquot *dquot)
> 
> static int v2_read_dquot(struct dquot *dquot)
> {
> -	return qtree_read_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
> +	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> +	int ret;
> +
> +	down_read(&dqopt->dqio_sem);
> +	ret = qtree_read_dquot(
> +			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
> +			dquot);
> +	up_read(&dqopt->dqio_sem);
> +	return ret;
> }
> 
> static int v2_write_dquot(struct dquot *dquot)
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/27] quota: Remove locking for reading from the old quota format
  2017-08-16 15:41 ` [PATCH 07/27] quota: Remove locking for reading from the old quota format Jan Kara
@ 2017-08-16 16:47   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:47 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1219 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> The old quota format has fixed offset in quota file based on ID so
> there's no locking needed against concurrent modifications of the file
> (locking against concurrent IO on the same dquot is still provided by
> dq_lock).
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/quota_v1.c | 2 --
> 1 file changed, 2 deletions(-)
> 
> diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
> index d534c4043237..12d69cda57cc 100644
> --- a/fs/quota/quota_v1.c
> +++ b/fs/quota/quota_v1.c
> @@ -61,7 +61,6 @@ static int v1_read_dqblk(struct dquot *dquot)
> 	if (!dqopt->files[type])
> 		return -EINVAL;
> 
> -	down_read(&dqopt->dqio_sem);
> 	/* Set structure to 0s in case read fails/is after end of file */
> 	memset(&dqblk, 0, sizeof(struct v1_disk_dqblk));
> 	dquot->dq_sb->s_op->quota_read(dquot->dq_sb, type, (char *)&dqblk,
> @@ -75,7 +74,6 @@ static int v1_read_dqblk(struct dquot *dquot)
> 	    dquot->dq_dqb.dqb_isoftlimit == 0)
> 		set_bit(DQ_FAKE_B, &dquot->dq_flags);
> 	dqstats_inc(DQST_READS);
> -	up_read(&dqopt->dqio_sem);
> 
> 	return 0;
> }
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk()
  2017-08-16 15:41 ` [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk() Jan Kara
@ 2017-08-16 16:48   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 3678 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->write_dqblk() callback. It will
> allow quota formats to decide whether they need it or not.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/quota/dquot.c    | 10 ++++------
> fs/quota/quota_v1.c |  3 +++
> fs/quota/quota_v2.c | 10 +++++++++-
> 3 files changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 70b9a6afe5a8..562f5978488f 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -412,14 +412,14 @@ int dquot_acquire(struct dquot *dquot)
> 	set_bit(DQ_READ_B, &dquot->dq_flags);
> 	/* Instantiate dquot if needed */
> 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
> -		down_write(&dqopt->dqio_sem);
> 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
> 		/* Write the info if needed */
> 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
> +			down_write(&dqopt->dqio_sem);
> 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
> 					dquot->dq_sb, dquot->dq_id.type);
> +			up_write(&dqopt->dqio_sem);
> 		}
> -		up_write(&dqopt->dqio_sem);
> 		if (ret < 0)
> 			goto out_iolock;
> 		if (ret2 < 0) {
> @@ -456,11 +456,9 @@ int dquot_commit(struct dquot *dquot)
> 	spin_unlock(&dq_list_lock);
> 	/* Inactive dquot can be only if there was error during read/init
> 	 * => we have better not writing it */
> -	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
> -		down_write(&dqopt->dqio_sem);
> +	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
> 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
> -		up_write(&dqopt->dqio_sem);
> -	} else
> +	else

Ah, I see you removed {} added in previous patch, ignore my previous comment.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> 		ret = -EIO;
> out_lock:
> 	mutex_unlock(&dquot->dq_lock);
> diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
> index 12d69cda57cc..94cceb76b9a3 100644
> --- a/fs/quota/quota_v1.c
> +++ b/fs/quota/quota_v1.c
> @@ -83,7 +83,9 @@ static int v1_commit_dqblk(struct dquot *dquot)
> 	short type = dquot->dq_id.type;
> 	ssize_t ret;
> 	struct v1_disk_dqblk dqblk;
> +	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> +	down_write(&dqopt->dqio_sem);
> 	v1_mem2disk_dqblk(&dqblk, &dquot->dq_dqb);
> 	if (((type == USRQUOTA) && uid_eq(dquot->dq_id.uid, GLOBAL_ROOT_UID)) ||
> 	    ((type == GRPQUOTA) && gid_eq(dquot->dq_id.gid, GLOBAL_ROOT_GID))) {
> @@ -97,6 +99,7 @@ static int v1_commit_dqblk(struct dquot *dquot)
> 		ret = dquot->dq_sb->s_op->quota_write(dquot->dq_sb, type,
> 			(char *)&dqblk, sizeof(struct v1_disk_dqblk),
> 			v1_dqoff(from_kqid(&init_user_ns, dquot->dq_id)));
> +	up_write(&dqopt->dqio_sem);
> 	if (ret != sizeof(struct v1_disk_dqblk)) {
> 		quota_error(dquot->dq_sb, "dquota write failed");
> 		if (ret >= 0)
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index b2cd34f6b3da..482733abe4ac 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -298,7 +298,15 @@ static int v2_read_dquot(struct dquot *dquot)
> 
> static int v2_write_dquot(struct dquot *dquot)
> {
> -	return qtree_write_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
> +	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> +	int ret;
> +
> +	down_write(&dqopt->dqio_sem);
> +	ret = qtree_write_dquot(
> +			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
> +			dquot);
> +	up_write(&dqopt->dqio_sem);
> +	return ret;
> }
> 
> static int v2_release_dquot(struct dquot *dquot)
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format
  2017-08-16 15:41 ` [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format Jan Kara
@ 2017-08-16 16:49   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:49 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> When dquot has space already allocated in a quota file, we just
> overwrite that place when writing dquot. So we don't need any protection
> against other modifications of quota file as these keep dquot in place.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/quota_v2.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index 482733abe4ac..7a05b80f3b8c 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -300,12 +300,23 @@ static int v2_write_dquot(struct dquot *dquot)
> {
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 	int ret;
> -
> -	down_write(&dqopt->dqio_sem);
> +	bool alloc = false;
> +
> +	/*
> +	 * If space for dquot is already allocated, we don't need any
> +	 * protection as we'll only overwrite the place of dquot. We are
> +	 * still protected by concurrent writes of the same dquot by
> +	 * dquot->dq_lock.
> +	 */
> +	if (!dquot->dq_off) {
> +		alloc = true;
> +		down_write(&dqopt->dqio_sem);
> +	}
> 	ret = qtree_write_dquot(
> 			sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv,
> 			dquot);
> -	up_write(&dqopt->dqio_sem);
> +	if (alloc)
> +		up_write(&dqopt->dqio_sem);
> 	return ret;
> }
> 
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 10/27] quota: Remove locking for writing to the old quota format
  2017-08-16 15:41 ` [PATCH 10/27] quota: Remove locking for writing to the old quota format Jan Kara
@ 2017-08-16 16:50   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:50 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1483 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> The old quota quota format has fixed offset in quota file based on ID so
> there's no locking needed against concurrent modifications of the file
> (locking against concurrent IO on the same dquot is still provided by
> dq_lock).
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/quota_v1.c | 3 ---
> 1 file changed, 3 deletions(-)
> 
> diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
> index 94cceb76b9a3..12d69cda57cc 100644
> --- a/fs/quota/quota_v1.c
> +++ b/fs/quota/quota_v1.c
> @@ -83,9 +83,7 @@ static int v1_commit_dqblk(struct dquot *dquot)
> 	short type = dquot->dq_id.type;
> 	ssize_t ret;
> 	struct v1_disk_dqblk dqblk;
> -	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> -	down_write(&dqopt->dqio_sem);
> 	v1_mem2disk_dqblk(&dqblk, &dquot->dq_dqb);
> 	if (((type == USRQUOTA) && uid_eq(dquot->dq_id.uid, GLOBAL_ROOT_UID)) ||
> 	    ((type == GRPQUOTA) && gid_eq(dquot->dq_id.gid, GLOBAL_ROOT_GID))) {
> @@ -99,7 +97,6 @@ static int v1_commit_dqblk(struct dquot *dquot)
> 		ret = dquot->dq_sb->s_op->quota_write(dquot->dq_sb, type,
> 			(char *)&dqblk, sizeof(struct v1_disk_dqblk),
> 			v1_dqoff(from_kqid(&init_user_ns, dquot->dq_id)));
> -	up_write(&dqopt->dqio_sem);
> 	if (ret != sizeof(struct v1_disk_dqblk)) {
> 		quota_error(dquot->dq_sb, "dquota write failed");
> 		if (ret >= 0)
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk()
  2017-08-16 15:41 ` [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk() Jan Kara
@ 2017-08-16 16:56   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 16:56 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->release_dqblk() callback. It
> will allow quota formats to decide whether they need it or not.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/dquot.c    | 4 ++--
> fs/quota/quota_v2.c | 9 ++++++++-
> 2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 562f5978488f..3b3c7f094ff8 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -478,19 +478,19 @@ int dquot_release(struct dquot *dquot)
> 	/* Check whether we are not racing with some other dqget() */
> 	if (atomic_read(&dquot->dq_count) > 1)
> 		goto out_dqlock;
> -	down_write(&dqopt->dqio_sem);
> 	if (dqopt->ops[dquot->dq_id.type]->release_dqblk) {
> 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
> 		/* Write the info */
> 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
> +			down_write(&dqopt->dqio_sem);
> 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
> 						dquot->dq_sb, dquot->dq_id.type);
> +			up_write(&dqopt->dqio_sem);
> 		}
> 		if (ret >= 0)
> 			ret = ret2;
> 	}
> 	clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
> -	up_write(&dqopt->dqio_sem);
> out_dqlock:
> 	mutex_unlock(&dquot->dq_lock);
> 	return ret;
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index 7a05b80f3b8c..8f54b159e423 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -322,7 +322,14 @@ static int v2_write_dquot(struct dquot *dquot)
> 
> static int v2_release_dquot(struct dquot *dquot)
> {
> -	return qtree_release_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
> +	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> +	int ret;
> +
> +	down_write(&dqopt->dqio_sem);
> +	ret = qtree_release_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_id.type)->dqi_priv, dquot);
> +	up_write(&dqopt->dqio_sem);
> +
> +	return ret;
> }
> 
> static int v2_free_file_info(struct super_block *sb, int type)
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id()
  2017-08-16 15:41 ` [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id() Jan Kara
@ 2017-08-16 17:08   ` Andreas Dilger
  2017-08-17 17:09     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 17:08 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2454 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->get_next_id() callback. Mostly
> for consistency with other operations.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

I was initially thinking that this patch also needed to modify the
filesystem-specific code in a similar way, but ext4 and reiserfs
use dquot_get_next_id() internally, ocfs2 appears to have its own
locking, and XFS is doing completely its own thing.

It is a bit confusing that there are both dquot_operation.get_next_id
as well as quota_format_ops.get_next_id methods that are similar but
not totally the same thing.  It might make sense to rename the
quota_format_ops method like .get_next_id_format or similar to be able
to distinguish them?

Patch itself is fine though.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/dquot.c    | 6 +-----
> fs/quota/quota_v2.c | 8 +++++++-
> 2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 3b3c7f094ff8..332f7026edad 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -2067,16 +2067,12 @@ EXPORT_SYMBOL(dquot_commit_info);
> int dquot_get_next_id(struct super_block *sb, struct kqid *qid)
> {
> 	struct quota_info *dqopt = sb_dqopt(sb);
> -	int err;
> 
> 	if (!sb_has_quota_active(sb, qid->type))
> 		return -ESRCH;
> 	if (!dqopt->ops[qid->type]->get_next_id)
> 		return -ENOSYS;
> -	down_read(&dqopt->dqio_sem);
> -	err = dqopt->ops[qid->type]->get_next_id(sb, qid);
> -	up_read(&dqopt->dqio_sem);
> -	return err;
> +	return dqopt->ops[qid->type]->get_next_id(sb, qid);
> }
> EXPORT_SYMBOL(dquot_get_next_id);
> 
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index 8f54b159e423..f82638e43c07 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -340,7 +340,13 @@ static int v2_free_file_info(struct super_block *sb, int type)
> 
> static int v2_get_next_id(struct super_block *sb, struct kqid *qid)
> {
> -	return qtree_get_next_id(sb_dqinfo(sb, qid->type)->dqi_priv, qid);
> +	struct quota_info *dqopt = sb_dqopt(sb);
> +	int ret;
> +
> +	down_read(&dqopt->dqio_sem);
> +	ret = qtree_get_next_id(sb_dqinfo(sb, qid->type)->dqi_priv, qid);
> +	up_read(&dqopt->dqio_sem);
> +	return ret;
> }
> 
> static const struct quota_format_ops v2_format_ops = {
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info()
  2017-08-16 15:41 ` [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info() Jan Kara
@ 2017-08-16 17:33   ` Andreas Dilger
  2017-08-17 17:13     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 17:33 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong, Joel Becker

[-- Attachment #1: Type: text/plain, Size: 5064 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->write_file_info() callback.
> Mostly for consistency with other operations.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ocfs2/quota_global.c |  9 +++++++--
> fs/quota/dquot.c        | 10 +---------
> fs/quota/quota_v1.c     |  2 ++
> fs/quota/quota_v2.c     |  5 ++++-
> 4 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
> index 4134d557a8e5..6f7c5bd3cbc6 100644
> --- a/fs/ocfs2/quota_global.c
> +++ b/fs/ocfs2/quota_global.c
> @@ -443,13 +443,18 @@ static int __ocfs2_global_write_info(struct super_block *sb, int type)
> int ocfs2_global_write_info(struct super_block *sb, int type)

Would be nice if this was named ocfs2_global_write_file_info() so that
it showed up in searches more easily, though not a huge deal.  I guess
the same for ocfs2_local_read_file_info() and ocfs2_local_free_file_info(),
if we wanted to be consistent here, but that should go into a separate
patch.

> {
> 	int err;
> -	struct ocfs2_mem_dqinfo *info = sb_dqinfo(sb, type)->dqi_priv;
> +	struct quota_info *dqopt = sb_dqopt(sb);
> +	struct ocfs2_mem_dqinfo *info = dqopt->info[type].dqi_priv;
> 
> +	down_write(&dqopt->dqio_sem);
> 	err = ocfs2_qinfo_lock(info, 1);
> -	if (err < 0)
> +	if (err < 0) {
> +		up_write(&dqopt->dqio_sem);
> 		return err;
> +	}

		goto out_sem?

> 	err = __ocfs2_global_write_info(sb, type);
> 	ocfs2_qinfo_unlock(info, 1);

out_sem:

Other than this minor suggestion,

Reviewed-by: Andreas Dilger <adilger@dilger.ca>


> +	up_write(&dqopt->dqio_sem);
> 	return err;
> }
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 332f7026edad..1e1ff97098ec 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -415,10 +415,8 @@ int dquot_acquire(struct dquot *dquot)
> 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
> 		/* Write the info if needed */
> 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
> -			down_write(&dqopt->dqio_sem);
> 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
> 					dquot->dq_sb, dquot->dq_id.type);
> -			up_write(&dqopt->dqio_sem);
> 		}
> 		if (ret < 0)
> 			goto out_iolock;
> @@ -482,10 +480,8 @@ int dquot_release(struct dquot *dquot)
> 		ret = dqopt->ops[dquot->dq_id.type]->release_dqblk(dquot);
> 		/* Write the info */
> 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
> -			down_write(&dqopt->dqio_sem);
> 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
> 						dquot->dq_sb, dquot->dq_id.type);
> -			up_write(&dqopt->dqio_sem);
> 		}
> 		if (ret >= 0)
> 			ret = ret2;
> @@ -2054,13 +2050,9 @@ EXPORT_SYMBOL(dquot_transfer);
>  */
> int dquot_commit_info(struct super_block *sb, int type)
> {
> -	int ret;
> 	struct quota_info *dqopt = sb_dqopt(sb);
> 
> -	down_write(&dqopt->dqio_sem);
> -	ret = dqopt->ops[type]->write_file_info(sb, type);
> -	up_write(&dqopt->dqio_sem);
> -	return ret;
> +	return dqopt->ops[type]->write_file_info(sb, type);
> }
> EXPORT_SYMBOL(dquot_commit_info);
> 
> diff --git a/fs/quota/quota_v1.c b/fs/quota/quota_v1.c
> index 12d69cda57cc..fe68bf544b29 100644
> --- a/fs/quota/quota_v1.c
> +++ b/fs/quota/quota_v1.c
> @@ -186,6 +186,7 @@ static int v1_write_file_info(struct super_block *sb, int type)
> 	struct v1_disk_dqblk dqblk;
> 	int ret;
> 
> +	down_write(&dqopt->dqio_sem);
> 	dqopt->info[type].dqi_flags &= ~DQF_INFO_DIRTY;
> 	ret = sb->s_op->quota_read(sb, type, (char *)&dqblk,
> 				sizeof(struct v1_disk_dqblk), v1_dqoff(0));
> @@ -203,6 +204,7 @@ static int v1_write_file_info(struct super_block *sb, int type)
> 	else if (ret > 0)
> 		ret = -EIO;
> out:
> +	up_write(&dqopt->dqio_sem);
> 	return ret;
> }
> 
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index f82638e43c07..5e47012d2f57 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -154,10 +154,12 @@ static int v2_read_file_info(struct super_block *sb, int type)
> static int v2_write_file_info(struct super_block *sb, int type)
> {
> 	struct v2_disk_dqinfo dinfo;
> -	struct mem_dqinfo *info = sb_dqinfo(sb, type);
> +	struct quota_info *dqopt = sb_dqopt(sb);
> +	struct mem_dqinfo *info = &dqopt->info[type];
> 	struct qtree_mem_dqinfo *qinfo = info->dqi_priv;
> 	ssize_t size;
> 
> +	down_write(&dqopt->dqio_sem);
> 	spin_lock(&dq_data_lock);
> 	info->dqi_flags &= ~DQF_INFO_DIRTY;
> 	dinfo.dqi_bgrace = cpu_to_le32(info->dqi_bgrace);
> @@ -170,6 +172,7 @@ static int v2_write_file_info(struct super_block *sb, int type)
> 	dinfo.dqi_free_entry = cpu_to_le32(qinfo->dqi_free_entry);
> 	size = sb->s_op->quota_write(sb, type, (char *)&dinfo,
> 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
> +	up_write(&dqopt->dqio_sem);
> 	if (size != sizeof(struct v2_disk_dqinfo)) {
> 		quota_error(sb, "Can't write info structure");
> 		return -1;
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info()
  2017-08-16 15:41 ` [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info() Jan Kara
@ 2017-08-16 17:57   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 17:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2946 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Push down acquisition of dqio_sem into ->read_file_info() callback. This
> is for consistency with other operations and it also allows us to get
> rid of an ugliness in OCFS2.

Always nice to get rid of workarounds like this (which I admit I've had to
do similar things in the past as well).

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

One minor cleanup suggestion below, but probably for a later patch.

> Signed-off-by: Jan Kara <jack@suse.cz>
> 
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index 5e47012d2f57..373d7cfea5b0 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -90,29 +90,38 @@ static int v2_read_file_info(struct super_block *sb, int type)
> {
> 	struct v2_disk_dqinfo dinfo;
> 	struct v2_disk_dqheader dqhead;
> -	struct mem_dqinfo *info = sb_dqinfo(sb, type);
> +	struct quota_info *dqopt = sb_dqopt(sb);
> +	struct mem_dqinfo *info = &dqopt->info[type];
> 	struct qtree_mem_dqinfo *qinfo;
> 	ssize_t size;
> 	unsigned int version;
> +	int ret;
> 
> -	if (!v2_read_header(sb, type, &dqhead))
> -		return -1;
> +	down_read(&dqopt->dqio_sem);
> +	if (!v2_read_header(sb, type, &dqhead)) {
> +		ret = -1;

This is a bit strange to return -1 here, but -ENOMEM below.  The various
->quota_read() methods return -EIO or other proper error codes, so it
makes sense to return those back from v2_read_header() and onward here.

> +		goto out;
> +	}
> 	version = le32_to_cpu(dqhead.dqh_version);
> 	if ((info->dqi_fmt_id == QFMT_VFS_V0 && version != 0) ||
> -	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1))
> -		return -1;
> +	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1)) {
> +		ret = -1;

This could be -EINVAL or -EIO or similar.

> +		goto out;
> +	}
> 
> 	size = sb->s_op->quota_read(sb, type, (char *)&dinfo,
> 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
> 	if (size != sizeof(struct v2_disk_dqinfo)) {
> 		quota_error(sb, "Can't read info structure");
> -		return -1;
> +		ret = -1;
> +		goto out;

->quota_read() is returning a proper error code, except in the case of
short reads, which could be converted to -EIO or similar.

> 	}
> 	info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
> 	if (!info->dqi_priv) {
> 		printk(KERN_WARNING
> 		       "Not enough memory for quota information structure.\n");
> -		return -ENOMEM;
> +		ret = -ENOMEM;
> +		goto out;
> 	}
> 	qinfo = info->dqi_priv;
> 	if (version == 0) {
> @@ -147,7 +156,10 @@ static int v2_read_file_info(struct super_block *sb, int type)
> 		qinfo->dqi_entry_size = sizeof(struct v2r1_disk_dqblk);
> 		qinfo->dqi_ops = &v2r1_qtree_ops;
> 	}
> -	return 0;
> +	ret = 0;
> +out:
> +	up_read(&dqopt->dqio_sem);
> +	return ret;
> }
> 
> /* Write information header to quota file */
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 15/27] quota: Fix error codes in v2_read_file_info()
  2017-08-16 15:41 ` [PATCH 15/27] quota: Fix error codes in v2_read_file_info() Jan Kara
@ 2017-08-16 18:00   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 18:00 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1735 bytes --]


> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> v2_read_file_info() returned -1 instead of proper error codes on error.
> Luckily this is not easily visible from userspace as we have called
> ->check_quota_file shortly before and thus already verified the quota
> file is sane. Still set the error codes to proper values.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Doh, should read patches ahead before replying.  Still, could return the
error codes from ->quota_read() via v2_read_header() as previously suggested.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/quota_v2.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c
> index 373d7cfea5b0..21a8bdf7cfec 100644
> --- a/fs/quota/quota_v2.c
> +++ b/fs/quota/quota_v2.c
> @@ -99,13 +99,13 @@ static int v2_read_file_info(struct super_block *sb, int type)
> 
> 	down_read(&dqopt->dqio_sem);
> 	if (!v2_read_header(sb, type, &dqhead)) {
> -		ret = -1;
> +		ret = -EIO;
> 		goto out;
> 	}
> 	version = le32_to_cpu(dqhead.dqh_version);
> 	if ((info->dqi_fmt_id == QFMT_VFS_V0 && version != 0) ||
> 	    (info->dqi_fmt_id == QFMT_VFS_V1 && version != 1)) {
> -		ret = -1;
> +		ret = -EINVAL;
> 		goto out;
> 	}
> 
> @@ -113,7 +113,7 @@ static int v2_read_file_info(struct super_block *sb, int type)
> 	       sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
> 	if (size != sizeof(struct v2_disk_dqinfo)) {
> 		quota_error(sb, "Can't read info structure");
> -		ret = -1;
> +		ret = -EIO;
> 		goto out;
> 	}
> 	info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 16/27] quota: Fix possible corruption of dqi_flags
  2017-08-16 15:41 ` [PATCH 16/27] quota: Fix possible corruption of dqi_flags Jan Kara
@ 2017-08-16 18:14   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 18:14 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1962 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> dqi_flags modifications are protected by dq_data_lock.

It would be useful to annotate dqi_flags with a comment that dq_data_lock
is protecting it (along with any other fields similarly protected).

> However the
> modifications in vfs_load_quota_inode() and in mark_info_dirty() were
> not which could lead to corruption of dqi_flags. Since modifications to
> dqi_flags are rare, this is hard to observe in practice but in theory it
> could happen. Fix the problem by always using dq_data_lock for
> protection.
> 

What about v1_write_file_info()?  It is also modifying dqi_flags, but
it doesn't get dq_data_lock either.

> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/quota/dquot.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 5e77c4da69a6..e1a155e8db15 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -389,7 +389,9 @@ static inline int clear_dquot_dirty(struct dquot *dquot)
> 
> void mark_info_dirty(struct super_block *sb, int type)
> {
> -	set_bit(DQF_INFO_DIRTY_B, &sb_dqopt(sb)->info[type].dqi_flags);
> +	spin_lock(&dq_data_lock);
> +	sb_dqopt(sb)->info[type].dqi_flags |= DQF_INFO_DIRTY;
> +	spin_unlock(&dq_data_lock);
> }
> EXPORT_SYMBOL(mark_info_dirty);
> 
> @@ -2316,8 +2318,11 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
> 	error = dqopt->ops[type]->read_file_info(sb, type);
> 	if (error < 0)
> 		goto out_file_init;
> -	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
> +	if (dqopt->flags & DQUOT_QUOTA_SYS_FILE) {
> +		spin_lock(&dq_data_lock);
> 		dqopt->info[type].dqi_flags |= DQF_SYS_FILE;
> +		spin_unlock(&dq_data_lock);
> +	}
> 	spin_lock(&dq_state_lock);
> 	dqopt->flags |= dquot_state_flag(flags, type);
> 	spin_unlock(&dq_state_lock);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 19/27] quota: Move locking into clear_dquot_dirty()
  2017-08-16 15:41 ` [PATCH 19/27] quota: Move locking into clear_dquot_dirty() Jan Kara
@ 2017-08-16 18:29   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 18:29 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Move locking of dq_list_lock into clear_dquot_dirty(). It makes the
> function more self-contained and will simplify our life later.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/quota/dquot.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 67f154231072..4a407f30922d 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -375,12 +375,15 @@ static inline void dqput_all(struct dquot **dquot)
> 		dqput(dquot[cnt]);
> }
> 
> -/* This function needs dq_list_lock */
> static inline int clear_dquot_dirty(struct dquot *dquot)
> {
> -	if (!test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags))
> +	spin_lock(&dq_list_lock);
> +	if (!test_and_clear_bit(DQ_MOD_B, &dquot->dq_flags)) {
> +		spin_unlock(&dq_list_lock);
> 		return 0;
> +	}
> 	list_del_init(&dquot->dq_dirty);
> +	spin_unlock(&dq_list_lock);
> 	return 1;
> }
> 
> @@ -445,12 +448,8 @@ int dquot_commit(struct dquot *dquot)
> 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
> 
> 	mutex_lock(&dquot->dq_lock);
> -	spin_lock(&dq_list_lock);
> -	if (!clear_dquot_dirty(dquot)) {
> -		spin_unlock(&dq_list_lock);
> +	if (!clear_dquot_dirty(dquot))
> 		goto out_lock;
> -	}
> -	spin_unlock(&dq_list_lock);
> 	/* Inactive dquot can be only if there was error during read/init
> 	 * => we have better not writing it */
> 	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
> @@ -766,9 +765,7 @@ void dqput(struct dquot *dquot)
> 			 * We clear dirty bit anyway, so that we avoid
> 			 * infinite loop here
> 			 */
> -			spin_lock(&dq_list_lock);
> 			clear_dquot_dirty(dquot);
> -			spin_unlock(&dq_list_lock);
> 		}
> 		goto we_slept;
> 	}
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite
  2017-08-16 15:41 ` [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite Jan Kara
@ 2017-08-16 19:52   ` Andreas Dilger
  0 siblings, 0 replies; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 19:52 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2989 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> dquot_claim_reserved_space() and dquot_reclaim_reserved_space() have
> only a single callsite. Inline them there.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/quota/dquot.c | 43 ++++++++++++++++++-------------------------
> 1 file changed, 18 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 038f70c1ebff..9f4cf78096e4 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -1082,27 +1082,6 @@ static inline void dquot_resv_space(struct dquot *dquot, qsize_t number)
> 	dquot->dq_dqb.dqb_rsvspace += number;
> }
> 
> -/*
> - * Claim reserved quota space
> - */
> -static void dquot_claim_reserved_space(struct dquot *dquot, qsize_t number)
> -{
> -	if (dquot->dq_dqb.dqb_rsvspace < number) {
> -		WARN_ON_ONCE(1);
> -		number = dquot->dq_dqb.dqb_rsvspace;
> -	}
> -	dquot->dq_dqb.dqb_curspace += number;
> -	dquot->dq_dqb.dqb_rsvspace -= number;
> -}
> -
> -static void dquot_reclaim_reserved_space(struct dquot *dquot, qsize_t number)
> -{
> -	if (WARN_ON_ONCE(dquot->dq_dqb.dqb_curspace < number))
> -		number = dquot->dq_dqb.dqb_curspace;
> -	dquot->dq_dqb.dqb_rsvspace += number;
> -	dquot->dq_dqb.dqb_curspace -= number;
> -}
> -
> static inline
> void dquot_free_reserved_space(struct dquot *dquot, qsize_t number)
> {
> @@ -1726,8 +1705,16 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
> 	spin_lock(&dq_data_lock);
> 	/* Claim reserved quotas to allocated quotas */
> 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> -		if (dquots[cnt])
> -			dquot_claim_reserved_space(dquots[cnt], number);
> +		if (dquots[cnt]) {
> +			struct dquot *dquot = dquots[cnt];
> +
> +			if (dquot->dq_dqb.dqb_rsvspace < number) {
> +				WARN_ON_ONCE(1);

Here it checks if (rsvspace < number) and calls WARN_ON_ONCE(1), but below it
instead calls if (WARN_ON_ONCE(curspace < number)).  It probably makes sense for
these to be consistent (I prefer the second form).

Cheers, Andreas

> +				number = dquot->dq_dqb.dqb_rsvspace;
> +			}
> +			dquot->dq_dqb.dqb_curspace += number;
> +			dquot->dq_dqb.dqb_rsvspace -= number;
> +		}
> 	}
> 	/* Update inode bytes */
> 	spin_lock(&inode->i_lock);
> @@ -1762,8 +1749,14 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
> 	spin_lock(&dq_data_lock);
> 	/* Claim reserved quotas to allocated quotas */
> 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> -		if (dquots[cnt])
> -			dquot_reclaim_reserved_space(dquots[cnt], number);
> +		if (dquots[cnt]) {
> +			struct dquot *dquot = dquots[cnt];
> +
> +			if (WARN_ON_ONCE(dquot->dq_dqb.dqb_curspace < number))
> +				number = dquot->dq_dqb.dqb_curspace;
> +			dquot->dq_dqb.dqb_rsvspace += number;
> +			dquot->dq_dqb.dqb_curspace -= number;
> +		}
> 	}
> 	/* Update inode bytes */
> 	spin_lock(&inode->i_lock);
> --
> 2.12.3
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 27/27] quota: Reduce contention on dq_data_lock
  2017-08-16 15:41 ` [PATCH 27/27] quota: Reduce contention on dq_data_lock Jan Kara
@ 2017-08-16 20:17   ` Andreas Dilger
  2017-08-17 20:08     ` Jan Kara
  0 siblings, 1 reply; 55+ messages in thread
From: Andreas Dilger @ 2017-08-16 20:17 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Wang Shilong

[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]

On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> 
> dq_data_lock is currently used to protect all modifications of quota
> accounting information, consistency of quota accounting on the inode,
> and dquot pointers from inode. As a result contention on the lock can be
> pretty heavy.
> 
> Reduce the contention on the lock by protecting quota accounting
> information by a new dquot->dq_dqb_lock and consistency of quota
> accounting with inode usage by inode->i_lock.
> 
> This change reduces time to create 500000 files on ext4 on ramdisk by 50
> different processes in separate directories by 6% when user quota is
> turned on. When those 50 processes belong to 50 different users, the
> improvement is about 9%.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> 
> @@ -1604,33 +1628,41 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
> 
> 	dquots = i_dquot(inode);
> 	index = srcu_read_lock(&dquot_srcu);
> -	spin_lock(&dq_data_lock);
> +	spin_lock(&inode->i_lock);
> 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> 		if (!dquots[cnt])
> 			continue;
> -		ret = check_bdq(dquots[cnt], number,
> -				!(flags & DQUOT_SPACE_WARN), &warn[cnt]);
> -		if (ret && !(flags & DQUOT_SPACE_NOFAIL)) {
> -			spin_unlock(&dq_data_lock);
> +		if (flags & DQUOT_SPACE_RESERVE) {
> +			ret = dquot_add_space(dquots[cnt], 0, number, flags,
> +					      &warn[cnt]);
> +		} else {
> +			ret = dquot_add_space(dquots[cnt], number, 0, flags,
> +					      &warn[cnt]);
> +		}

Minor nit - no need for braces around these single-line functions.

> diff --git a/include/linux/quota.h b/include/linux/quota.h
> index eccc1cb6274e..4b5180efff21 100644
> --- a/include/linux/quota.h
> +++ b/include/linux/quota.h
> @@ -298,6 +298,7 @@ struct dquot {
> 	struct list_head dq_free;	/* Free list element */
> 	struct list_head dq_dirty;	/* List of dirty dquots */
> 	struct mutex dq_lock;		/* dquot IO lock */
> +	spinlock_t dq_dqb_lock;		/* Lock protecting dq_dqb changes */

Might be good to have a comment at mem_dqblk that dq_lock is used for locking?

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/27] quota: Convert dqio_mutex to rwsem
  2017-08-16 16:23   ` Andreas Dilger
@ 2017-08-17 16:47     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 16:47 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong

On Wed 16-08-17 10:23:22, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Convert dqio_mutex to rwsem and call it dqio_sem. No functional changes
> > yet.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > fs/ext4/super.c         |  4 ++--
> > fs/ocfs2/quota_global.c | 20 ++++++++++----------
> > fs/ocfs2/quota_local.c  | 10 +++++-----
> > fs/quota/dquot.c        | 28 ++++++++++++++--------------
> > fs/quota/quota_tree.c   |  2 +-
> > fs/super.c              |  2 +-
> > include/linux/quota.h   |  2 +-
> > 7 files changed, 34 insertions(+), 34 deletions(-)
> > 
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index d61a70e2193a..8e0c27387ab7 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -5268,8 +5268,8 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
> >   * Process 1                         Process 2
> >   * ext4_create()                     quota_sync()
> >   *   jbd2_journal_start()                  write_dquot()
> > - *   dquot_initialize()                         down(dqio_mutex)
> > - *     down(dqio_mutex)                    jbd2_journal_start()
> > + *   dquot_initialize()                         down(dqio_sem)
> > + *     down(dqio_sem)                           jbd2_journal_start()
> 
> Not a big deal, since this is a comment, but it should probably be changed
> to down_write(dqio_sem) on both lines, as this deadlock wouldn't happen if
> it was down_read(dqio_sem), and it is more consistent with the new usage.

Yes, fixed.

> That said, this comment doesn't appear to be relevant anymore.  At least I
> couldn't find where in those callpaths dqio_mutex/dqio_sem is used anymore.

The comment is still correct (at least at this point in the series) however
it misses a few entries on stack and some names have probably changed. I'll
update it. Thanks for review!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire()
  2017-08-16 16:35   ` Andreas Dilger
@ 2017-08-17 16:58     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 16:58 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong

On Wed 16-08-17 10:35:48, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > We need dqio_sem held just for reading when calling ->read_dqblk() in
> > dquot_acquire(). Also dqio_sem is not needed when manipulating dquot
> > flags (those are protected by dq_lock)
> 
> Nothing against the patch itself, but this comment is a bit confusing.
> 
> It would be good to list this dq_lock dependency at the dq_flags declaration.
> Looking through the code that accesses dq_flags, I see dquot_mark_dquot_dirty(),
> clear_dquot_dirty(), dquot_commit(), dqput() depend on dq_list_lock, but
> I don't see many of the users besides dquot_acquire() and dquot_release()
> getting dq_lock, only using the atomic bit operations on dq_flags.

Updated changelog to:

We need dqio_sem held just for reading when calling ->read_dqblk() in
dquot_acquire(). Also dqio_sem is not needed when setting DQ_READ_B and 
DQ_ACTIVE_B as concurrent reads and dquot activation are serialized by
dq_lock. So acquire and release dqio_sem closer to the place where it is
needed. This reduces lock hold time and will make locking changes
easier.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id()
  2017-08-16 17:08   ` Andreas Dilger
@ 2017-08-17 17:09     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 17:09 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong

On Wed 16-08-17 11:08:12, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Push down acquisition of dqio_sem into ->get_next_id() callback. Mostly
> > for consistency with other operations.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> I was initially thinking that this patch also needed to modify the
> filesystem-specific code in a similar way, but ext4 and reiserfs
> use dquot_get_next_id() internally, ocfs2 appears to have its own
> locking, and XFS is doing completely its own thing.
> 
> It is a bit confusing that there are both dquot_operation.get_next_id
> as well as quota_format_ops.get_next_id methods that are similar but
> not totally the same thing.  It might make sense to rename the
> quota_format_ops method like .get_next_id_format or similar to be able
> to distinguish them?

Yeah, they are two different levels of abstraction of the same operation. I
agree that the fact that names are matching is a bit confusing. I'll think
how to differentiate these names...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info()
  2017-08-16 17:33   ` Andreas Dilger
@ 2017-08-17 17:13     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 17:13 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong, Joel Becker

On Wed 16-08-17 11:33:37, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Push down acquisition of dqio_sem into ->write_file_info() callback.
> > Mostly for consistency with other operations.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > fs/ocfs2/quota_global.c |  9 +++++++--
> > fs/quota/dquot.c        | 10 +---------
> > fs/quota/quota_v1.c     |  2 ++
> > fs/quota/quota_v2.c     |  5 ++++-
> > 4 files changed, 14 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
> > index 4134d557a8e5..6f7c5bd3cbc6 100644
> > --- a/fs/ocfs2/quota_global.c
> > +++ b/fs/ocfs2/quota_global.c
> > @@ -443,13 +443,18 @@ static int __ocfs2_global_write_info(struct super_block *sb, int type)
> > int ocfs2_global_write_info(struct super_block *sb, int type)
> 
> Would be nice if this was named ocfs2_global_write_file_info() so that
> it showed up in searches more easily, though not a huge deal.  I guess
> the same for ocfs2_local_read_file_info() and ocfs2_local_free_file_info(),
> if we wanted to be consistent here, but that should go into a separate
> patch.

Well, rather than adding 'file' into the name I would add 'quota'
somewhere. But yes, the name seems too generic.

> > {
> > 	int err;
> > -	struct ocfs2_mem_dqinfo *info = sb_dqinfo(sb, type)->dqi_priv;
> > +	struct quota_info *dqopt = sb_dqopt(sb);
> > +	struct ocfs2_mem_dqinfo *info = dqopt->info[type].dqi_priv;
> > 
> > +	down_write(&dqopt->dqio_sem);
> > 	err = ocfs2_qinfo_lock(info, 1);
> > -	if (err < 0)
> > +	if (err < 0) {
> > +		up_write(&dqopt->dqio_sem);
> > 		return err;
> > +	}
> 
> 		goto out_sem?
> 
> > 	err = __ocfs2_global_write_info(sb, type);
> > 	ocfs2_qinfo_unlock(info, 1);
> 
> out_sem:

Thanks, done.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 26/27] fs: Provide __inode_get_bytes()
  2017-08-16 16:12   ` Andreas Dilger
@ 2017-08-17 20:04     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 20:04 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong

On Wed 16-08-17 10:12:41, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Provide helper __inode_get_bytes() which assumes i_lock is already
> > acquired. Quota code will need this to be able to use i_lock to protect
> > consistency of quota accounting information and inode usage.
> 
> Minor nit - it's better to give functions like this a useful name, like
> inode_get_bytes_locked(), since "__" doesn't really convey any semantic info,
> and also messes up function sorting order (e.g. tag completion on inode_get_bytes
> will not show __inode_get_bytes() but would show inode_get_bytes_locked()).

Well, I did this to be consistent with __inode_add_bytes(),
__inode_sub_bytes(), etc.

								Honza

> 
> Cheers, Andreas
> 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > fs/stat.c          | 2 +-
> > include/linux/fs.h | 4 ++++
> > 2 files changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/stat.c b/fs/stat.c
> > index c35610845ab1..8a6aa8caf891 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -710,7 +710,7 @@ loff_t inode_get_bytes(struct inode *inode)
> > 	loff_t ret;
> > 
> > 	spin_lock(&inode->i_lock);
> > -	ret = (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
> > +	ret = __inode_get_bytes(inode);
> > 	spin_unlock(&inode->i_lock);
> > 	return ret;
> > }
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 6e1fd5d21248..d6e9ab7f184f 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2998,6 +2998,10 @@ void __inode_add_bytes(struct inode *inode, loff_t bytes);
> > void inode_add_bytes(struct inode *inode, loff_t bytes);
> > void __inode_sub_bytes(struct inode *inode, loff_t bytes);
> > void inode_sub_bytes(struct inode *inode, loff_t bytes);
> > +static inline loff_t __inode_get_bytes(struct inode *inode)
> > +{
> > +	return (((loff_t)inode->i_blocks) << 9) + inode->i_bytes;
> > +}
> > loff_t inode_get_bytes(struct inode *inode);
> > void inode_set_bytes(struct inode *inode, loff_t bytes);
> > const char *simple_get_link(struct dentry *, struct inode *,
> > --
> > 2.12.3
> > 
> 
> 
> Cheers, Andreas
> 
> 
> 
> 
> 


-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 27/27] quota: Reduce contention on dq_data_lock
  2017-08-16 20:17   ` Andreas Dilger
@ 2017-08-17 20:08     ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-17 20:08 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, linux-fsdevel, Wang Shilong

On Wed 16-08-17 14:17:23, Andreas Dilger wrote:
> On Aug 16, 2017, at 9:41 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > dq_data_lock is currently used to protect all modifications of quota
> > accounting information, consistency of quota accounting on the inode,
> > and dquot pointers from inode. As a result contention on the lock can be
> > pretty heavy.
> > 
> > Reduce the contention on the lock by protecting quota accounting
> > information by a new dquot->dq_dqb_lock and consistency of quota
> > accounting with inode usage by inode->i_lock.
> > 
> > This change reduces time to create 500000 files on ext4 on ramdisk by 50
> > different processes in separate directories by 6% when user quota is
> > turned on. When those 50 processes belong to 50 different users, the
> > improvement is about 9%.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > 
> > @@ -1604,33 +1628,41 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
> > 
> > 	dquots = i_dquot(inode);
> > 	index = srcu_read_lock(&dquot_srcu);
> > -	spin_lock(&dq_data_lock);
> > +	spin_lock(&inode->i_lock);
> > 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> > 		if (!dquots[cnt])
> > 			continue;
> > -		ret = check_bdq(dquots[cnt], number,
> > -				!(flags & DQUOT_SPACE_WARN), &warn[cnt]);
> > -		if (ret && !(flags & DQUOT_SPACE_NOFAIL)) {
> > -			spin_unlock(&dq_data_lock);
> > +		if (flags & DQUOT_SPACE_RESERVE) {
> > +			ret = dquot_add_space(dquots[cnt], 0, number, flags,
> > +					      &warn[cnt]);
> > +		} else {
> > +			ret = dquot_add_space(dquots[cnt], number, 0, flags,
> > +					      &warn[cnt]);
> > +		}
> 
> Minor nit - no need for braces around these single-line functions.

Yeah, but I prefer braces in case lines are wrapped...

> > diff --git a/include/linux/quota.h b/include/linux/quota.h
> > index eccc1cb6274e..4b5180efff21 100644
> > --- a/include/linux/quota.h
> > +++ b/include/linux/quota.h
> > @@ -298,6 +298,7 @@ struct dquot {
> > 	struct list_head dq_free;	/* Free list element */
> > 	struct list_head dq_dirty;	/* List of dirty dquots */
> > 	struct mutex dq_lock;		/* dquot IO lock */
> > +	spinlock_t dq_dqb_lock;		/* Lock protecting dq_dqb changes */
> 
> Might be good to have a comment at mem_dqblk that dq_lock is used for locking?

OK, will do.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire()
  2017-08-21 11:54 [PATCH 0/27 v2] Quota scalability patches Jan Kara
@ 2017-08-21 11:54 ` Jan Kara
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2017-08-21 11:54 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Wang Shilong, Andreas Dilger, Jan Kara

We need dqio_sem held just for reading when calling ->read_dqblk() in
dquot_acquire(). Also dqio_sem is not needed when setting DQ_READ_B and
DQ_ACTIVE_B as concurrent reads and dquot activations are serialized by
dq_lock. So acquire and release dqio_sem closer to the place where it is
needed. This reduces lock hold time and will make locking changes
easier.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 29d447598642..21358f31923d 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -406,9 +406,11 @@ int dquot_acquire(struct dquot *dquot)
 	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
 
 	mutex_lock(&dquot->dq_lock);
-	down_write(&dqopt->dqio_sem);
-	if (!test_bit(DQ_READ_B, &dquot->dq_flags))
+	if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
+		down_read(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
+		up_read(&dqopt->dqio_sem);
+	}
 	if (ret < 0)
 		goto out_iolock;
 	/* Make sure flags update is visible after dquot has been filled */
@@ -416,12 +418,14 @@ int dquot_acquire(struct dquot *dquot)
 	set_bit(DQ_READ_B, &dquot->dq_flags);
 	/* Instantiate dquot if needed */
 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
+		down_write(&dqopt->dqio_sem);
 		ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
 		/* Write the info if needed */
 		if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
 			ret2 = dqopt->ops[dquot->dq_id.type]->write_file_info(
 					dquot->dq_sb, dquot->dq_id.type);
 		}
+		up_write(&dqopt->dqio_sem);
 		if (ret < 0)
 			goto out_iolock;
 		if (ret2 < 0) {
@@ -436,7 +440,6 @@ int dquot_acquire(struct dquot *dquot)
 	smp_mb__before_atomic();
 	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
 out_iolock:
-	up_write(&dqopt->dqio_sem);
 	mutex_unlock(&dquot->dq_lock);
 	return ret;
 }
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2017-08-21 11:55 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-16 15:41 [PATCH 0/27 v1] Quota scalability patches Jan Kara
2017-08-16 15:41 ` [PATCH 01/27] quota: Convert dqio_mutex to rwsem Jan Kara
2017-08-16 16:23   ` Andreas Dilger
2017-08-17 16:47     ` Jan Kara
2017-08-16 15:41 ` [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire() Jan Kara
2017-08-16 16:35   ` Andreas Dilger
2017-08-17 16:58     ` Jan Kara
2017-08-16 15:41 ` [PATCH 03/27] quota: Acquire dqio_sem for reading in dquot_get_next_id() Jan Kara
2017-08-16 16:37   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 04/27] quota: Acquire dqio_sem for reading in vfs_load_quota_inode() Jan Kara
2017-08-16 16:38   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 05/27] quota: Protect dquot writeout with dq_lock Jan Kara
2017-08-16 16:44   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 06/27] quota: Push dqio_sem down to ->read_dqblk() Jan Kara
2017-08-16 16:46   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 07/27] quota: Remove locking for reading from the old quota format Jan Kara
2017-08-16 16:47   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 08/27] quota: Push dqio_sem down to ->write_dqblk() Jan Kara
2017-08-16 16:48   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 09/27] quota: Do not acquire dqio_sem for dquot overwrites in v2 format Jan Kara
2017-08-16 16:49   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 10/27] quota: Remove locking for writing to the old quota format Jan Kara
2017-08-16 16:50   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 11/27] quota: Push dqio_sem down to ->release_dqblk() Jan Kara
2017-08-16 16:56   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 12/27] quota: Push dqio_sem down to ->get_next_id() Jan Kara
2017-08-16 17:08   ` Andreas Dilger
2017-08-17 17:09     ` Jan Kara
2017-08-16 15:41 ` [PATCH 13/27] quota: Push dqio_sem down to ->write_file_info() Jan Kara
2017-08-16 17:33   ` Andreas Dilger
2017-08-17 17:13     ` Jan Kara
2017-08-16 15:41 ` [PATCH 14/27] quota: Push dqio_sem down to ->read_file_info() Jan Kara
2017-08-16 17:57   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 15/27] quota: Fix error codes in v2_read_file_info() Jan Kara
2017-08-16 18:00   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 16/27] quota: Fix possible corruption of dqi_flags Jan Kara
2017-08-16 18:14   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 17/27] quota: Drop return value of mark_all_dquot_dirty() Jan Kara
2017-08-16 15:41 ` [PATCH 18/27] quota: Do not dirty bad dquots Jan Kara
2017-08-16 15:41 ` [PATCH 19/27] quota: Move locking into clear_dquot_dirty() Jan Kara
2017-08-16 18:29   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 20/27] quota: Remove dq_wait_unused from dquot Jan Kara
2017-08-16 15:41 ` [PATCH 21/27] quota: Allow disabling tracking of dirty dquots in a list Jan Kara
2017-08-16 15:41 ` [PATCH 22/27] ext4: Disable dirty list tracking of dquots when journalling quotas Jan Kara
2017-08-16 15:41 ` [PATCH 23/27] quota: Inline functions into their callsites Jan Kara
2017-08-16 15:41 ` [PATCH 24/27] quota: Inline inode_{incr,decr}_space() into callsites Jan Kara
2017-08-16 15:41 ` [PATCH 25/27] quota: Inline dquot_[re]claim_reserved_space() into callsite Jan Kara
2017-08-16 19:52   ` Andreas Dilger
2017-08-16 15:41 ` [PATCH 26/27] fs: Provide __inode_get_bytes() Jan Kara
2017-08-16 16:12   ` Andreas Dilger
2017-08-17 20:04     ` Jan Kara
2017-08-16 15:41 ` [PATCH 27/27] quota: Reduce contention on dq_data_lock Jan Kara
2017-08-16 20:17   ` Andreas Dilger
2017-08-17 20:08     ` Jan Kara
2017-08-21 11:54 [PATCH 0/27 v2] Quota scalability patches Jan Kara
2017-08-21 11:54 ` [PATCH 02/27] quota: Do more fine-grained locking in dquot_acquire() Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).