All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock
@ 2022-02-16  7:00 Ritesh Harjani
  2022-02-16  7:00 ` [PATCHv2 REBASED 1/2] " Ritesh Harjani
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ritesh Harjani @ 2022-02-16  7:00 UTC (permalink / raw)
  To: linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Andreas Dilger, linux-fsdevel,
	Ritesh Harjani

Hello,

This is mainly just rebasing the patch series on top of recent use-after-free
fix submitted [4], due to conflict in function jbd2_journal_wait_updates().
No other changes apart from that.


Testing
========
I have again tested xfstests with -g log,metadata,auto group with 4k bs
config (COFIG_KASAN disabled). I haven't found any regression due to these
patches in my testing.


Original cover letter
======================
This small patch series kills the age-old t_handle_lock transaction spinlock,
which on careful inspection, came out to be not very useful anymore.
At some of the places it isn't required at all now and for the rest
(e.g. update_t_max_wait()), we could make use of atomic cmpxchg to make the
code path lockless.

This was tested with fstests with -g quick and -g log on my qemu setup.
I had also done some extensive fsmark testing to see that we don't see any
bottleneck resulting from removal of CONFIG_JBD2_DEBUG to update t_max_wait
in patch-2. None of my test showed any bottleneck.

Note that there had been several patches in the past over time which had led to
t_handle_lock becoming obselete now e.g. [1-2]
In this work, couple of the code paths to remove this spinlock were observed
while doing code review and to get completely rid of it was something which was
suggested by Jan [3].
Thanks to Jan for thorough review and suggestions :)


References
===========
[v1]: https://lore.kernel.org/all/cover.1642953021.git.riteshh@linux.ibm.com/
[1]: https://lore.kernel.org/linux-ext4/1280939957-3277-4-git-send-email-tytso@mit.edu/
[2]: https://lore.kernel.org/linux-ext4/20120103153245.GE31457@quack.suse.cz/
[3]: https://lore.kernel.org/linux-ext4/20220113112749.d5tfszcksvxvshnn@quack3.lan/
[4]: https://lore.kernel.org/all/948c2fed518ae739db6a8f7f83f1d58b504f87d0.1644497105.git.ritesh.list@gmail.com/


Ritesh Harjani (2):
  jbd2: Kill t_handle_lock transaction spinlock
  jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait

 fs/jbd2/transaction.c | 35 ++++++++++++-----------------------
 include/linux/jbd2.h  |  3 ---
 2 files changed, 12 insertions(+), 26 deletions(-)

--
2.31.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCHv2 REBASED 1/2] jbd2: Kill t_handle_lock transaction spinlock
  2022-02-16  7:00 [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Ritesh Harjani
@ 2022-02-16  7:00 ` Ritesh Harjani
  2022-02-16  7:00 ` [PATCHv2 REBASED 2/2] jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait Ritesh Harjani
  2022-03-03  0:41 ` [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Theodore Ts'o
  2 siblings, 0 replies; 4+ messages in thread
From: Ritesh Harjani @ 2022-02-16  7:00 UTC (permalink / raw)
  To: linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Andreas Dilger, linux-fsdevel,
	Ritesh Harjani

This patch kills t_handle_lock transaction spinlock completely from
jbd2.
To explain the reasoning, currently there were three sites at which this
spinlock was used.
1. jbd2_journal_wait_updates()
   a. Based on careful code review it can be seen that, we don't need this
      lock here. This is since we wait for any currently ongoing updates
      based on a atomic variable t_updates. And we anyway don't take any
      t_handle_lock while in stop_this_handle().
      i.e.

	write_lock(&journal->j_state_lock()
	jbd2_journal_wait_updates() 			stop_this_handle()
		while (atomic_read(txn->t_updates) { 		|
		DEFINE_WAIT(wait); 				|
		prepare_to_wait(); 				|
		if (atomic_read(txn->t_updates) 		if (atomic_dec_and_test(txn->t_updates))
			write_unlock(&journal->j_state_lock);
			schedule();					wake_up()
			write_lock(&journal->j_state_lock);
		finish_wait();
	   }
	txn->t_state = T_COMMIT
	write_unlock(&journal->j_state_lock);

   b.  Also note that between atomic_inc(&txn->t_updates) in
       start_this_handle() and jbd2_journal_wait_updates(), the
       synchronization happens via read_lock(journal->j_state_lock) in
       start_this_handle();

2. jbd2_journal_extend()
   a. jbd2_journal_extend() is called with the handle of each process from
      task_struct. So no lock required in updating member fields of handle_t

   b. For member fields of h_transaction, all updates happens only via
      atomic APIs (which is also within read_lock()).
      So, no need of this transaction spinlock.

3. update_t_max_wait()
   Based on Jan suggestion, this can be carefully removed using atomic
   cmpxchg API.
   Note that there can be several processes which are waiting for a new
   transaction to be allocated and started. For doing this only one
   process will succeed in taking write_lock() and allocating a new txn.
   After that all of the process will be updating the t_max_wait (max
   transaction wait time). This can be done via below method w/o taking
   any locks using atomic cmpxchg.
   For more details refer [1]

	   new = get_new_val();
	   old = READ_ONCE(ptr->max_val);
	   while (old < new)
		old = cmpxchg(&ptr->max_val, old, new);

[1]: https://lwn.net/Articles/849237/

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/transaction.c | 28 +++++++++-------------------
 include/linux/jbd2.h  |  3 ---
 2 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 259e00046a8b..83801a8be078 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -107,7 +107,6 @@ static void jbd2_get_transaction(journal_t *journal,
 	transaction->t_start_time = ktime_get();
 	transaction->t_tid = journal->j_transaction_sequence++;
 	transaction->t_expires = jiffies + journal->j_commit_interval;
-	spin_lock_init(&transaction->t_handle_lock);
 	atomic_set(&transaction->t_updates, 0);
 	atomic_set(&transaction->t_outstanding_credits,
 		   jbd2_descriptor_blocks_per_trans(journal) +
@@ -139,24 +138,21 @@ static void jbd2_get_transaction(journal_t *journal,
 /*
  * Update transaction's maximum wait time, if debugging is enabled.
  *
- * In order for t_max_wait to be reliable, it must be protected by a
- * lock.  But doing so will mean that start_this_handle() can not be
- * run in parallel on SMP systems, which limits our scalability.  So
- * unless debugging is enabled, we no longer update t_max_wait, which
- * means that maximum wait time reported by the jbd2_run_stats
- * tracepoint will always be zero.
+ * t_max_wait is carefully updated here with use of atomic compare exchange.
+ * Note that there could be multiplre threads trying to do this simultaneously
+ * hence using cmpxchg to avoid any use of locks in this case.
  */
 static inline void update_t_max_wait(transaction_t *transaction,
 				     unsigned long ts)
 {
 #ifdef CONFIG_JBD2_DEBUG
+	unsigned long oldts, newts;
 	if (jbd2_journal_enable_debug &&
 	    time_after(transaction->t_start, ts)) {
-		ts = jbd2_time_diff(ts, transaction->t_start);
-		spin_lock(&transaction->t_handle_lock);
-		if (ts > transaction->t_max_wait)
-			transaction->t_max_wait = ts;
-		spin_unlock(&transaction->t_handle_lock);
+		newts = jbd2_time_diff(ts, transaction->t_start);
+		oldts = READ_ONCE(transaction->t_max_wait);
+		while (oldts < newts)
+			oldts = cmpxchg(&transaction->t_max_wait, oldts, newts);
 	}
 #endif
 }
@@ -690,7 +686,6 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records)
 		DIV_ROUND_UP(
 			handle->h_revoke_credits_requested,
 			journal->j_revoke_records_per_block);
-	spin_lock(&transaction->t_handle_lock);
 	wanted = atomic_add_return(nblocks,
 				   &transaction->t_outstanding_credits);
 
@@ -698,7 +693,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records)
 		jbd_debug(3, "denied handle %p %d blocks: "
 			  "transaction too large\n", handle, nblocks);
 		atomic_sub(nblocks, &transaction->t_outstanding_credits);
-		goto unlock;
+		goto error_out;
 	}
 
 	trace_jbd2_handle_extend(journal->j_fs_dev->bd_dev,
@@ -714,8 +709,6 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records)
 	result = 0;
 
 	jbd_debug(3, "extended handle %p by %d\n", handle, nblocks);
-unlock:
-	spin_unlock(&transaction->t_handle_lock);
 error_out:
 	read_unlock(&journal->j_state_lock);
 	return result;
@@ -860,15 +853,12 @@ void jbd2_journal_wait_updates(journal_t *journal)
 		if (!transaction)
 			break;
 
-		spin_lock(&transaction->t_handle_lock);
 		prepare_to_wait(&journal->j_wait_updates, &wait,
 				TASK_UNINTERRUPTIBLE);
 		if (!atomic_read(&transaction->t_updates)) {
-			spin_unlock(&transaction->t_handle_lock);
 			finish_wait(&journal->j_wait_updates, &wait);
 			break;
 		}
-		spin_unlock(&transaction->t_handle_lock);
 		write_unlock(&journal->j_state_lock);
 		schedule();
 		finish_wait(&journal->j_wait_updates, &wait);
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 9c3ada74ffb1..a787872e1e86 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -554,9 +554,6 @@ struct transaction_chp_stats_s {
  *    ->j_list_lock
  *
  *    j_state_lock
- *    ->t_handle_lock
- *
- *    j_state_lock
  *    ->j_list_lock			(journal_unmap_buffer)
  *
  */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCHv2 REBASED 2/2] jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait
  2022-02-16  7:00 [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Ritesh Harjani
  2022-02-16  7:00 ` [PATCHv2 REBASED 1/2] " Ritesh Harjani
@ 2022-02-16  7:00 ` Ritesh Harjani
  2022-03-03  0:41 ` [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Theodore Ts'o
  2 siblings, 0 replies; 4+ messages in thread
From: Ritesh Harjani @ 2022-02-16  7:00 UTC (permalink / raw)
  To: linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Andreas Dilger, linux-fsdevel,
	Ritesh Harjani

CONFIG_JBD2_DEBUG and jbd2_journal_enable_debug knobs were added in
update_t_max_wait(), since earlier it used to take a spinlock for updating
t_max_wait, which could cause a bottleneck while starting a txn
(start_this_handle()).
Since in previous patch, we have killed t_handle_lock completely, we
could get rid of this debug config and knob to let t_max_wait be updated
by default again.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/transaction.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 83801a8be078..73ed02f061e1 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -141,20 +141,19 @@ static void jbd2_get_transaction(journal_t *journal,
  * t_max_wait is carefully updated here with use of atomic compare exchange.
  * Note that there could be multiplre threads trying to do this simultaneously
  * hence using cmpxchg to avoid any use of locks in this case.
+ * With this t_max_wait can be updated w/o enabling jbd2_journal_enable_debug.
  */
 static inline void update_t_max_wait(transaction_t *transaction,
 				     unsigned long ts)
 {
-#ifdef CONFIG_JBD2_DEBUG
 	unsigned long oldts, newts;
-	if (jbd2_journal_enable_debug &&
-	    time_after(transaction->t_start, ts)) {
+
+	if (time_after(transaction->t_start, ts)) {
 		newts = jbd2_time_diff(ts, transaction->t_start);
 		oldts = READ_ONCE(transaction->t_max_wait);
 		while (oldts < newts)
 			oldts = cmpxchg(&transaction->t_max_wait, oldts, newts);
 	}
-#endif
 }
 
 /*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock
  2022-02-16  7:00 [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Ritesh Harjani
  2022-02-16  7:00 ` [PATCHv2 REBASED 1/2] " Ritesh Harjani
  2022-02-16  7:00 ` [PATCHv2 REBASED 2/2] jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait Ritesh Harjani
@ 2022-03-03  0:41 ` Theodore Ts'o
  2 siblings, 0 replies; 4+ messages in thread
From: Theodore Ts'o @ 2022-03-03  0:41 UTC (permalink / raw)
  To: Ritesh Harjani, linux-ext4
  Cc: Theodore Ts'o, linux-fsdevel, Andreas Dilger, Jan Kara

On Wed, 16 Feb 2022 12:30:34 +0530, Ritesh Harjani wrote:
> This is mainly just rebasing the patch series on top of recent use-after-free
> fix submitted [4], due to conflict in function jbd2_journal_wait_updates().
> No other changes apart from that.
> 
> 
> Testing
> ========
> I have again tested xfstests with -g log,metadata,auto group with 4k bs
> config (COFIG_KASAN disabled). I haven't found any regression due to these
> patches in my testing.
> 
> [...]

Applied, thanks!

[1/2] jbd2: Kill t_handle_lock transaction spinlock
      commit: f7f497cb702462e8505ff3d8d4e7722ad95626a1
[2/2] jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait
      commit: 2d4429205882817100e5e88870ce0663d30c77af

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-03-03  0:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-16  7:00 [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Ritesh Harjani
2022-02-16  7:00 ` [PATCHv2 REBASED 1/2] " Ritesh Harjani
2022-02-16  7:00 ` [PATCHv2 REBASED 2/2] jbd2: Remove CONFIG_JBD2_DEBUG to update t_max_wait Ritesh Harjani
2022-03-03  0:41 ` [PATCHv2 REBASED 0/2] jbd2: Kill t_handle_lock transaction spinlock Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.