* rtmutex, pi_blocked_on, and blk_flush_plug()
@ 2023-02-10  4:31 Crystal Wood
  2023-02-16 14:17 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Crystal Wood @ 2023-02-10  4:31 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior, John Keeping
  Cc: linux-rt-users, linux-kernel

Hello!

It is possible for blk_flush_plug() to be called while
current->pi_blocked_on is set, in the process of trying to acquire an rwsem.
If the block flush blocks trying to acquire some lock, then it appears that
current->pi_blocked_on will be overwritten, and then set to NULL once that
lock is acquired, even though the task is still blocked on the original
rwsem.  Am I missing something that deals with this situation?  It seems
like the lock types that are supposed to call blk_flush_plug() should do so
before calling task_blocks_on_rt_mutex().

I originally noticed this while investigating a related issue on an older
RHEL kernel where task_blocked_on_mutex() has a BUG_ON if entered with
current->pi_blocked_on non-NULL.  Current kernels lack this check.

To demonstrate that the recursive blocking scenario can happen (without
actually waiting to hit the scenario where the second lock is contended),
I put a WARN_ON_ONCE(current->pi_blocked_on) in rtlock_lock() (plus a few
other places, but this is the one I hit):

[83246.479396] ------------[ cut here ]------------
[83246.479400] WARNING: CPU: 94 PID: 75492 at kernel/locking/spinlock_rt.c:40 rt_spin_lock+0x46/0x70
[83246.494329] Modules linked in: xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) rfkill(E) sunrpc(E) vfat(E) fat(E) intel_rapl_msr(E) intel_rapl_common(E) amd64_edac(E) edac_mce_amd(E) kvm_amd(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) rapl(E) ipmi_si(E) ipmi_devintf(E) wmi_bmof(E) ipmi_msghandler(E) pcspkr(E) mlx5_ib(E) ib_uverbs(E) joydev(E) sp5100_tco(E) ib_core(E) ccp(E) ptdma(E) i2c_piix4(E) k10temp(E) i2c_designware_platform(E) i2c_designware_core(E) acpi_cpufreq(E) xfs(E) libcrc32c(E) crc32c_intel(E) sd_mod(E) t10_pi(E) ast(E) crc64_rocksoft(E) crc64(E) drm_kms_helper(E) sg(E) syscopyarea(E) sysfillrect(E) mlx5_core(E) sysimgblt(E) i2c_algo_bit(E) ahci(E) drm_shmem_helper(E) libahci(E) mlxfw(E) drm(E) libata(E) tls(E) psample(E) wmi(E) dm_mirror(E) dm_region_hash(E)
[83246.494667]  dm_log(E) dm_mod(E) fuse(E)
[83246.593228] CPU: 94 PID: 75492 Comm: kworker/u514:1 Tainted: G            E      6.2.0-rc3-rt1.std+ #4
[83246.603919] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM1009B 05/14/2022
[83246.614174] Workqueue: writeback wb_workfn (flush-253:2)
[83246.620021] RIP: 0010:rt_spin_lock+0x46/0x70
[83246.626265] Code: 48 89 fb 8b 90 34 08 00 00 75 1f 65 48 8b 14 25 c0 14 03 00 31 c0 f0 48 0f b1 53 18 75 10 e8 d1 45 5c ff 5b 5d e9 fa c2 56 ff <0f> 0b eb dd 48 89 df e8 ee 42 00 00 48 89 df 48 89 c5 e8 63 ff ff
[83246.646613] RSP: 0018:ffffafd3f0cb7438 EFLAGS: 00010286
[83246.651943] RAX: ffff8ee41531cc80 RBX: ffff8ec428839f48 RCX: 0000000000000001
[83246.659327] RDX: 0000000000000000 RSI: ffffafd3f0cb7510 RDI: ffff8ec428839f48
[83246.669216] RBP: 0000000000000001 R08: ffff8ec4494d3ec8 R09: 0000000000031480
[83246.676709] R10: ffff8ec41ab79000 R11: 0000000000000000 R12: ffffafd3f0cb7510
[83246.684536] R13: ffffafd3f0cb7510 R14: ffffcfd37e596bc0 R15: 0000000000000001
[83246.692026] FS:  0000000000000000(0000) GS:ffff8f02c7980000(0000) knlGS:0000000000000000
[83246.702144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83246.709800] CR2: 0000565164f09000 CR3: 00000003a2f58003 CR4: 0000000000770ee0
[83246.718014] PKRU: 55555554
[83246.722313] Call Trace:
[83246.726219]  <TASK>
[83246.729719]  dd_insert_requests+0x55/0x360
[83246.735712]  blk_mq_sched_insert_requests+0x64/0x170
[83246.741579]  blk_mq_flush_plug_list+0x135/0x330
[83246.747219]  __blk_flush_plug+0xf6/0x130
[83246.752481]  schedule+0x46/0xd0
[83246.756625]  __rt_mutex_slowlock.isra.18.constprop.21+0x3c2/0x9d0
[83246.763522]  __rwbase_read_lock+0xae/0x210
[83246.768737]  xlog_cil_commit+0x1e0/0xc80 [xfs]
[83246.771346]  ? __pfx_xfs_trans_precommit_sort+0x10/0x10 [xfs]
[83246.771697]  __xfs_trans_commit+0x135/0x400 [xfs]
[83246.772313]  xfs_bmapi_convert_delalloc+0x39f/0x500 [xfs]
[83246.773696]  xfs_map_blocks+0x1ad/0x480 [xfs]
[83246.774339]  iomap_writepage_map+0x105/0x710
[83246.774345]  write_cache_pages+0x1bf/0x4f0
[83246.774640]  ? __pfx_iomap_do_writepage+0x10/0x10
[83246.774644]  ? submit_bio_noacct_nocheck+0xe9/0x2b0
[83246.775236]  iomap_writepages+0x20/0x40
[83246.775238]  xfs_vm_writepages+0x7e/0xb0 [xfs]
[83246.775628]  do_writepages+0xd9/0x1c0
[83246.775633]  __writeback_single_inode+0x4a/0x490
[83246.775636]  writeback_sb_inodes+0x1f4/0x4c0
[83246.775640]  __writeback_inodes_wb+0x5d/0xd0
[83246.775643]  wb_writeback+0x244/0x370
[83246.775646]  wb_workfn+0x3c0/0x530
[83246.776510]  ? __raw_spin_unlock+0x10/0x50
[83246.776514]  ? finish_task_switch+0xbd/0x320
[83246.776517]  process_one_work+0x1a5/0x400
[83246.776521]  ? __pfx_worker_thread+0x10/0x10
[83246.776523]  worker_thread+0x34/0x360
[83246.776526]  ? __pfx_worker_thread+0x10/0x10
[83246.776527]  kthread+0x10f/0x130
[83246.776530]  ? __pfx_kthread+0x10/0x10
[83246.776532]  ret_from_fork+0x2c/0x50
[83246.776537]  </TASK>
[83246.776538] ---[ end trace 0000000000000000 ]---
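
For reference, the check described above sits roughly here (a minimal
sketch against the v6.2 rtlock_lock() in kernel/locking/spinlock_rt.c;
the exact placement in my tree may differ slightly):

static __always_inline void rtlock_lock(struct rt_mutex_base *rtm)
{
	/*
	 * Debug check: the task must not already be queued on another
	 * rtmutex-based lock (i.e. pi_blocked_on set) when it starts
	 * blocking or spinning on this one.
	 */
	WARN_ON_ONCE(current->pi_blocked_on);

	if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
		rtlock_slowlock(rtm);
}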

-Crystal



* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-10  4:31 rtmutex, pi_blocked_on, and blk_flush_plug() Crystal Wood
@ 2023-02-16 14:17 ` Sebastian Andrzej Siewior
  2023-02-16 15:05   ` Sebastian Andrzej Siewior
  2023-02-20  9:49   ` Thomas Gleixner
  0 siblings, 2 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-02-16 14:17 UTC (permalink / raw)
  To: Crystal Wood; +Cc: Thomas Gleixner, John Keeping, linux-rt-users, linux-kernel

On 2023-02-09 22:31:57 [-0600], Crystal Wood wrote:
> Hello!
Hi,

> It is possible for blk_flush_plug() to be called while
> current->pi_blocked_on is set, in the process of trying to acquire an rwsem.
> If the block flush blocks trying to acquire some lock, then it appears that
> current->pi_blocked_on will be overwritten, and then set to NULL once that
> lock is acquired, even though the task is still blocked on the original
> rwsem.  Am I missing something that deals with this situation?  It seems
> like the lock types that are supposed to call blk_flush_plug() should do so
> before calling task_blocks_on_rt_mutex().

Do you experience a problem in v6.1-RT?

> I originally noticed this while investigating a related issue on an older
> RHEL kernel where task_blocked_on_mutex() has a BUG_ON if entered with
> current->pi_blocked_on non-NULL.  Current kernels lack this check.

The logic is different but the deadlock should be avoided:
- mutex_t and rw_semaphore invoke schedule() while blocking on a lock.
  As part of schedule() sched_submit_work() is invoked.
  This is the same in RT and !RT, so I don't expect any deadlock since
  the involved locks are the same.

- spinlock_t invokes schedule_rtlock() which avoids sched_submit_work().
  This is the same behaviour as on !RT, which spins and does not submit
  work either.
  rwlock_t should behave the same way but invokes schedule() instead.
  This looks wrong, and it could deadlock in sched_submit_work()
  (a rough sketch of both scheduling entry points is below).
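
For reference, the two entry points, roughly (paraphrased from
kernel/sched/core.c around v6.2; details trimmed, not verbatim):

asmlinkage __visible void __sched schedule(void)
{
	struct task_struct *tsk = current;

	/* Notifies wq/io-wq workers and flushes plugged block I/O. */
	sched_submit_work(tsk);
	do {
		preempt_disable();
		__schedule(SM_NONE);
		sched_preempt_enable_no_resched();
	} while (need_resched());
	sched_update_worker(tsk);
}

/* Used when blocking with TASK_RTLOCK_WAIT on the RT lock substitutions. */
void __sched notrace schedule_rtlock(void)
{
	/* No sched_submit_work(), mirroring the spinning !RT behaviour. */
	do {
		preempt_disable();
		__schedule(SM_RTLOCK_WAIT);
		sched_preempt_enable_no_resched();
	} while (need_resched());
}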

> To demonstrate that the recursive blocking scenario can happen (without
> actually waiting to hit the scenario where the second lock is contended),
> I put a WARN_ON_ONCE(current->pi_blocked_on) in rtlock_lock() (plus a few
> other places, but this is the one I hit):

XFS does not use rwlock_t directly.

> -Crystal

Sebastian


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-16 14:17 ` Sebastian Andrzej Siewior
@ 2023-02-16 15:05   ` Sebastian Andrzej Siewior
  2023-02-20  9:49   ` Thomas Gleixner
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-02-16 15:05 UTC (permalink / raw)
  To: Crystal Wood; +Cc: Thomas Gleixner, John Keeping, linux-rt-users, linux-kernel

On 2023-02-16 15:18:03 [+0100], To Crystal Wood wrote:
> On 2023-02-09 22:31:57 [-0600], Crystal Wood wrote:
> > Hello!
Hi,

> - spinlock_t invokes schedule_rtlock() which avoids sched_submit_work().
>   This is the same behaviour as on !RT, which spins and does not submit
>   work either.
>   rwlock_t should behave the same way but invokes schedule() instead.
>   This looks wrong, and it could deadlock in sched_submit_work().

I have to correct myself: rwlock_t also uses schedule_rtlock(), which
means it does not invoke sched_submit_work() either.

Therefore I don't expect any deadlock here.
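
The different behaviour comes from rwbase_rt.c being included twice with
different helpers; roughly (paraphrased from kernel/locking/spinlock_rt.c
and kernel/locking/rwsem.c, v6.2-era, not verbatim):

/* rwlock_t build (spinlock_rt.c): blocking goes through the rtlock slow
 * path and schedule_rtlock(), so sched_submit_work() never runs.
 */
#define rwbase_rtmutex_slowlock_locked(rtm, state)	\
	({ rtlock_slowlock_locked(rtm); 0; })
#define rwbase_schedule()				\
	schedule_rtlock()
#include "rwbase_rt.c"

/* rw_semaphore build (rwsem.c): blocking goes through the regular rtmutex
 * slow path and schedule(), which does run sched_submit_work().
 */
#define rwbase_rtmutex_slowlock_locked(rtm, state)	\
	__rt_mutex_slowlock_locked(rtm, NULL, state)
#define rwbase_schedule()				\
	schedule()
#include "rwbase_rt.c"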

> > -Crystal

Sebastian


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-16 14:17 ` Sebastian Andrzej Siewior
  2023-02-16 15:05   ` Sebastian Andrzej Siewior
@ 2023-02-20  9:49   ` Thomas Gleixner
  2023-02-20 11:04     ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2023-02-20  9:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Crystal Wood
  Cc: John Keeping, linux-rt-users, linux-kernel

On Thu, Feb 16 2023 at 15:17, Sebastian Andrzej Siewior wrote:
> On 2023-02-09 22:31:57 [-0600], Crystal Wood wrote:
>> It is possible for blk_flush_plug() to be called while
>> current->pi_blocked_on is set, in the process of trying to acquire an rwsem.
>> If the block flush blocks trying to acquire some lock, then it appears that
>> current->pi_blocked_on will be overwritten, and then set to NULL once that
>> lock is acquired, even though the task is still blocked on the original
>> rwsem.  Am I missing something that deals with this situation?  It seems
>> like the lock types that are supposed to call blk_flush_plug() should do so
>> before calling task_blocks_on_rt_mutex().
>
> Do you experience a problem in v6.1-RT?
>
>> I originally noticed this while investigating a related issue on an older
>> RHEL kernel where task_blocked_on_mutex() has a BUG_ON if entered with
>> current->pi_blocked_on non-NULL.  Current kernels lack this check.
>
> The logic is different but the deadlock should be avoided:
> - mutex_t and rw_semaphore invoke schedule() while blocking on a lock.
>   As part of schedule() sched_submit_work() is invoked.
>   This is the same in RT and !RT, so I don't expect any deadlock since
>   the involved locks are the same.

Huch?

xlog_cil_commit()
  down_read(&cil->xc_ctx_lock)
    __rwbase_read_lock()
       __rt_mutex_slowlock()
         current->pi_blocked_on = ...
         schedule()
           __blk_flush_plug()
             dd_insert_requests()
               rt_spin_lock()
                 WARN_ON(current->pi_blocked_on);

So something like the below is required. But that might not cut it
completely. wq_worker_sleeping() is fine, but I'm not convinced that
io_wq_worker_sleeping() is safe. That needs some investigation.
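
For reference, the hooks in question run from sched_submit_work() right
before the plug flush, and sched_update_worker() undoes them after
schedule() returns; roughly (paraphrased from kernel/sched/core.c, not
verbatim):

static inline void sched_submit_work(struct task_struct *tsk)
{
	unsigned int task_flags;

	if (task_is_running(tsk))
		return;

	task_flags = tsk->flags;
	/* Let workqueue/io-wq know that a worker is about to sleep. */
	if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
		if (task_flags & PF_WQ_WORKER)
			wq_worker_sleeping(tsk);
		else
			io_wq_worker_sleeping(tsk);
	}

	/* spinlock/rwlock waits must never end up here. */
	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);

	/* Submit plugged block I/O to avoid deadlocks. */
	blk_flush_plug(tsk->plug, true);
}

static void sched_update_worker(struct task_struct *tsk)
{
	if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
		if (tsk->flags & PF_WQ_WORKER)
			wq_worker_running(tsk);
		else
			io_wq_worker_running(tsk);
	}
}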

Thanks,

        tglx
---

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6666,6 +6666,9 @@ static inline void sched_submit_work(str
 	 */
 	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
 
+	if (current->pi_blocked_on)
+		return;
+
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-20  9:49   ` Thomas Gleixner
@ 2023-02-20 11:04     ` Sebastian Andrzej Siewior
  2023-02-20 11:42       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-02-20 11:04 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Crystal Wood, John Keeping, linux-rt-users, linux-kernel

On 2023-02-20 10:49:26 [+0100], Thomas Gleixner wrote:
> > The logic is different but the deadlock should be avoided:
> > - mutex_t and rw_semaphore invoke schedule() while blocking on a lock.
> >   As part of schedule() sched_submit_work() is invoked.
> >   This is the same in RT and !RT, so I don't expect any deadlock since
> >   the involved locks are the same.
> 
> Huch?
> 
> xlog_cil_commit()
>   down_read(&cil->xc_ctx_lock)
>     __rwbase_read_lock()
>        __rt_mutex_slowlock()
>          current->pi_blocked_on = ...
>          schedule()
>            __blk_flush_plug()
>              dd_insert_requests()
>                rt_spin_lock()
>                  WARN_ON(current->pi_blocked_on);
> 
> So something like the below is required. But that might not cut it
> completely. wq_worker_sleeping() is fine, but I'm not convinced that
> io_wq_worker_sleeping() is safe. That needs some investigation.

Okay, so this makes sense.

> Thanks,
> 
>         tglx
> ---
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6666,6 +6666,9 @@ static inline void sched_submit_work(str
>  	 */
>  	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
>  
> +	if (current->pi_blocked_on)
> +		return;
> +

The ->pi_blocked_on field is set by __rwbase_read_lock() before
schedule() is invoked while blocking on the sleeping lock. By doing this
we skip __blk_flush_plug() and as such may deadlock, because we go to
sleep after having made I/O progress earlier which is not yet globally
visible but might be (s/might be/is/, in the deadlock case) expected by
the owner of the lock.

We could trylock, and if that fails, flush and then take the lock
properly. This would ensure that pi_blocked_on is only set after we have
flushed.

>  	/*
>  	 * If we are going to sleep and we have plugged IO queued,
>  	 * make sure to submit it to avoid deadlocks.

Sebastian


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-20 11:04     ` Sebastian Andrzej Siewior
@ 2023-02-20 11:42       ` Sebastian Andrzej Siewior
  2023-02-20 18:21         ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-02-20 11:42 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Crystal Wood, John Keeping, linux-rt-users, linux-kernel

On 2023-02-20 12:04:56 [+0100], To Thomas Gleixner wrote:
> The ->pi_blocked_on field is set by __rwbase_read_lock() before
> schedule() is invoked while blocking on the sleeping lock. By doing this
> we skip __blk_flush_plug() and as such may deadlock, because we go to
> sleep after having made I/O progress earlier which is not yet globally
> visible but might be (s/might be/is/, in the deadlock case) expected by
> the owner of the lock.
> 
> We could trylock, and if that fails, flush and then take the lock
> properly. This would ensure that pi_blocked_on is only set after we have
> flushed.

Something like the diff below takes down_read() and down_write() into
account; read_lock()/write_lock() are excluded via the state check.
mutex_t is still missing. There the flush needs to happen before
pi_blocked_on is assigned, before the wait lock is acquired:

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 728f434de2bbf..95731d0c9e87f 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1700,6 +1700,13 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	if (state != TASK_RTLOCK_WAIT) {
+		/*
+		 * If we are going to sleep and we have plugged IO queued,
+		 * make sure to submit it to avoid deadlocks.
+		 */
+		blk_flush_plug(current->plug, true);
+	}
 	return rt_mutex_slowlock(lock, NULL, state);
 }
 #endif /* RT_MUTEX_BUILD_MUTEX */
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index c201aadb93017..6c6c88a2d9228 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -143,6 +143,14 @@ static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
 	if (rwbase_read_trylock(rwb))
 		return 0;
 
+	if (state != TASK_RTLOCK_WAIT) {
+		/*
+		 * If we are going to sleep and we have plugged IO queued,
+		 * make sure to submit it to avoid deadlocks.
+		 */
+		blk_flush_plug(current->plug, true);
+	}
+
 	return __rwbase_read_lock(rwb, state);
 }
 
Sebastian


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-20 11:42       ` Sebastian Andrzej Siewior
@ 2023-02-20 18:21         ` Thomas Gleixner
  2023-03-05  5:39           ` Crystal Wood
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2023-02-20 18:21 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Crystal Wood, John Keeping, linux-rt-users, linux-kernel, Peter Zijlstra

On Mon, Feb 20 2023 at 12:42, Sebastian Andrzej Siewior wrote:
> On 2023-02-20 12:04:56 [+0100], To Thomas Gleixner wrote:
>> The ->pi_blocked_on field is set by __rwbase_read_lock() before
>> schedule() is invoked while blocking on the sleeping lock. By doing this
>> we skip __blk_flush_plug() and as such may deadlock, because we go to
>> sleep after having made I/O progress earlier which is not yet globally
>> visible but might be (s/might be/is/, in the deadlock case) expected by
>> the owner of the lock.

Fair enough.

> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1700,6 +1700,13 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
>  	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
>  		return 0;
>  
> +	if (state != TASK_RTLOCK_WAIT) {
> +		/*
> +		 * If we are going to sleep and we have plugged IO queued,
> +		 * make sure to submit it to avoid deadlocks.
> +		 */
> +		blk_flush_plug(current->plug, true);

This still leaves the problem vs. io_wq_worker_sleeping() and its
running() counterpart after schedule().

Aside from that, for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
lock operation whether the lock is contended or not.
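
(With debug enabled the acquire fast path is compiled out, so the slow
path, and with the hunk above the flush, is taken unconditionally;
roughly, from kernel/locking/rtmutex.c:)

#ifndef CONFIG_DEBUG_RT_MUTEXES
static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
						     struct task_struct *old,
						     struct task_struct *new)
{
	return try_cmpxchg_acquire(&lock->owner, &old, new);
}
#else
/* Debug builds always fail the fast path and fall into the slow path. */
static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
						     struct task_struct *old,
						     struct task_struct *new)
{
	return false;
}
#endif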

Grmbl.


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-02-20 18:21         ` Thomas Gleixner
@ 2023-03-05  5:39           ` Crystal Wood
  2023-03-22 16:41             ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Crystal Wood @ 2023-03-05  5:39 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior
  Cc: John Keeping, linux-rt-users, linux-kernel, Peter Zijlstra

On Mon, 2023-02-20 at 19:21 +0100, Thomas Gleixner wrote:
> On Mon, Feb 20 2023 at 12:42, Sebastian Andrzej Siewior wrote:
> > On 2023-02-20 12:04:56 [+0100], To Thomas Gleixner wrote:
> > > The ->pi_blocked_on field is set by __rwbase_read_lock() before
> > > schedule() is invoked while blocking on the sleeping lock. By doing this
> > > we skip __blk_flush_plug() and as such may deadlock, because we go to
> > > sleep after having made I/O progress earlier which is not yet globally
> > > visible but might be (s/might be/is/, in the deadlock case) expected
> > > by the owner of the lock.
> 
> Fair enough.
> 
> > --- a/kernel/locking/rtmutex.c
> > +++ b/kernel/locking/rtmutex.c
> > @@ -1700,6 +1700,13 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
> >         if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
> >                 return 0;
> >  
> > +       if (state != TASK_RTLOCK_WAIT) {
> > +               /*
> > +                * If we are going to sleep and we have plugged IO queued,
> > +                * make sure to submit it to avoid deadlocks.
> > +                */
> > +               blk_flush_plug(current->plug, true);
> 
> This still leaves the problem vs. io_wq_worker_sleeping() and it's
> running() counterpart after schedule().

The closest thing I can see to a problem there is io_wqe_dec_running()->
io_queue_worker_create()->io_wq_cancel_tw_create()->kfree(), but that only
happens with func == create_worker_cont(), and io_wqe_dec_running() uses 
create_worker_cb(). 

Are there any workloads I could run to stress out that path (with my
asserts in place)?

-Scott



* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-03-05  5:39           ` Crystal Wood
@ 2023-03-22 16:41             ` Sebastian Andrzej Siewior
  2023-03-23 14:50               ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-03-22 16:41 UTC (permalink / raw)
  To: Crystal Wood
  Cc: Thomas Gleixner, John Keeping, linux-rt-users, linux-kernel,
	Peter Zijlstra

On 2023-03-04 23:39:57 [-0600], Crystal Wood wrote:
> > This still leaves the problem vs. io_wq_worker_sleeping() and its
> > running() counterpart after schedule().
> 
> The closest thing I can see to a problem there is io_wqe_dec_running()->
> io_queue_worker_create()->io_wq_cancel_tw_create()->kfree(), but that only
> happens with func == create_worker_cont(), and io_wqe_dec_running() uses 
> create_worker_cb(). 

So we may be good then. The while loop in io_wq_cancel_tw_create() worries
me a little. I am not sure if only the submitted work gets cancelled or
maybe others as well, including the one leading to the kfree.

> Are there any workloads I could run to stress out that path (with my
> asserts in place)?

None that I can think of. Maybe something from the io_uring test suite.
But then you may need to bend the code to get the task_add() to fail.
Maybe Jens knows something.

> -Scott

Sebastian


* Re: rtmutex, pi_blocked_on, and blk_flush_plug()
  2023-03-22 16:41             ` Sebastian Andrzej Siewior
@ 2023-03-23 14:50               ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-03-23 14:50 UTC (permalink / raw)
  To: Crystal Wood
  Cc: Thomas Gleixner, John Keeping, linux-rt-users, linux-kernel,
	Peter Zijlstra

On 2023-03-22 17:41:31 [+0100], To Crystal Wood wrote:
> On 2023-03-04 23:39:57 [-0600], Crystal Wood wrote:
> > Are there any workloads I could run to stress out that path (with my
> > asserts in place)?
> 
> None that I can think of. Maybe something from the io-ring test suite.
> But then you may need to bend to code to get the task_add() to fail.
> Maybe Jens knows something.

Quick question: can you throw your testing at the patch I sent, to check
whether it really cures the problem you see?

> > -Scott

Sebastian

