linux-kernel.vger.kernel.org archive mirror
* Tasks stuck jbd2 for a long time
@ 2023-08-15 19:01 Bhatnagar, Rishabh
  2023-08-16  2:28 ` Theodore Ts'o
  0 siblings, 1 reply; 15+ messages in thread
From: Bhatnagar, Rishabh @ 2023-08-15 19:01 UTC (permalink / raw)
  To: jack, tytso; +Cc: linux-ext4, linux-kernel, gregkh, Park, SeongJae

Hi Jan/Ted

We are seeing lockups in the ext4 journaling code (5.10 - 6.1) under heavy
load. The stack traces suggest that the kjournald thread is blocked for a
long time.
The kjournald thread seems to be waiting on a writeback thread to decrement
t_updates, and the other writeback threads seem to be waiting on kjournald
to flush the current transaction.
The system completely hangs in this case and system IO drops to zero after
some time.

This is a RAID0 setup with 4 NVMe disks (7TB each). There is 390GB of RAM
available. The issue occurs when a user starts downloading a big enough
data set (60-70% of disk capacity).
This was observed on 5.10 kernels (5.10.184). We tried moving to 6.1
kernels and saw a similar issue. The system completely freezes and we see
these stack traces on the serial console.

We have tried experimenting with dirty_ratio, dirty_background_ratio and
noatime/lazytime updates but don't see much improvement.
One thing that helps is disabling journaling completely. Testing is also
ongoing after increasing the journal size (current size 128MB); a rough
sketch of how we resize it is below.
We are trying to understand why the journal threads are stuck for such a
long time, since this stalls all IO in the system. Let us know if you have
seen something similar before and if there are any suggestions that we can
try.
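
Roughly how we recreate the journal with a larger size (the mount point and
the 1GB value here are just examples; this is done with the filesystem
unmounted):

   dumpe2fs -h /dev/md0 | grep -i journal   # current journal parameters
   umount /mnt/data
   tune2fs -O ^has_journal /dev/md0         # drop the existing 128MB journal
   tune2fs -J size=1024 /dev/md0            # recreate it with a 1GB journal
   mount /dev/md0 /mnt/data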

INFO: task kworker/u145:1:376 blocked for more than 120 seconds.
       Not tainted 5.10.184-175.731.amzn2.x86_64
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u145:1  state:D stack:    0 pid:  376 ppid:     2
flags:0x00004000
Workqueue: writeback wb_workfn (flush-9:0)
Call Trace:
  __schedule+0x1f9/0x660
  schedule+0x46/0xb0
  wait_transaction_locked+0x8a/0xd0 [jbd2]
  ? add_wait_queue_exclusive+0x70/0x70
  add_transaction_credits+0xd6/0x2a0 [jbd2]
  ? blk_mq_flush_plug_list+0x100/0x1a0
  start_this_handle+0x12d/0x4d0 [jbd2]
  ? jbd2__journal_start+0xc3/0x1e0 [jbd2]
  ? kmem_cache_alloc+0x132/0x270
  jbd2__journal_start+0xfb/0x1e0 [jbd2]
  __ext4_journal_start_sb+0xfb/0x110 [ext4]
  ext4_writepages+0x32c/0x790 [ext4]
  do_writepages+0x34/0xc0
  ? write_inode+0x54/0xd0
  __writeback_single_inode+0x39/0x200
  writeback_sb_inodes+0x20d/0x4a0
  __writeback_inodes_wb+0x4c/0xe0
  wb_writeback+0x1d8/0x2a0
  wb_do_writeback+0x166/0x180
  wb_workfn+0x6e/0x250
  ? __switch_to_asm+0x3a/0x60
  ? __schedule+0x201/0x660
  process_one_work+0x1b0/0x350
  worker_thread+0x49/0x310
  ? process_one_work+0x350/0x350
  kthread+0x11b/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x22/0x30

INFO: task jbd2/md0-8:8068 blocked for more than 120 seconds.

       Not tainted 5.10.184-175.731.amzn2.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2 
flags:0x00004080
Call Trace:
__schedule+0x1f9/0x660
  schedule+0x46/0xb0
  jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]
  ? update_load_avg+0x7a/0x5d0
  ? add_wait_queue_exclusive+0x70/0x70
  ? lock_timer_base+0x61/0x80
  ? kjournald2+0xcf/0x360 [jbd2]
  kjournald2+0xcf/0x360 [jbd2]
  ? add_wait_queue_exclusive+0x70/0x70
  ? load_superblock.part.0+0xb0/0xb0 [jbd2]
  kthread+0x11b/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x22/0x30

INFO: task kvstore-leaf:39161 blocked for more than 121 seconds.
       Not tainted 5.10.184-175.731.amzn2.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kvstore-leaf    state:D stack:    0 pid:39161 ppid: 39046 
flags:0x00000080
Call Trace:
  __schedule+0x1f9/0x660
  schedule+0x46/0xb0
  wait_transaction_locked+0x8a/0xd0 [jbd2]
  ? add_wait_queue_exclusive+0x70/0x70
  add_transaction_credits+0xd6/0x2a0 [jbd2]
  start_this_handle+0x12d/0x4d0 [jbd2]
  ? jbd2__journal_start+0x91/0x1e0 [jbd2]
  ? kmem_cache_alloc+0x132/0x270
  jbd2__journal_start+0xfb/0x1e0 [jbd2]
  __ext4_journal_start_sb+0xfb/0x110 [ext4]
  ext4_dirty_inode+0x3d/0x90 [ext4]
  __mark_inode_dirty+0x196/0x300
  generic_update_time+0x68/0xd0
  file_update_time+0x127/0x140
  ? generic_write_checks+0x61/0xd0
  ext4_buffered_write_iter+0x52/0x160 [ext4]
  new_sync_write+0x11c/0x1b0
  vfs_write+0x1c9/0x260
  ksys_write+0x5f/0xe0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x61/0xc6


Thanks
Rishabh



* Re: Tasks stuck jbd2 for a long time
  2023-08-15 19:01 Tasks stuck jbd2 for a long time Bhatnagar, Rishabh
@ 2023-08-16  2:28 ` Theodore Ts'o
  2023-08-16  3:57   ` Bhatnagar, Rishabh
  0 siblings, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2023-08-16  2:28 UTC (permalink / raw)
  To: Bhatnagar, Rishabh; +Cc: jack, linux-ext4, linux-kernel, gregkh, Park, SeongJae

It would be helpful if you could translate the addresses in the stack
traces to line numbers.  See [1] and the script
./scripts/decode_stacktrace.sh in the kernel sources.  (It is
referenced in the web page at [1].)

[1] https://docs.kernel.org/admin-guide/bug-hunting.html

Of course, in order to interpret the line numbers, we'll need a
pointer to the git repo of your kernel sources and the git commit ID
you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
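
For example, something along these lines (the input/output file names and
module path are just placeholders):

    ./scripts/decode_stacktrace.sh vmlinux auto /lib/modules/$(uname -r) \
        < hung-task-traces.txt > decoded.txt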

The stack trace for which I am particularly interested is the one for
the jbd2/md0-8 task, e.g.:

>       Not tainted 5.10.184-175.731.amzn2.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
> flags:0x00004080
> Call Trace:
> __schedule+0x1f9/0x660
>  schedule+0x46/0xb0
>  jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
>  ? update_load_avg+0x7a/0x5d0
>  ? add_wait_queue_exclusive+0x70/0x70
>  ? lock_timer_base+0x61/0x80
>  ? kjournald2+0xcf/0x360 [jbd2]
>  kjournald2+0xcf/0x360 [jbd2]

Most of the other stack traces you referenced are tasks that are waiting
for the transaction commit to complete so they can proceed with some
file system operation.  The stack traces which have
start_this_handle() in them are examples of this going on.  Stack
traces of tasks that do *not* have start_this_handle() would be
especially interesting.

The question is why the commit thread is blocking, and on what.  It
could be blocking on some I/O; or some memory allocation; or waiting
for some process with an open transaction handle to close it.  The line
number of the jbd2 thread in fs/jbd2/commit.c will give us at least a
partial answer to that question.  Of course, then we'll need to answer
the next question --- why is the I/O blocked?  Or why is the memory
allocation not completing?   etc.

I could make some speculation (such as perhaps some memory allocation
is being made without GFP_NOFS, and this is causing a deadlock between
the memory allocation code, which is trying to initiate writeback, and
the transaction commit that writeback is blocked on), but without
understanding what jbd2_journal_commit_transaction() is blocking
at jbd2_journal_commit_transaction+0x35d/0x1880, that would be just a
guess - pure speculation --- without knowing more.

Cheers,

						- Ted


* Re: Tasks stuck jbd2 for a long time
  2023-08-16  2:28 ` Theodore Ts'o
@ 2023-08-16  3:57   ` Bhatnagar, Rishabh
  2023-08-16 14:53     ` Jan Kara
  0 siblings, 1 reply; 15+ messages in thread
From: Bhatnagar, Rishabh @ 2023-08-16  3:57 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: jack, linux-ext4, linux-kernel, gregkh, Park, SeongJae


On 8/15/23 7:28 PM, Theodore Ts'o wrote:
>
> It would be helpful if you can translate address in the stack trace to
> line numbers.  See [1] and the script in
> ./scripts/decode_stacktrace.sh in the kernel sources.  (It is
> referenced in the web page at [1].)
>
> [1] https://docs.kernel.org/admin-guide/bug-hunting.html
>
> Of course, in order to interpret the line numbers, we'll need a
> pointer to the git repo of your kernel sources and the git commit ID
> you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
>
> The stack trace for which I am particularly interested is the one for
> the jbd2/md0-8 task, e.g.:

Thanks for checking, Ted.

We don't have the fast_commit feature enabled, so it should correspond to
this line:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jbd2/commit.c?h=linux-5.10.y#n496

>
>>        Not tainted 5.10.184-175.731.amzn2.x86_64 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
>> flags:0x00004080
>> Call Trace:
>> __schedule+0x1f9/0x660
>>   schedule+0x46/0xb0
>>   jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
>>   ? update_load_avg+0x7a/0x5d0
>>   ? add_wait_queue_exclusive+0x70/0x70
>>   ? lock_timer_base+0x61/0x80
>>   ? kjournald2+0xcf/0x360 [jbd2]
>>   kjournald2+0xcf/0x360 [jbd2]
> Most of the other stack traces you referenced are tasks that are waiting
> for the transaction commit to complete so they can proceed with some
> file system operation.  The stack traces which have
> start_this_handle() in them are examples of this going on.  Stack
> traces of tasks that do *not* have start_this_handle() would be
> especially interesting.
I see all other stacks apart from kjournald have "start_this_handle".
>
> The question is why is the commit thread blocking, and on what.  It
> could be blocking on some I/O; or some memory allocation; or waiting
> for some process with an open transaction handle to close it.  The line
> number of the jbd2 thread in fs/jbd2/commit.c will give us at least a
> partial answer to that question.  Of course, then we'll need to answer
> the next question --- why is the I/O blocked?  Or why is the memory
> allocation not completing?   etc.

To me it looks like it's waiting on some process to close the transaction
handle.
One point to note here is that we run pretty low on memory in this use
case. The download starts eating memory really fast.

>
> I could make some speculation (such as perhaps some memory allocation
> is being made without GFP_NOFS, and this is causing a deadlock between
> the memory allocation code which is trying to initiate writeback, but
> that is blocked on the transaction commit completing), but without
> understanding what the jbd2_journal_commit_transaction() is blocking
> at  jbd2_journal_commit_transaction+0x35d/0x1880, that would be just a
> guess - pure speculation --- without knowing more.
>
> Cheers,
>
>                                                  - Ted


* Re: Tasks stuck jbd2 for a long time
  2023-08-16  3:57   ` Bhatnagar, Rishabh
@ 2023-08-16 14:53     ` Jan Kara
  2023-08-16 18:32       ` Bhatnagar, Rishabh
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2023-08-16 14:53 UTC (permalink / raw)
  To: Bhatnagar, Rishabh
  Cc: Theodore Ts'o, jack, linux-ext4, linux-kernel, gregkh, Park,
	SeongJae

On Tue 15-08-23 20:57:14, Bhatnagar, Rishabh wrote:
> On 8/15/23 7:28 PM, Theodore Ts'o wrote:
> > 
> > It would be helpful if you can translate address in the stack trace to
> > line numbers.  See [1] and the script in
> > ./scripts/decode_stacktrace.sh in the kernel sources.  (It is
> > referenced in the web page at [1].)
> > 
> > [1] https://docs.kernel.org/admin-guide/bug-hunting.html
> > 
> > Of course, in order to interpret the line numbers, we'll need a
> > pointer to the git repo of your kernel sources and the git commit ID
> > you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
> > 
> > The stack trace for which I am particularly interested is the one for
> > the jbd2/md0-8 task, e.g.:
> 
> Thanks for checking Ted.
> 
> We don't have fast_commit feature enabled. So it should correspond to this
> line:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jbd2/commit.c?h=linux-5.10.y#n496
> 
> > 
> > >        Not tainted 5.10.184-175.731.amzn2.x86_64 #1
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
> > > flags:0x00004080
> > > Call Trace:
> > > __schedule+0x1f9/0x660
> > >   schedule+0x46/0xb0
> > >   jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
> > >   ? update_load_avg+0x7a/0x5d0
> > >   ? add_wait_queue_exclusive+0x70/0x70
> > >   ? lock_timer_base+0x61/0x80
> > >   ? kjournald2+0xcf/0x360 [jbd2]
> > >   kjournald2+0xcf/0x360 [jbd2]
> > Most of the other stack traces you referenced are tasks that are waiting
> > for the transaction commit to complete so they can proceed with some
> > file system operation.  The stack traces which have
> > start_this_handle() in them are examples of this going on.  Stack
> > traces of tasks that do *not* have start_this_handle() would be
> > especially interesting.
> I see all other stacks apart from kjournald have "start_this_handle".

That would be strange. Can you post the full dmesg output after "echo w >
/proc/sysrq-trigger", ideally passed through scripts/faddr2line as Ted
suggests? Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Tasks stuck jbd2 for a long time
  2023-08-16 14:53     ` Jan Kara
@ 2023-08-16 18:32       ` Bhatnagar, Rishabh
  2023-08-16 21:52         ` Jan Kara
  0 siblings, 1 reply; 15+ messages in thread
From: Bhatnagar, Rishabh @ 2023-08-16 18:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: Theodore Ts'o, jack, linux-ext4, linux-kernel, gregkh, Park,
	SeongJae


On 8/16/23 7:53 AM, Jan Kara wrote:
>
> On Tue 15-08-23 20:57:14, Bhatnagar, Rishabh wrote:
>> On 8/15/23 7:28 PM, Theodore Ts'o wrote:
>>>
>>> It would be helpful if you can translate address in the stack trace to
>>> line numbers.  See [1] and the script in
>>> ./scripts/decode_stacktrace.sh in the kernel sources.  (It is
>>> referenced in the web page at [1].)
>>>
>>> [1] https://docs.kernel.org/admin-guide/bug-hunting.html
>>>
>>> Of course, in order to interpret the line numbers, we'll need a
>>> pointer to the git repo of your kernel sources and the git commit ID
>>> you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
>>>
>>> The stack trace for which I am particularly interested is the one for
>>> the jbd2/md0-8 task, e.g.:
>> Thanks for checking Ted.
>>
>> We don't have fast_commit feature enabled. So it should correspond to this
>> line:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jbd2/commit.c?h=linux-5.10.y#n496
>>
>>>>         Not tainted 5.10.184-175.731.amzn2.x86_64 #1
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
>>>> flags:0x00004080
>>>> Call Trace:
>>>> __schedule+0x1f9/0x660
>>>>    schedule+0x46/0xb0
>>>>    jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
>>>>    ? update_load_avg+0x7a/0x5d0
>>>>    ? add_wait_queue_exclusive+0x70/0x70
>>>>    ? lock_timer_base+0x61/0x80
>>>>    ? kjournald2+0xcf/0x360 [jbd2]
>>>>    kjournald2+0xcf/0x360 [jbd2]
>>> Most of the other stack traces you referenced are tasks that are waiting
>>> for the transaction commit to complete so they can proceed with some
>>> file system operation.  The stack traces which have
>>> start_this_handle() in them are examples of this going on.  Stack
>>> traces of tasks that do *not* have start_this_handle() would be
>>> especially interesting.
>> I see all other stacks apart from kjournald have "start_this_handle".
> That would be strange. Can you post full output of "echo w
>> /proc/sysrq-trigger" to dmesg, ideally passed through scripts/faddr2line as
> Ted suggests. Thanks!

Sure, I'll try to collect that. The system freezes when such a situation
happens and I'm not able to collect much information. I'll try to crash
the kernel, collect a kdump and see if I can get that info.

Can low available memory be a reason for a thread to not be able to
close the transaction handle for a long time?
Maybe some writeback thread starts the handle but is not able to
complete writeback?

>
>                                                                  Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR


* Re: Tasks stuck jbd2 for a long time
  2023-08-16 18:32       ` Bhatnagar, Rishabh
@ 2023-08-16 21:52         ` Jan Kara
  2023-08-16 22:53           ` Bhatnagar, Rishabh
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2023-08-16 21:52 UTC (permalink / raw)
  To: Bhatnagar, Rishabh
  Cc: Jan Kara, Theodore Ts'o, jack, linux-ext4, linux-kernel,
	gregkh, Park, SeongJae

On Wed 16-08-23 11:32:47, Bhatnagar, Rishabh wrote:
> On 8/16/23 7:53 AM, Jan Kara wrote:
> > On Tue 15-08-23 20:57:14, Bhatnagar, Rishabh wrote:
> > > On 8/15/23 7:28 PM, Theodore Ts'o wrote:
> > > > 
> > > > It would be helpful if you can translate address in the stack trace to
> > > > line numbers.  See [1] and the script in
> > > > ./scripts/decode_stacktrace.sh in the kernel sources.  (It is
> > > > referenced in the web page at [1].)
> > > > 
> > > > [1] https://docs.kernel.org/admin-guide/bug-hunting.html
> > > > 
> > > > Of course, in order to interpret the line numbers, we'll need a
> > > > pointer to the git repo of your kernel sources and the git commit ID
> > > > you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
> > > > 
> > > > The stack trace for which I am particularly interested is the one for
> > > > the jbd2/md0-8 task, e.g.:
> > > Thanks for checking Ted.
> > > 
> > > We don't have fast_commit feature enabled. So it should correspond to this
> > > line:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jbd2/commit.c?h=linux-5.10.y#n496
> > > 
> > > > >         Not tainted 5.10.184-175.731.amzn2.x86_64 #1
> > > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
> > > > > flags:0x00004080
> > > > > Call Trace:
> > > > > __schedule+0x1f9/0x660
> > > > >    schedule+0x46/0xb0
> > > > >    jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
> > > > >    ? update_load_avg+0x7a/0x5d0
> > > > >    ? add_wait_queue_exclusive+0x70/0x70
> > > > >    ? lock_timer_base+0x61/0x80
> > > > >    ? kjournald2+0xcf/0x360 [jbd2]
> > > > >    kjournald2+0xcf/0x360 [jbd2]
> > > > Most of the other stack traces you referenced are tasks that are waiting
> > > > for the transaction commit to complete so they can proceed with some
> > > > file system operation.  The stack traces which have
> > > > start_this_handle() in them are examples of this going on.  Stack
> > > > traces of tasks that do *not* have start_this_handle() would be
> > > > especially interesting.
> > > I see all other stacks apart from kjournald have "start_this_handle".
> > That would be strange. Can you post full output of "echo w
> > > /proc/sysrq-trigger" to dmesg, ideally passed through scripts/faddr2line as
> > Ted suggests. Thanks!
> 
> Sure i'll try to collect that. The system freezes when such a situation
> happens and i'm not able
> to collect much information. I'll try to crash the kernel and collect kdump
> and see if i can get that info.

Thanks!

> Can low available memory be a reason for a thread to not be able to close
> the transaction handle for a long time?
> Maybe some writeback thread starts the handle but is not able to complete
> writeback?

Well, even that would be a bug, but low memory conditions are certainly
among the less tested paths, so it is possible there's a bug lurking there.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Tasks stuck jbd2 for a long time
  2023-08-16 21:52         ` Jan Kara
@ 2023-08-16 22:53           ` Bhatnagar, Rishabh
  2023-08-17 10:49             ` Jan Kara
  0 siblings, 1 reply; 15+ messages in thread
From: Bhatnagar, Rishabh @ 2023-08-16 22:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: Theodore Ts'o, jack, linux-ext4, linux-kernel, gregkh, Park,
	SeongJae


On 8/16/23 2:52 PM, Jan Kara wrote:
>
> On Wed 16-08-23 11:32:47, Bhatnagar, Rishabh wrote:
>> On 8/16/23 7:53 AM, Jan Kara wrote:
>>> On Tue 15-08-23 20:57:14, Bhatnagar, Rishabh wrote:
>>>> On 8/15/23 7:28 PM, Theodore Ts'o wrote:
>>>>>
>>>>> It would be helpful if you can translate address in the stack trace to
>>>>> line numbers.  See [1] and the script in
>>>>> ./scripts/decode_stacktrace.sh in the kernel sources.  (It is
>>>>> referenced in the web page at [1].)
>>>>>
>>>>> [1] https://docs.kernel.org/admin-guide/bug-hunting.html
>>>>>
>>>>> Of course, in order to interpret the line numbers, we'll need a
>>>>> pointer to the git repo of your kernel sources and the git commit ID
>>>>> you were using that presumably corresponds to 5.10.184-175.731.amzn2.x86_64.
>>>>>
>>>>> The stack trace for which I am particularly interested is the one for
>>>>> the jbd2/md0-8 task, e.g.:
>>>> Thanks for checking Ted.
>>>>
>>>> We don't have fast_commit feature enabled. So it should correspond to this
>>>> line:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jbd2/commit.c?h=linux-5.10.y#n496
>>>>
>>>>>>          Not tainted 5.10.184-175.731.amzn2.x86_64 #1
>>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> task:jbd2/md0-8      state:D stack:    0 pid: 8068 ppid:     2
>>>>>> flags:0x00004080
>>>>>> Call Trace:
>>>>>> __schedule+0x1f9/0x660
>>>>>>     schedule+0x46/0xb0
>>>>>>     jbd2_journal_commit_transaction+0x35d/0x1880 [jbd2]  <--------- line #?
>>>>>>     ? update_load_avg+0x7a/0x5d0
>>>>>>     ? add_wait_queue_exclusive+0x70/0x70
>>>>>>     ? lock_timer_base+0x61/0x80
>>>>>>     ? kjournald2+0xcf/0x360 [jbd2]
>>>>>>     kjournald2+0xcf/0x360 [jbd2]
>>>>> Most of the other stack traces you referenced are tasks that are waiting
>>>>> for the transaction commit to complete so they can proceed with some
>>>>> file system operation.  The stack traces which have
>>>>> start_this_handle() in them are examples of this going on.  Stack
>>>>> traces of tasks that do *not* have start_this_handle() would be
>>>>> especially interesting.
>>>> I see all other stacks apart from kjournald have "start_this_handle".
>>> That would be strange. Can you post full output of "echo w
>>>> /proc/sysrq-trigger" to dmesg, ideally passed through scripts/faddr2line as
>>> Ted suggests. Thanks!
>> Sure i'll try to collect that. The system freezes when such a situation
>> happens and i'm not able
>> to collect much information. I'll try to crash the kernel and collect kdump
>> and see if i can get that info.
> Thanks!

I collected a dump and looked at some processes that were stuck in
uninterruptible sleep. These are from the upstream stable tree:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-5.10.y
(5.10.191)

One of them is the journal thread, which is waiting for some other thread
to close its transaction handle.

PID: 10642  TASK: ffff9768823f4000  CPU: 37  COMMAND: "jbd2/md0-8"
  #0 [ffffbd6c40c17c60] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c40c17cf8] schedule+60 at ffffffffbb91330c
  #2 [ffffbd6c40c17d08] jbd2_journal_commit_transaction+877 at 
ffffffffc016b90d [jbd2] (/home/ec2-user/linux/fs/jbd2/commit.c:497)
  #3 [ffffbd6c40c17ea0] kjournald2+282 at ffffffffc01723ba [jbd2] 
(/home/ec2-user/linux/fs/jbd2/journal.c:214)
  #4 [ffffbd6c40c17f10] kthread+279 at ffffffffbb0b9167
  #5 [ffffbd6c40c17f50] ret_from_fork+34 at ffffffffbb003802

Here is one of the threads that has started a handle and is waiting for
the journal to commit and unlock the current transaction. This stack only
shows ext4lazyinit, but with lazyinit disabled we have seen other threads
stuck in the same place.

PID: 10644  TASK: ffff976901010000  CPU: 37  COMMAND: "ext4lazyinit"
  #0 [ffffbd6c40c1fbe0] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c40c1fc78] schedule+60 at ffffffffbb91330c
  #2 [ffffbd6c40c1fc88] wait_transaction_locked+137 at ffffffffc0168089 
[jbd2] (/home/ec2-user/linux/fs/jbd2/transaction.c:184)
  #3 [ffffbd6c40c1fcd8] add_transaction_credits+62 at ffffffffc016813e 
[jbd2] (/home/ec2-user/linux/fs/jbd2/transaction.c:241)
  #4 [ffffbd6c40c1fd30] start_this_handle+533 at ffffffffc0168615 [jbd2] 
(/home/ec2-user/linux/fs/jbd2/transaction.c:416)
  #5 [ffffbd6c40c1fdc0] jbd2__journal_start+244 at ffffffffc0168dc4 [jbd2]
  #6 [ffffbd6c40c1fe00] __ext4_journal_start_sb+250 at ffffffffc02ef65a 
[ext4]
  #7 [ffffbd6c40c1fe40] ext4_init_inode_table+190 at ffffffffc0302ace [ext4]
  #8 [ffffbd6c40c1feb0] ext4_lazyinit_thread+906 at ffffffffc033ec9a [ext4]
  #9 [ffffbd6c40c1ff10] kthread+279 at ffffffffbb0b9167
#10 [ffffbd6c40c1ff50] ret_from_fork+34 at ffffffffbb003802

To replicate the download scenario I'm just using dd to copy random data
to disk. I launch a bunch of threads and try to stress the system. Many
of those threads seem to be stuck in balance_dirty_pages_ratelimited, as
can be seen below.
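
Roughly what I'm running (the mount point, thread count and file size are
just placeholders):

    # launch a bunch of buffered writers filling the array with random data
    for i in $(seq 1 64); do
        dd if=/dev/urandom of=/mnt/md0/fill.$i bs=1M count=65536 &
    done
    wait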

PID: 10709  TASK: ffff9769016f8000  CPU: 25  COMMAND: "dd"
  #0 [ffffbd6c40dafa48] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c40dafae0] schedule+60 at ffffffffbb91330c
  #2 [ffffbd6c40dafaf0] schedule_timeout+570 at ffffffffbb916a7a
  #3 [ffffbd6c40dafb68] io_schedule_timeout+25 at ffffffffbb913619 
((inlined by) io_schedule_finish at 
/home/ec2-user/linux/kernel/sched/core.c:6274)
  #4 [ffffbd6c40dafb80] balance_dirty_pages+654 at ffffffffbb2367ce  
(/home/ec2-user/linux/mm/page-writeback.c:1799)
  #5 [ffffbd6c40dafcf0] balance_dirty_pages_ratelimited+763 at 
ffffffffbb23752b  (/home/ec2-user/linux/mm/page-writeback.c:1926)
  #6 [ffffbd6c40dafd18] generic_perform_write+308 at ffffffffbb22af44 
(/home/ec2-user/linux/mm/filemap.c:3370)
  #7 [ffffbd6c40dafd88] ext4_buffered_write_iter+161 at ffffffffc02fcba1 
[ext4] (/home/ec2-user/linux/fs/ext4/file.c:273)
  #8 [ffffbd6c40dafdb8] ext4_file_write_iter+96 at ffffffffc02fccf0 [ext4]
  #9 [ffffbd6c40dafe40] new_sync_write+287 at ffffffffbb2e0c0f
#10 [ffffbd6c40dafec8] vfs_write+481 at ffffffffbb2e3161
#11 [ffffbd6c40daff00] ksys_write+165 at ffffffffbb2e3385
#12 [ffffbd6c40daff40] do_syscall_64+51 at ffffffffbb906213
#13 [ffffbd6c40daff50] entry_SYSCALL_64_after_hwframe+103 at 
ffffffffbba000df

There are other dd threads that are trying to read and are handling page
faults. These are in the runnable state, not uninterruptible sleep.

PID: 14581  TASK: ffff97c3cfdbc000  CPU: 29  COMMAND: "dd"
  #0 [ffffbd6c4a1d3598] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c4a1d3630] _cond_resched+38 at ffffffffbb9133e6
  #2 [ffffbd6c4a1d3638] shrink_page_list+126 at ffffffffbb2412fe
  #3 [ffffbd6c4a1d36c8] shrink_inactive_list+478 at ffffffffbb24441e
  #4 [ffffbd6c4a1d3768] shrink_lruvec+957 at ffffffffbb244e3d
  #5 [ffffbd6c4a1d3870] shrink_node+552 at ffffffffbb2452a8
  #6 [ffffbd6c4a1d38f0] do_try_to_free_pages+201 at ffffffffbb245829
  #7 [ffffbd6c4a1d3940] try_to_free_pages+239 at ffffffffbb246c0f
  #8 [ffffbd6c4a1d39d8] __alloc_pages_slowpath.constprop.114+913 at 
ffffffffbb28d741
  #9 [ffffbd6c4a1d3ab8] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
#10 [ffffbd6c4a1d3b28] alloc_pages_vma+124 at ffffffffbb2a734c
#11 [ffffbd6c4a1d3b68] handle_mm_fault+3999 at ffffffffbb26de2f
#12 [ffffbd6c4a1d3c28] exc_page_fault+708 at ffffffffbb909c84
#13 [ffffbd6c4a1d3c80] asm_exc_page_fault+30 at ffffffffbba00b4e
  #14 [ffffbd6c4a1d3d30] copyout+28 at ffffffffbb5160bc
#15 [ffffbd6c4a1d3d38] _copy_to_iter+158 at ffffffffbb5188de
#16 [ffffbd6c4a1d3d98] get_random_bytes_user+136 at ffffffffbb644608
#17 [ffffbd6c4a1d3e48] new_sync_read+284 at ffffffffbb2e0a5c
#18 [ffffbd6c4a1d3ed0] vfs_read+353 at ffffffffbb2e2f51
#19 [ffffbd6c4a1d3f00] ksys_read+165 at ffffffffbb2e3265
#20 [ffffbd6c4a1d3f40] do_syscall_64+51 at ffffffffbb906213
#21 [ffffbd6c4a1d3f50] entry_SYSCALL_64_after_hwframe+103 at 
ffffffffbba000df

>
>> Can low available memory be a reason for a thread to not be able to close
>> the transaction handle for a long time?
>> Maybe some writeback thread starts the handle but is not able to complete
>> writeback?
> Well, even that would be a bug but low memory conditions are certainly some
> of less tested paths so it is possible there's a bug lurking there.
Amongst the things we have tested, two seem to give good improvements.

One is disabling journaling. We don't see any stuck tasks; the system
becomes slow but eventually recovers. But it's not something we want to
disable.

The other is enabling swap. Adding some swap memory also keeps the system
from going into a low-memory state, and the system doesn't freeze.

>
>                                                                  Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR


* Re: Tasks stuck jbd2 for a long time
  2023-08-16 22:53           ` Bhatnagar, Rishabh
@ 2023-08-17 10:49             ` Jan Kara
  2023-08-17 18:59               ` Bhatnagar, Rishabh
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2023-08-17 10:49 UTC (permalink / raw)
  To: Bhatnagar, Rishabh
  Cc: Jan Kara, Theodore Ts'o, jack, linux-ext4, linux-kernel,
	gregkh, Park, SeongJae

On Wed 16-08-23 15:53:05, Bhatnagar, Rishabh wrote:
> I collected dump and looked at some processes that were stuck in
> uninterruptible sleep.These are from upstream stable tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-5.10.y
> (5.10.191)
> 
> One of them is the journal thread that is waiting for some other thread to
> close transaction handle.
> 
> PID: 10642  TASK: ffff9768823f4000  CPU: 37  COMMAND: "jbd2/md0-8"
>  #0 [ffffbd6c40c17c60] __schedule+617 at ffffffffbb912df9
>  #1 [ffffbd6c40c17cf8] schedule+60 at ffffffffbb91330c
>  #2 [ffffbd6c40c17d08] jbd2_journal_commit_transaction+877 at
> ffffffffc016b90d [jbd2] (/home/ec2-user/linux/fs/jbd2/commit.c:497)
>  #3 [ffffbd6c40c17ea0] kjournald2+282 at ffffffffc01723ba [jbd2]
> (/home/ec2-user/linux/fs/jbd2/journal.c:214)
>  #4 [ffffbd6c40c17f10] kthread+279 at ffffffffbb0b9167
>  #5 [ffffbd6c40c17f50] ret_from_fork+34 at ffffffffbb003802

Yes, correct. This is waiting for transaction->t_updates to drop to 0.

> One of threads that have started the handle and waiting for journal to
> commit and unlock the current transaction. This stack only shows
> ext4lazyinit but with lazyinit disabled we have seen other threads stuck in
> same place.
> 
> PID: 10644  TASK: ffff976901010000  CPU: 37  COMMAND: "ext4lazyinit"
>  #0 [ffffbd6c40c1fbe0] __schedule+617 at ffffffffbb912df9
>  #1 [ffffbd6c40c1fc78] schedule+60 at ffffffffbb91330c
>  #2 [ffffbd6c40c1fc88] wait_transaction_locked+137 at ffffffffc0168089
> [jbd2] (/home/ec2-user/linux/fs/jbd2/transaction.c:184)
>  #3 [ffffbd6c40c1fcd8] add_transaction_credits+62 at ffffffffc016813e [jbd2]
> (/home/ec2-user/linux/fs/jbd2/transaction.c:241)
>  #4 [ffffbd6c40c1fd30] start_this_handle+533 at ffffffffc0168615 [jbd2]
> (/home/ec2-user/linux/fs/jbd2/transaction.c:416)
>  #5 [ffffbd6c40c1fdc0] jbd2__journal_start+244 at ffffffffc0168dc4 [jbd2]
>  #6 [ffffbd6c40c1fe00] __ext4_journal_start_sb+250 at ffffffffc02ef65a
> [ext4]
>  #7 [ffffbd6c40c1fe40] ext4_init_inode_table+190 at ffffffffc0302ace [ext4]
>  #8 [ffffbd6c40c1feb0] ext4_lazyinit_thread+906 at ffffffffc033ec9a [ext4]
>  #9 [ffffbd6c40c1ff10] kthread+279 at ffffffffbb0b9167
> #10 [ffffbd6c40c1ff50] ret_from_fork+34 at ffffffffbb003802

This thread actually didn't start a transaction. It is *trying* to start a
transaction but it has failed and we are now waiting for transaction commit
to proceed (i.e., for jbd2/md0-8 process). So this isn't the process jbd2
is waiting for.

> To replicate the download scenario i'm just using dd to copy random data to
> disk. I launch a bunch of threads and try to stress the system. Many of
> those threads seem to be stuck in balance_dirty_pages_ratelimited as can be
> seen below.
> 
> PID: 10709  TASK: ffff9769016f8000  CPU: 25  COMMAND: "dd"
>  #0 [ffffbd6c40dafa48] __schedule+617 at ffffffffbb912df9
>  #1 [ffffbd6c40dafae0] schedule+60 at ffffffffbb91330c
>  #2 [ffffbd6c40dafaf0] schedule_timeout+570 at ffffffffbb916a7a
>  #3 [ffffbd6c40dafb68] io_schedule_timeout+25 at ffffffffbb913619 ((inlined
> by) io_schedule_finish at /home/ec2-user/linux/kernel/sched/core.c:6274)
>  #4 [ffffbd6c40dafb80] balance_dirty_pages+654 at ffffffffbb2367ce 
> (/home/ec2-user/linux/mm/page-writeback.c:1799)
>  #5 [ffffbd6c40dafcf0] balance_dirty_pages_ratelimited+763 at
> ffffffffbb23752b  (/home/ec2-user/linux/mm/page-writeback.c:1926)
>  #6 [ffffbd6c40dafd18] generic_perform_write+308 at ffffffffbb22af44
> (/home/ec2-user/linux/mm/filemap.c:3370)
>  #7 [ffffbd6c40dafd88] ext4_buffered_write_iter+161 at ffffffffc02fcba1
> [ext4] (/home/ec2-user/linux/fs/ext4/file.c:273)
>  #8 [ffffbd6c40dafdb8] ext4_file_write_iter+96 at ffffffffc02fccf0 [ext4]
>  #9 [ffffbd6c40dafe40] new_sync_write+287 at ffffffffbb2e0c0f
> #10 [ffffbd6c40dafec8] vfs_write+481 at ffffffffbb2e3161
> #11 [ffffbd6c40daff00] ksys_write+165 at ffffffffbb2e3385
> #12 [ffffbd6c40daff40] do_syscall_64+51 at ffffffffbb906213
> #13 [ffffbd6c40daff50] entry_SYSCALL_64_after_hwframe+103 at
> ffffffffbba000df

Yes, this is waiting for page writeback to reduce the amount of dirty pages
in the pagecache. We are not holding a transaction handle during this wait,
so this is also not the task jbd2 is waiting for.

> There are other dd threads that are trying to read and are handling page
> fault. These are in runnable state and not uninterruptible sleep.
> 
> PID: 14581  TASK: ffff97c3cfdbc000  CPU: 29  COMMAND: "dd"
>  #0 [ffffbd6c4a1d3598] __schedule+617 at ffffffffbb912df9
>  #1 [ffffbd6c4a1d3630] _cond_resched+38 at ffffffffbb9133e6
>  #2 [ffffbd6c4a1d3638] shrink_page_list+126 at ffffffffbb2412fe
>  #3 [ffffbd6c4a1d36c8] shrink_inactive_list+478 at ffffffffbb24441e
>  #4 [ffffbd6c4a1d3768] shrink_lruvec+957 at ffffffffbb244e3d
>  #5 [ffffbd6c4a1d3870] shrink_node+552 at ffffffffbb2452a8
>  #6 [ffffbd6c4a1d38f0] do_try_to_free_pages+201 at ffffffffbb245829
>  #7 [ffffbd6c4a1d3940] try_to_free_pages+239 at ffffffffbb246c0f
>  #8 [ffffbd6c4a1d39d8] __alloc_pages_slowpath.constprop.114+913 at
> ffffffffbb28d741
>  #9 [ffffbd6c4a1d3ab8] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
> #10 [ffffbd6c4a1d3b28] alloc_pages_vma+124 at ffffffffbb2a734c
> #11 [ffffbd6c4a1d3b68] handle_mm_fault+3999 at ffffffffbb26de2f
> #12 [ffffbd6c4a1d3c28] exc_page_fault+708 at ffffffffbb909c84
> #13 [ffffbd6c4a1d3c80] asm_exc_page_fault+30 at ffffffffbba00b4e
>  #14 [ffffbd6c4a1d3d30] copyout+28 at ffffffffbb5160bc
> #15 [ffffbd6c4a1d3d38] _copy_to_iter+158 at ffffffffbb5188de
> #16 [ffffbd6c4a1d3d98] get_random_bytes_user+136 at ffffffffbb644608
> #17 [ffffbd6c4a1d3e48] new_sync_read+284 at ffffffffbb2e0a5c
> #18 [ffffbd6c4a1d3ed0] vfs_read+353 at ffffffffbb2e2f51
> #19 [ffffbd6c4a1d3f00] ksys_read+165 at ffffffffbb2e3265
> #20 [ffffbd6c4a1d3f40] do_syscall_64+51 at ffffffffbb906213
> #21 [ffffbd6c4a1d3f50] entry_SYSCALL_64_after_hwframe+103 at
> ffffffffbba000df

This process is in direct reclaim trying to free more memory. It doesn't
have transaction handle started so jbd2 also isn't waiting for this
process.

> > > Can low available memory be a reason for a thread to not be able to close
> > > the transaction handle for a long time?
> > > Maybe some writeback thread starts the handle but is not able to complete
> > > writeback?
> > Well, even that would be a bug but low memory conditions are certainly some
> > of less tested paths so it is possible there's a bug lurking there.
> Amongst the things we have tested 2 things seem to give good improvements.
> 
> One is disabling journalling. We don't see any stuck tasks. System becomes
> slow but eventually recovers. But its not something we want to disable.
> 
> Other is enabling swap memory. Adding some swap memory also avoids system
> going into low memory state and system doesn't freeze.

OK, these are just workarounds. The question really is which process holds
the transaction handle the jbd2 thread is waiting for. It is none of the
processes you have shown above. Since you have the crashdump, you can also
search all the processes and find those which have a non-zero
task->journal_info. From these processes you can select those where
task->journal_info points to an object from jbd2_handle_cache, and then you
can verify whether the handles indeed point (through handle->h_transaction)
to the transaction the jbd2 thread is trying to commit. After you've
identified such a task it is interesting to see what it is doing...
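
Roughly, in the crash utility that would be something like this (exact
syntax may differ between crash versions; the values in <> are placeholders
you take from the previous step):

    crash> foreach task -R journal_info        # ->journal_info for every task
    crash> kmem <journal_info>                 # confirm it's from the jbd2 handle slab
    crash> struct jbd2_journal_handle <journal_info>   # gives h_transaction
    crash> struct transaction_s.t_updates <h_transaction>
    crash> bt <pid of the task owning that handle>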

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Tasks stuck jbd2 for a long time
  2023-08-17 10:49             ` Jan Kara
@ 2023-08-17 18:59               ` Bhatnagar, Rishabh
  2023-08-18  1:19                 ` Theodore Ts'o
  2023-08-18  1:31                 ` Lu, Davina
  0 siblings, 2 replies; 15+ messages in thread
From: Bhatnagar, Rishabh @ 2023-08-17 18:59 UTC (permalink / raw)
  To: Jan Kara
  Cc: Theodore Ts'o, jack, linux-ext4, linux-kernel, gregkh, Park,
	SeongJae


On 8/17/23 3:49 AM, Jan Kara wrote:
>
> On Wed 16-08-23 15:53:05, Bhatnagar, Rishabh wrote:
>> I collected dump and looked at some processes that were stuck in
>> uninterruptible sleep.These are from upstream stable tree:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-5.10.y
>> (5.10.191)
>>
>> One of them is the journal thread that is waiting for some other thread to
>> close transaction handle.
>>
>> PID: 10642  TASK: ffff9768823f4000  CPU: 37  COMMAND: "jbd2/md0-8"
>>   #0 [ffffbd6c40c17c60] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40c17cf8] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40c17d08] jbd2_journal_commit_transaction+877 at
>> ffffffffc016b90d [jbd2] (/home/ec2-user/linux/fs/jbd2/commit.c:497)
>>   #3 [ffffbd6c40c17ea0] kjournald2+282 at ffffffffc01723ba [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/journal.c:214)
>>   #4 [ffffbd6c40c17f10] kthread+279 at ffffffffbb0b9167
>>   #5 [ffffbd6c40c17f50] ret_from_fork+34 at ffffffffbb003802
> Yes, correct. This is waiting for transaction->t_updates to drop to 0.
>
>> One of threads that have started the handle and waiting for journal to
>> commit and unlock the current transaction. This stack only shows
>> ext4lazyinit but with lazyinit disabled we have seen other threads stuck in
>> same place.
>>
>> PID: 10644  TASK: ffff976901010000  CPU: 37  COMMAND: "ext4lazyinit"
>>   #0 [ffffbd6c40c1fbe0] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40c1fc78] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40c1fc88] wait_transaction_locked+137 at ffffffffc0168089
>> [jbd2] (/home/ec2-user/linux/fs/jbd2/transaction.c:184)
>>   #3 [ffffbd6c40c1fcd8] add_transaction_credits+62 at ffffffffc016813e [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/transaction.c:241)
>>   #4 [ffffbd6c40c1fd30] start_this_handle+533 at ffffffffc0168615 [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/transaction.c:416)
>>   #5 [ffffbd6c40c1fdc0] jbd2__journal_start+244 at ffffffffc0168dc4 [jbd2]
>>   #6 [ffffbd6c40c1fe00] __ext4_journal_start_sb+250 at ffffffffc02ef65a
>> [ext4]
>>   #7 [ffffbd6c40c1fe40] ext4_init_inode_table+190 at ffffffffc0302ace [ext4]
>>   #8 [ffffbd6c40c1feb0] ext4_lazyinit_thread+906 at ffffffffc033ec9a [ext4]
>>   #9 [ffffbd6c40c1ff10] kthread+279 at ffffffffbb0b9167
>> #10 [ffffbd6c40c1ff50] ret_from_fork+34 at ffffffffbb003802
> This thread actually didn't start a transaction. It is *trying* to start a
> transaction but it has failed and we are now waiting for transaction commit
> to proceed (i.e., for jbd2/md0-8 process). So this isn't the process jbd2
> is waiting for.
>
>> To replicate the download scenario i'm just using dd to copy random data to
>> disk. I launch a bunch of threads and try to stress the system. Many of
>> those threads seem to be stuck in balance_dirty_pages_ratelimited as can be
>> seen below.
>>
>> PID: 10709  TASK: ffff9769016f8000  CPU: 25  COMMAND: "dd"
>>   #0 [ffffbd6c40dafa48] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40dafae0] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40dafaf0] schedule_timeout+570 at ffffffffbb916a7a
>>   #3 [ffffbd6c40dafb68] io_schedule_timeout+25 at ffffffffbb913619 ((inlined
>> by) io_schedule_finish at /home/ec2-user/linux/kernel/sched/core.c:6274)
>>   #4 [ffffbd6c40dafb80] balance_dirty_pages+654 at ffffffffbb2367ce
>> (/home/ec2-user/linux/mm/page-writeback.c:1799)
>>   #5 [ffffbd6c40dafcf0] balance_dirty_pages_ratelimited+763 at
>> ffffffffbb23752b  (/home/ec2-user/linux/mm/page-writeback.c:1926)
>>   #6 [ffffbd6c40dafd18] generic_perform_write+308 at ffffffffbb22af44
>> (/home/ec2-user/linux/mm/filemap.c:3370)
>>   #7 [ffffbd6c40dafd88] ext4_buffered_write_iter+161 at ffffffffc02fcba1
>> [ext4] (/home/ec2-user/linux/fs/ext4/file.c:273)
>>   #8 [ffffbd6c40dafdb8] ext4_file_write_iter+96 at ffffffffc02fccf0 [ext4]
>>   #9 [ffffbd6c40dafe40] new_sync_write+287 at ffffffffbb2e0c0f
>> #10 [ffffbd6c40dafec8] vfs_write+481 at ffffffffbb2e3161
>> #11 [ffffbd6c40daff00] ksys_write+165 at ffffffffbb2e3385
>> #12 [ffffbd6c40daff40] do_syscall_64+51 at ffffffffbb906213
>> #13 [ffffbd6c40daff50] entry_SYSCALL_64_after_hwframe+103 at
>> ffffffffbba000df
> Yes, this is waiting for page writeback to reduce amount of dirty pages in
> the pagecache. We are not holding transaction handle during this wait so
> this is also not the task jbd2 is waiting for.
>
>> There are other dd threads that are trying to read and are handling page
>> fault. These are in runnable state and not uninterruptible sleep.
>>
>> PID: 14581  TASK: ffff97c3cfdbc000  CPU: 29  COMMAND: "dd"
>>   #0 [ffffbd6c4a1d3598] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c4a1d3630] _cond_resched+38 at ffffffffbb9133e6
>>   #2 [ffffbd6c4a1d3638] shrink_page_list+126 at ffffffffbb2412fe
>>   #3 [ffffbd6c4a1d36c8] shrink_inactive_list+478 at ffffffffbb24441e
>>   #4 [ffffbd6c4a1d3768] shrink_lruvec+957 at ffffffffbb244e3d
>>   #5 [ffffbd6c4a1d3870] shrink_node+552 at ffffffffbb2452a8
>>   #6 [ffffbd6c4a1d38f0] do_try_to_free_pages+201 at ffffffffbb245829
>>   #7 [ffffbd6c4a1d3940] try_to_free_pages+239 at ffffffffbb246c0f
>>   #8 [ffffbd6c4a1d39d8] __alloc_pages_slowpath.constprop.114+913 at
>> ffffffffbb28d741
>>   #9 [ffffbd6c4a1d3ab8] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
>> #10 [ffffbd6c4a1d3b28] alloc_pages_vma+124 at ffffffffbb2a734c
>> #11 [ffffbd6c4a1d3b68] handle_mm_fault+3999 at ffffffffbb26de2f
>> #12 [ffffbd6c4a1d3c28] exc_page_fault+708 at ffffffffbb909c84
>> #13 [ffffbd6c4a1d3c80] asm_exc_page_fault+30 at ffffffffbba00b4e
>>   #14 [ffffbd6c4a1d3d30] copyout+28 at ffffffffbb5160bc
>> #15 [ffffbd6c4a1d3d38] _copy_to_iter+158 at ffffffffbb5188de
>> #16 [ffffbd6c4a1d3d98] get_random_bytes_user+136 at ffffffffbb644608
>> #17 [ffffbd6c4a1d3e48] new_sync_read+284 at ffffffffbb2e0a5c
>> #18 [ffffbd6c4a1d3ed0] vfs_read+353 at ffffffffbb2e2f51
>> #19 [ffffbd6c4a1d3f00] ksys_read+165 at ffffffffbb2e3265
>> #20 [ffffbd6c4a1d3f40] do_syscall_64+51 at ffffffffbb906213
>> #21 [ffffbd6c4a1d3f50] entry_SYSCALL_64_after_hwframe+103 at
>> ffffffffbba000df
> This process is in direct reclaim trying to free more memory. It doesn't
> have transaction handle started so jbd2 also isn't waiting for this
> process.
>
>>>> Can low available memory be a reason for a thread to not be able to close
>>>> the transaction handle for a long time?
>>>> Maybe some writeback thread starts the handle but is not able to complete
>>>> writeback?
>>> Well, even that would be a bug but low memory conditions are certainly some
>>> of less tested paths so it is possible there's a bug lurking there.
>> Amongst the things we have tested 2 things seem to give good improvements.
>>
>> One is disabling journalling. We don't see any stuck tasks. System becomes
>> slow but eventually recovers. But its not something we want to disable.
>>
>> Other is enabling swap memory. Adding some swap memory also avoids system
>> going into low memory state and system doesn't freeze.
> OK, these are just workarounds. The question really is which process holds
> the transaction handle jbd2 thread is waiting for. It is none of the
> processes you have shown above. Since you have the crashdump, you can also
> search all the processes and find those which have non-zero
> task->journal_info. And from these processes you can select those where
> task->journal_info points to an object from jbd2_handle_cache and then you
> can verify whether the handles indeed point (through handle->h_transaction)
> to the transaction jbd2 thread is trying to commit. After you've identified
> such task it is interesting to see what is it doing...
>
Hi Jan

I think I found the thread that is holding the transaction handle. It
seems to be in a runnable state, though.
Its journal_info is set to a journal handle whose transaction matches the
journal's running transaction.
Here is the associated stack trace. It is converting unwritten extents
to written extents.

PID: 287    TASK: ffff976801890000  CPU: 20  COMMAND: "kworker/u96:35"
  #0 [ffffbd6c40b3f498] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c40b3f530] _cond_resched+38 at ffffffffbb9133e6
  #2 [ffffbd6c40b3f538] shrink_lruvec+670 at ffffffffbb244d1e
  #3 [ffffbd6c40b3f640] _cond_resched+21 at ffffffffbb9133d5
  #4 [ffffbd6c40b3f648] shrink_node+552 at ffffffffbb2452a8
  #5 [ffffbd6c40b3f6c8] do_try_to_free_pages+201 at ffffffffbb245829
  #6 [ffffbd6c40b3f718] try_to_free_pages+239 at ffffffffbb246c0f
  #7 [ffffbd6c40b3f7b0] __alloc_pages_slowpath.constprop.114+913 at 
ffffffffbb28d741
  #8 [ffffbd6c40b3f890] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
  #9 [ffffbd6c40b3f900] allocate_slab+726 at ffffffffbb2b0886
#10 [ffffbd6c40b3f958] ___slab_alloc+1173 at ffffffffbb2b3ff5
#11 [ffffbd6c40b3f988] insert_revoke_hash+37 at ffffffffc016f435 [jbd2]  
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#12 [ffffbd6c40b3f9b8] kmem_cache_free+924 at ffffffffbb2b712c ((inlined
by) slab_alloc at /home/ec2-user/linux/mm/slub.c:2904)
#13 [ffffbd6c40b3fa18] insert_revoke_hash+37 at ffffffffc016f435 [jbd2] 
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#14 [ffffbd6c40b3fa40] kmem_cache_alloc+928 at ffffffffbb2b4590 
(/home/ec2-user/linux/mm/slub.c:290)
#15 [ffffbd6c40b3fa78] insert_revoke_hash+37 at ffffffffc016f435 [jbd2] 
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#16 [ffffbd6c40b3faa0] __ext4_forget+338 at ffffffffc02efb32 [ext4]  
(/home/ec2-user/linux/fs/ext4/ext4_jbd2.c:298)
#17 [ffffbd6c40b3fae0] ext4_free_blocks+2437 at ffffffffc031fd55 [ext4] 
(/home/ec2-user/linux/fs/ext4/mballoc.c:5709 (discriminator 2))
#18 [ffffbd6c40b3fbb0] ext4_ext_handle_unwritten_extents+596 at 
ffffffffc02f56a4 [ext4] ((inlined by) ext4_ext_handle_unwritten_extents 
at /home/ec2-user/linux/fs/ext4/extents.c:3892)
#19 [ffffbd6c40b3fc98] ext4_ext_map_blocks+1325 at ffffffffc02f710d 
[ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4165)
#20 [ffffbd6c40b3fd60] ext4_map_blocks+813 at ffffffffc030bd0d [ext4] 
(/home/ec2-user/linux/fs/ext4/inode.c:659)
#21 [ffffbd6c40b3fdd0] ext4_convert_unwritten_extents+303 at 
ffffffffc02f8adf [ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4810)
#22 [ffffbd6c40b3fe28] ext4_convert_unwritten_io_end_vec+95 at 
ffffffffc02f8c5f [ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4850)
#23 [ffffbd6c40b3fe58] ext4_end_io_rsv_work+269 at ffffffffc032c3fd 
[ext4] ((inlined by) ext4_do_flush_completed_IO at 
/home/ec2-user/linux/fs/ext4/page-io.c:262)
#24 [ffffbd6c40b3fe98] process_one_work+405 at ffffffffbb0b2725
#25 [ffffbd6c40b3fed8] worker_thread+48 at ffffffffbb0b2920
#26 [ffffbd6c40b3ff10] kthread+279 at ffffffffbb0b9167
#27 [ffffbd6c40b3ff50] ret_from_fork+34 at ffffffffbb003802

Thanks
Rishabh



* Re: Tasks stuck jbd2 for a long time
  2023-08-17 18:59               ` Bhatnagar, Rishabh
@ 2023-08-18  1:19                 ` Theodore Ts'o
  2023-08-18  1:31                 ` Lu, Davina
  1 sibling, 0 replies; 15+ messages in thread
From: Theodore Ts'o @ 2023-08-18  1:19 UTC (permalink / raw)
  To: Bhatnagar, Rishabh
  Cc: Jan Kara, jack, linux-ext4, linux-kernel, gregkh, Park, SeongJae

On Thu, Aug 17, 2023 at 11:59:03AM -0700, Bhatnagar, Rishabh wrote:
> 
> I think I found the thread that is holding the transaction handle. It seems
> to be in runnable state though.

This looks like it's a case of livelock...

> It has the journal_info set to the journal handle that has the matching
> transaction as the journal's running transaction.
> Here is the associated stack trace. It is converting unwritten extents to
> extents.
> 
> PID: 287    TASK: ffff976801890000  CPU: 20  COMMAND: "kworker/u96:35"
>  #0 [ffffbd6c40b3f498] __schedule+617 at ffffffffbb912df9
>  #1 [ffffbd6c40b3f530] _cond_resched+38 at ffffffffbb9133e6
>  #2 [ffffbd6c40b3f538] shrink_lruvec+670 at ffffffffbb244d1e
>  #3 [ffffbd6c40b3f640] _cond_resched+21 at ffffffffbb9133d5
>  #4 [ffffbd6c40b3f648] shrink_node+552 at ffffffffbb2452a8
>  #5 [ffffbd6c40b3f6c8] do_try_to_free_pages+201 at ffffffffbb245829
>  #6 [ffffbd6c40b3f718] try_to_free_pages+239 at ffffffffbb246c0f
>  #7 [ffffbd6c40b3f7b0] __alloc_pages_slowpath.constprop.114+913 at
> ffffffffbb28d741
>  #8 [ffffbd6c40b3f890] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
>  #9 [ffffbd6c40b3f900] allocate_slab+726 at ffffffffbb2b0886
> #10 [ffffbd6c40b3f958] ___slab_alloc+1173 at ffffffffbb2b3ff5
> #11 [ffffbd6c40b3f988] insert_revoke_hash+37 at ffffffffc016f435 [jbd2] 
> (/home/ec2-user/linux/fs/jbd2/revoke.c:146)

insert_revoke_hash is trying to do a memory allocation of a 48 byte
structure using kmem_cache_alloc() with the __GFP_NOFAIL bit set.  (We
use GFP_NOFAIL because if the memory allocation fails, the only
recourse we can have is to shut down the journal and force the file
system to be read-only --- or crash the system, of course.)

Since we have set __GFP_NOFAIL, the memory allocator is apparently not
able to find even a single free page for the slab allocator, and so
it's apparently trying and trying to free memory --- and failing
miserably.

Hmm... something that might be worth trying is to see what happens if you
run the job in a memcg, since how the kernel handles OOM, and how it will
handle OOM kills, will differ depending on whether it is getting constrained
by container memory or by completely running out of memory.
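
For example (the limit and script name are only placeholders), with
cgroup v2 and systemd something like:

    systemd-run --scope -p MemoryMax=32G -p MemorySwapMax=0 ./download-workload.sh

would charge the workload's page cache and anonymous memory to the memcg,
so memcg reclaim and the memcg OOM killer kick in before the whole machine
runs out of memory.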

I wonder why this version of the 5.10 kernel isn't solving the problem
by performing an OOM kill to free memory.  We're running a 5.10-based
kernel in our data centers at $WORK, and normally the OOM killer is
quite free to make memory available by killing as necessary to deal with
these situations.  (As Spock once said, "The needs of the many
outweigh the needs of the few --- or the one."  And sometimes the
best way to keep the system running is to sacrifice one of the
userspace processes.)  Do you by any chance have all or most of the
user processes exempted from the OOM killer?
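
A quick way to check (the process name is just an example) is to look at
oom_score_adj; -1000 means the task is exempt from OOM kills:

    for pid in $(pgrep kvstore-leaf); do
        printf '%s %s\n' "$pid" "$(cat /proc/$pid/oom_score_adj)"
    done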

> #16 [ffffbd6c40b3faa0] __ext4_forget+338 at ffffffffc02efb32 [ext4] 
> (/home/ec2-user/linux/fs/ext4/ext4_jbd2.c:298)
> #17 [ffffbd6c40b3fae0] ext4_free_blocks+2437 at ffffffffc031fd55 [ext4]
> (/home/ec2-user/linux/fs/ext4/mballoc.c:5709 (discriminator 2))
> #18 [ffffbd6c40b3fbb0] ext4_ext_handle_unwritten_extents+596 at
> ffffffffc02f56a4 [ext4] ((inlined by) ext4_ext_handle_unwritten_extents at
> /home/ec2-user/linux/fs/ext4/extents.c:3892)

This part of the stack trace is weird.  I don't see what callees of
ext4_ext_handle_unwritten_extents() would result in ext4_free_blocks()
getting called.  Unfortunately, the code is very heavily inlined, and
we only see the first level of inlining.  My best guess is that it was
in some error handling code, such as this:

	if (err) {
		/* free all allocated blocks in error case */
		for (i = 0; i < depth; i++) {
			if (!ablocks[i])
				continue;
			ext4_free_blocks(handle, inode, NULL, ablocks[i], 1,
					 EXT4_FREE_BLOCKS_METADATA);
		}
	}

... and this call to ext4_free_blocks() resulted in the call to
__ext4_forget(), which in turn tried to create a journal revoke record.

And the error itself may very well have been caused by some other failed
memory allocation, if the system was so desperately low on memory.


Anyway, the big question is why the system was allowed to get so low on
memory in the first place.  In addition to OOM-killing processes, one of
the things that Linux is supposed to do is "write throttling": if a
process is dirtying too many pages, the guilty processes are put to sleep
so that page cleaning has a chance to catch up.

Quoting from section 14.1.5 (Writeback) from [1]:

    As applications write to files, the pagecache becomes dirty and
    the buffercache may become dirty. When the amount of dirty memory
    reaches a specified number of pages in bytes
    (vm.dirty_background_bytes), or when the amount of dirty memory
    reaches a specific ratio to total memory
    (vm.dirty_background_ratio), or when the pages have been dirty for
    longer than a specified amount of time
    (vm.dirty_expire_centisecs), the kernel begins writeback of pages
    starting with files that had the pages dirtied first. The
    background bytes and ratios are mutually exclusive and setting one
    will overwrite the other. Flusher threads perform writeback in the
    background and allow applications to continue running. If the I/O
    cannot keep up with applications dirtying pagecache, and dirty
    data reaches a critical setting (vm.dirty_bytes or
    vm.dirty_ratio), then applications begin to be throttled to
    prevent dirty data exceeding this threshold.

[1] https://documentation.suse.com/sles/15-SP3/html/SLES-all/cha-tuning-memory.html

So it might be worth looking at whether your system has non-default values
for /proc/sys/vm/{dirty_bytes,dirty_ratio,dirty_background_bytes,etc.}.
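
Something like this shows the current settings; the lowered values in the
last two lines are only an illustration of throttling writers earlier, not
a recommendation:

    sysctl vm.dirty_ratio vm.dirty_background_ratio \
           vm.dirty_bytes vm.dirty_background_bytes vm.dirty_expire_centisecs
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=10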

Cheers,

					- Ted


* RE: Tasks stuck jbd2 for a long time
  2023-08-17 18:59               ` Bhatnagar, Rishabh
  2023-08-18  1:19                 ` Theodore Ts'o
@ 2023-08-18  1:31                 ` Lu, Davina
  2023-08-18  2:41                   ` Theodore Ts'o
  1 sibling, 1 reply; 15+ messages in thread
From: Lu, Davina @ 2023-08-18  1:31 UTC (permalink / raw)
  To: Bhatnagar, Rishabh, Jan Kara
  Cc: Theodore Ts'o, jack, linux-ext4, linux-kernel, gregkh, Park,
	SeongJae

Hi Bhatnagar, 

This looks similar to an issue I saw before with an fio test (buffered IO with 100 threads); there the "ext4-rsv-conversion" workqueue also took a lot of CPU and kept journal updates stuck. It was stuck at:
[<0>] do_get_write_access+0x291/0x350 [jbd2]
[<0>] jbd2_journal_get_write_access+0x67/0x90 [jbd2]
[<0>] __ext4_journal_get_write_access+0x44/0x90 [ext4]
[<0>] ext4_reserve_inode_write+0x83/0xc0 [ext4]
[<0>] __ext4_mark_inode_dirty+0x50/0x120 [ext4]
[<0>] ext4_convert_unwritten_extents+0x179/0x220 [ext4]
[<0>] ext4_convert_unwritten_io_end_vec+0x64/0xe0 [ext4]
[<0>] ext4_do_flush_completed_IO.isra.0+0xf5/0x190 [ext4]
[<0>] process_one_work+0x1b0/0x350
[<0>] worker_thread+0x49/0x310
[<0>] kthread+0x11b/0x140
[<0>] ret_from_fork+0x22/0x30

And the lock statistics for journal->j_state_lock showed:
&journal->j_state_lock-W:  waittime-avg  178.75 us, holdtime-avg  4.84 us
&journal->j_state_lock-R:  waittime-avg 1269.72 us, holdtime-avg 11.07 us

Here is a patch; could you check whether it addresses the same issue? This is not the final patch, since Ted raised some concerns about it; I will forward that email to you in a separate thread. I didn't continue with this patch at the time because we thought it might not be the real case in RDS.

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 091db733834e..b3c7544798b8 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5212,7 +5213,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
         * concurrency isn't really necessary.  Limit it to 1.
         */
        EXT4_SB(sb)->rsv_conversion_wq =
-               alloc_workqueue("ext4-rsv-conversion", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+               alloc_workqueue("ext4-rsv-conversion",
+                               WQ_MEM_RECLAIM | WQ_UNBOUND | __WQ_ORDERED,
+                               num_active_cpus() > 1 ? num_active_cpus() : 1);

Thanks
Davina
-----Original Message-----
From: Bhatnagar, Rishabh <risbhat@amazon.com> 
Sent: Friday, August 18, 2023 4:59 AM
To: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>; jack@suse.com; linux-ext4@vger.kernel.org; linux-kernel@vger.kernel.org; gregkh@linuxfoundation.org; Park, SeongJae <sjpark@amazon.com>
Subject: Re: Tasks stuck jbd2 for a long time


On 8/17/23 3:49 AM, Jan Kara wrote:
>
> On Wed 16-08-23 15:53:05, Bhatnagar, Rishabh wrote:
>> I collected dump and looked at some processes that were stuck in 
>> uninterruptible sleep.These are from upstream stable tree:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree
>> /?h=linux-5.10.y
>> (5.10.191)
>>
>> One of them is the journal thread that is waiting for some other 
>> thread to close transaction handle.
>>
>> PID: 10642  TASK: ffff9768823f4000  CPU: 37  COMMAND: "jbd2/md0-8"
>>   #0 [ffffbd6c40c17c60] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40c17cf8] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40c17d08] jbd2_journal_commit_transaction+877 at 
>> ffffffffc016b90d [jbd2] (/home/ec2-user/linux/fs/jbd2/commit.c:497)
>>   #3 [ffffbd6c40c17ea0] kjournald2+282 at ffffffffc01723ba [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/journal.c:214)
>>   #4 [ffffbd6c40c17f10] kthread+279 at ffffffffbb0b9167
>>   #5 [ffffbd6c40c17f50] ret_from_fork+34 at ffffffffbb003802
> Yes, correct. This is waiting for transaction->t_updates to drop to 0.
>
>> One of threads that have started the handle and waiting for journal 
>> to commit and unlock the current transaction. This stack only shows 
>> ext4lazyinit but with lazyinit disabled we have seen other threads 
>> stuck in same place.
>>
>> PID: 10644  TASK: ffff976901010000  CPU: 37  COMMAND: "ext4lazyinit"
>>   #0 [ffffbd6c40c1fbe0] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40c1fc78] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40c1fc88] wait_transaction_locked+137 at 
>> ffffffffc0168089 [jbd2] (/home/ec2-user/linux/fs/jbd2/transaction.c:184)
>>   #3 [ffffbd6c40c1fcd8] add_transaction_credits+62 at 
>> ffffffffc016813e [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/transaction.c:241)
>>   #4 [ffffbd6c40c1fd30] start_this_handle+533 at ffffffffc0168615 
>> [jbd2]
>> (/home/ec2-user/linux/fs/jbd2/transaction.c:416)
>>   #5 [ffffbd6c40c1fdc0] jbd2__journal_start+244 at ffffffffc0168dc4 [jbd2]
>>   #6 [ffffbd6c40c1fe00] __ext4_journal_start_sb+250 at 
>> ffffffffc02ef65a [ext4]
>>   #7 [ffffbd6c40c1fe40] ext4_init_inode_table+190 at ffffffffc0302ace [ext4]
>>   #8 [ffffbd6c40c1feb0] ext4_lazyinit_thread+906 at ffffffffc033ec9a [ext4]
>>   #9 [ffffbd6c40c1ff10] kthread+279 at ffffffffbb0b9167
>> #10 [ffffbd6c40c1ff50] ret_from_fork+34 at ffffffffbb003802
> This thread actually didn't start a transaction. It is *trying* to 
> start a transaction but it has failed and we are now waiting for 
> transaction commit to proceed (i.e., for jbd2/md0-8 process). So this 
> isn't the process jbd2 is waiting for.
>
>> To replicate the download scenario i'm just using dd to copy random 
>> data to disk. I launch a bunch of threads and try to stress the 
>> system. Many of those threads seem to be stuck in 
>> balance_dirty_pages_ratelimited as can be seen below.
>>
>> PID: 10709  TASK: ffff9769016f8000  CPU: 25  COMMAND: "dd"
>>   #0 [ffffbd6c40dafa48] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c40dafae0] schedule+60 at ffffffffbb91330c
>>   #2 [ffffbd6c40dafaf0] schedule_timeout+570 at ffffffffbb916a7a
>>   #3 [ffffbd6c40dafb68] io_schedule_timeout+25 at ffffffffbb913619 
>> ((inlined
>> by) io_schedule_finish at /home/ec2-user/linux/kernel/sched/core.c:6274)
>>   #4 [ffffbd6c40dafb80] balance_dirty_pages+654 at ffffffffbb2367ce
>> (/home/ec2-user/linux/mm/page-writeback.c:1799)
>>   #5 [ffffbd6c40dafcf0] balance_dirty_pages_ratelimited+763 at 
>> ffffffffbb23752b  (/home/ec2-user/linux/mm/page-writeback.c:1926)
>>   #6 [ffffbd6c40dafd18] generic_perform_write+308 at ffffffffbb22af44
>> (/home/ec2-user/linux/mm/filemap.c:3370)
>>   #7 [ffffbd6c40dafd88] ext4_buffered_write_iter+161 at 
>> ffffffffc02fcba1 [ext4] (/home/ec2-user/linux/fs/ext4/file.c:273)
>>   #8 [ffffbd6c40dafdb8] ext4_file_write_iter+96 at ffffffffc02fccf0 [ext4]
>>   #9 [ffffbd6c40dafe40] new_sync_write+287 at ffffffffbb2e0c0f
>> #10 [ffffbd6c40dafec8] vfs_write+481 at ffffffffbb2e3161
>> #11 [ffffbd6c40daff00] ksys_write+165 at ffffffffbb2e3385
>> #12 [ffffbd6c40daff40] do_syscall_64+51 at ffffffffbb906213
>> #13 [ffffbd6c40daff50] entry_SYSCALL_64_after_hwframe+103 at 
>> ffffffffbba000df
> Yes, this is waiting for page writeback to reduce amount of dirty 
> pages in the pagecache. We are not holding transaction handle during 
> this wait so this is also not the task jbd2 is waiting for.
>
>> There are other dd threads that are trying to read and are handling 
>> page fault. These are in runnable state and not uninterruptible sleep.
>>
>> PID: 14581  TASK: ffff97c3cfdbc000  CPU: 29  COMMAND: "dd"
>>   #0 [ffffbd6c4a1d3598] __schedule+617 at ffffffffbb912df9
>>   #1 [ffffbd6c4a1d3630] _cond_resched+38 at ffffffffbb9133e6
>>   #2 [ffffbd6c4a1d3638] shrink_page_list+126 at ffffffffbb2412fe
>>   #3 [ffffbd6c4a1d36c8] shrink_inactive_list+478 at ffffffffbb24441e
>>   #4 [ffffbd6c4a1d3768] shrink_lruvec+957 at ffffffffbb244e3d
>>   #5 [ffffbd6c4a1d3870] shrink_node+552 at ffffffffbb2452a8
>>   #6 [ffffbd6c4a1d38f0] do_try_to_free_pages+201 at ffffffffbb245829
>>   #7 [ffffbd6c4a1d3940] try_to_free_pages+239 at ffffffffbb246c0f
>>   #8 [ffffbd6c4a1d39d8] __alloc_pages_slowpath.constprop.114+913 at
>> ffffffffbb28d741
>>   #9 [ffffbd6c4a1d3ab8] __alloc_pages_nodemask+679 at 
>> ffffffffbb28e2e7
>> #10 [ffffbd6c4a1d3b28] alloc_pages_vma+124 at ffffffffbb2a734c
>> #11 [ffffbd6c4a1d3b68] handle_mm_fault+3999 at ffffffffbb26de2f
>> #12 [ffffbd6c4a1d3c28] exc_page_fault+708 at ffffffffbb909c84
>> #13 [ffffbd6c4a1d3c80] asm_exc_page_fault+30 at ffffffffbba00b4e
>>   #14 [ffffbd6c4a1d3d30] copyout+28 at ffffffffbb5160bc
>> #15 [ffffbd6c4a1d3d38] _copy_to_iter+158 at ffffffffbb5188de
>> #16 [ffffbd6c4a1d3d98] get_random_bytes_user+136 at ffffffffbb644608
>> #17 [ffffbd6c4a1d3e48] new_sync_read+284 at ffffffffbb2e0a5c
>> #18 [ffffbd6c4a1d3ed0] vfs_read+353 at ffffffffbb2e2f51
>> #19 [ffffbd6c4a1d3f00] ksys_read+165 at ffffffffbb2e3265
>> #20 [ffffbd6c4a1d3f40] do_syscall_64+51 at ffffffffbb906213
>> #21 [ffffbd6c4a1d3f50] entry_SYSCALL_64_after_hwframe+103 at 
>> ffffffffbba000df
> This process is in direct reclaim trying to free more memory. It 
> doesn't have transaction handle started so jbd2 also isn't waiting for 
> this process.
>
>>>> Can low available memory be a reason for a thread to not be able to 
>>>> close the transaction handle for a long time?
>>>> Maybe some writeback thread starts the handle but is not able to 
>>>> complete writeback?
>>> Well, even that would be a bug but low memory conditions are 
>>> certainly some of less tested paths so it is possible there's a bug lurking there.
>> Amongst the things we have tested 2 things seem to give good improvements.
>>
>> One is disabling journalling. We don't see any stuck tasks. System 
>> becomes slow but eventually recovers. But its not something we want to disable.
>>
>> Other is enabling swap memory. Adding some swap memory also avoids 
>> system going into low memory state and system doesn't freeze.
> OK, these are just workarounds. The question really is which process 
> holds the transaction handle jbd2 thread is waiting for. It is none of 
> the processes you have shown above. Since you have the crashdump, you 
> can also search all the processes and find those which have non-zero
> task->journal_info. And from these processes you can select those 
> where journal_info points to an object from jbd2_handle_cache and then you
> can verify whether the handles indeed point (through 
> handle->h_transaction) to the transaction jbd2 thread is trying to 
> commit. After you've identified such task it is interesting to see what is it doing...
>
Hi Jan

I think I found the thread that is holding the transaction handle. It seems to be in a runnable state, though.
Its journal_info points to a journal handle whose transaction matches the journal's running transaction.
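
In pointer terms, the check boils down to something like the sketch
below (only an illustration of the comparison done by hand in the
crash session; the task_holds_stuck_transaction() helper name is made
up here, and journal_info is assumed to really hold a jbd2 handle,
which crash confirms by checking the object comes from
jbd2_handle_cache):

#include <linux/sched.h>
#include <linux/jbd2.h>

/* Kernel-context sketch: does this task hold an open handle on the
 * transaction the jbd2 thread is stuck committing? */
static bool task_holds_stuck_transaction(struct task_struct *task,
					 journal_t *journal)
{
	handle_t *handle = task->journal_info;

	if (!handle)
		return false;
	return handle->h_transaction == journal->j_running_transaction ||
	       handle->h_transaction == journal->j_committing_transaction;
}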
Here is the associated stack trace. It is converting unwritten extents to extents.

PID: 287    TASK: ffff976801890000  CPU: 20  COMMAND: "kworker/u96:35"
  #0 [ffffbd6c40b3f498] __schedule+617 at ffffffffbb912df9
  #1 [ffffbd6c40b3f530] _cond_resched+38 at ffffffffbb9133e6
  #2 [ffffbd6c40b3f538] shrink_lruvec+670 at ffffffffbb244d1e
  #3 [ffffbd6c40b3f640] _cond_resched+21 at ffffffffbb9133d5
  #4 [ffffbd6c40b3f648] shrink_node+552 at ffffffffbb2452a8
  #5 [ffffbd6c40b3f6c8] do_try_to_free_pages+201 at ffffffffbb245829
  #6 [ffffbd6c40b3f718] try_to_free_pages+239 at ffffffffbb246c0f
  #7 [ffffbd6c40b3f7b0] __alloc_pages_slowpath.constprop.114+913 at
ffffffffbb28d741
  #8 [ffffbd6c40b3f890] __alloc_pages_nodemask+679 at ffffffffbb28e2e7
  #9 [ffffbd6c40b3f900] allocate_slab+726 at ffffffffbb2b0886
#10 [ffffbd6c40b3f958] ___slab_alloc+1173 at ffffffffbb2b3ff5
#11 [ffffbd6c40b3f988] insert_revoke_hash+37 at ffffffffc016f435 [jbd2]
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#12 [ffffbd6c40b3f9b8] kmem_cache_free+924 at ffffffffbb2b712c )(inlined
by) slab_alloc at /home/ec2-user/linux/mm/slub.c:2904)
#13 [ffffbd6c40b3fa18] insert_revoke_hash+37 at ffffffffc016f435 [jbd2]
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#14 [ffffbd6c40b3fa40] kmem_cache_alloc+928 at ffffffffbb2b4590
(/home/ec2-user/linux/mm/slub.c:290)
#15 [ffffbd6c40b3fa78] insert_revoke_hash+37 at ffffffffc016f435 [jbd2]
(/home/ec2-user/linux/fs/jbd2/revoke.c:146)
#16 [ffffbd6c40b3faa0] __ext4_forget+338 at ffffffffc02efb32 [ext4]
(/home/ec2-user/linux/fs/ext4/ext4_jbd2.c:298)
#17 [ffffbd6c40b3fae0] ext4_free_blocks+2437 at ffffffffc031fd55 [ext4]
(/home/ec2-user/linux/fs/ext4/mballoc.c:5709 (discriminator 2))
#18 [ffffbd6c40b3fbb0] ext4_ext_handle_unwritten_extents+596 at
ffffffffc02f56a4 [ext4] ((inlined by) ext4_ext_handle_unwritten_extents at /home/ec2-user/linux/fs/ext4/extents.c:3892)
#19 [ffffbd6c40b3fc98] ext4_ext_map_blocks+1325 at ffffffffc02f710d [ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4165)
#20 [ffffbd6c40b3fd60] ext4_map_blocks+813 at ffffffffc030bd0d [ext4]
(/home/ec2-user/linux/fs/ext4/inode.c:659)
#21 [ffffbd6c40b3fdd0] ext4_convert_unwritten_extents+303 at ffffffffc02f8adf [ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4810)
#22 [ffffbd6c40b3fe28] ext4_convert_unwritten_io_end_vec+95 at ffffffffc02f8c5f [ext4] (/home/ec2-user/linux/fs/ext4/extents.c:4850)
#23 [ffffbd6c40b3fe58] ext4_end_io_rsv_work+269 at ffffffffc032c3fd [ext4] ((inlined by) ext4_do_flush_completed_IO at
/home/ec2-user/linux/fs/ext4/page-io.c:262)
#24 [ffffbd6c40b3fe98] process_one_work+405 at ffffffffbb0b2725
#25 [ffffbd6c40b3fed8] worker_thread+48 at ffffffffbb0b2920
#26 [ffffbd6c40b3ff10] kthread+279 at ffffffffbb0b9167
#27 [ffffbd6c40b3ff50] ret_from_fork+34 at ffffffffbb003802

Thanks
Rishabh


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Tasks stuck jbd2 for a long time
  2023-08-18  1:31                 ` Lu, Davina
@ 2023-08-18  2:41                   ` Theodore Ts'o
  2023-08-21  1:10                     ` Lu, Davina
  0 siblings, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2023-08-18  2:41 UTC (permalink / raw)
  To: Lu, Davina
  Cc: Bhatnagar, Rishabh, Jan Kara, jack, linux-ext4, linux-kernel,
	gregkh, Park, SeongJae

On Fri, Aug 18, 2023 at 01:31:35AM +0000, Lu, Davina wrote:
> 
> Looks like this is a similar issue I saw before with fio test (buffered IO with 100 threads), it is also shows "ext4-rsv-conversion" work queue takes lots CPU and make journal update every stuck.

Given the stack traces, it is very much a different problem.

> There is a patch and see if this is the same issue? this is not the
> finial patch since there may have some issue from Ted. I will
> forward that email to you in a different loop. I didn't continue on
> this patch that time since we thought is might not be the real case
> in RDS.

The patch which you've included is dangerous and can cause file system
corruption.  See my reply at [1], and your corrected patch which
addressed my concern at [2].  If folks want to try a patch, please use
the one at [2], and not the one you quoted in this thread, since it's
missing critically needed locking.

[1] https://lore.kernel.org/r/YzTMZ26AfioIbl27@mit.edu
[2] https://lore.kernel.org/r/53153bdf0cce4675b09bc2ee6483409f@amazon.com

The reason why we never pursued it is because (a) at one of our weekly
ext4 video chats, I was informed by Oleg Kiselev that the performance
issue was addressed in a different way, and (b) I'd want to reproduce
the issue on a machine under my control so I could understand what
was going on and so we could examine the dynamics of what was
happening with and without the patch.  So I would have needed to know
how many CPUs and what kind of storage device (HDD? SSD? md-raid?
etc.) was in use, in addition to the fio recipe.

Finally, I'm a bit nervous about setting the internal __WQ_ORDERED
flag with max_active > 1.  What was that all about, anyway?

     	  	       	   	    - Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Tasks stuck jbd2 for a long time
  2023-08-18  2:41                   ` Theodore Ts'o
@ 2023-08-21  1:10                     ` Lu, Davina
  2023-08-21 18:38                       ` Theodore Ts'o
  0 siblings, 1 reply; 15+ messages in thread
From: Lu, Davina @ 2023-08-21  1:10 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Bhatnagar, Rishabh, Jan Kara, jack, linux-ext4, linux-kernel,
	gregkh, Park, SeongJae



On Fri, Aug 18, 2023 at 01:31:35AM +0000, Lu, Davina wrote:
>>
>> Looks like this is a similar issue I saw before with fio test (buffered IO with 100 threads), it is also shows "ext4-rsv-conversion" work queue takes lots CPU and make journal update every stuck.

>Given the stack traces, it is very much a different problem.

I see. I thought it might be the same, since both are related to converting unwritten extents to extents. I didn't look into the details of that hang, though.

>> There is a patch and see if this is the same issue? this is not the 
>> finial patch since there may have some issue from Ted. I will forward 
>> that email to you in a different loop. I didn't continue on this patch 
>> that time since we thought is might not be the real case in RDS.

>The patch which you've included is dangerous and can cause file system corruption.  See my reply at [1], and your corrected patch which addressed my concern at [2].  If folks want to try a patch, please use the one at [2], and not the one you quoted in this thread, since it's missing critically needed locking.

>[1] https://lore.kernel.org/r/YzTMZ26AfioIbl27@mit.edu
> [2] https://lore.kernel.org/r/53153bdf0cce4675b09bc2ee6483409f@amazon.com


> The reason why we never pursued it is because (a) at one of our weekly
> ext4 video chats, I was informed by Oleg Kiselev that the performance issue was addressed in a different way, and (b) I'd want to reproduce the issue on a machine under my control so I could understand what was going on and so we could examine the dynamics of what was happening with and without the patch.  So I would have needed to know how many CPUs and what kind of storage device (HDD? SSD? md-raid?
> etc.) was in use, in addition to the fio recipe.

Thanks for pointing that out; I had almost forgotten that I did this version 2.
How to replicate this issue: the CPU is x86_64 with 64 cores at 2.50GHz, and MEM is 256GB (it is a VM, though). It is attached to one NVMe device (no lvm, drbd etc.) with 64000 IOPS and 16GiB. I can also replicate it with a 10000 IOPS, 1000GiB NVMe volume.

Run fio test: 
1. Create the files first, with fio or dd; the fio command is: /usr/bin/fio --name=16kb_rand_write_only_2048_jobs --directory=/rdsdbdata --rw=randwrite --ioengine=sync --buffered=1 --bs=16k --max-jobs=2048 --numjobs=$1 --runtime=30 --thread --filesize=28800000 --fsync=1 --group_reporting --create_only=1 > /dev/null

2. sudo echo 1 > /proc/sys/vm/drop_caches

3. fio --name=16kb_rand_write_only_2048_jobs --directory=/rdsdbdata --rw=randwrite --ioengine=sync --buffered=1 --bs=16k --max-jobs=2048 --numjobs=2048 --runtime=60 --time_based --thread --filesize=28800000 --fsync=1 --group_reporting
You can see the IOPS drop from ~17K to less than 100:
Jobs: 2048 (f=2048): [w(2048)] [13.3% done] [0KB/1296KB/0KB /s] [0/81/0 iops] [eta 00m:52s]  <----- IOPS drops to less than 100

The way to create and mount fs is:
mke2fs -m 1 -t ext4 -b 4096 -L /rdsdbdata /dev/nvme5n1 -J size=128
mount -o rw,noatime,nodiratime,data=ordered /dev/nvme5n1 /rdsdbdata

Yes, Oleg is correct, there is another way to solve this: enlarging the journal size from 128MB to 2GB. But it looks like this is not a typical issue for the RDS workload, so we didn't continue much on this.
What I found is: the journal doesn't have enough space (it cannot buffer much), so a handle has to wait until the current transaction completes, in add_transaction_credits() below:

	if (needed > journal->j_max_transaction_buffers / 2) {
		jbd2_might_wait_for_commit(journal);
		wait_event(journal->j_wait_reserved,
			   atomic_read(&journal->j_reserved_credits) + rsv_blocks
			   <= journal->j_max_transaction_buffers / 2);

And the locking on journal->j_state_lock shows it is stuck for a long time.
I am not sure why "ext4-rsv-conversion" also plays a role here; it should be triggered by ext4_writepages(). But what I can see is that while the journal lock is stuck, each core's utilization is almost 100%, and ext4-rsv-conversion shows up at that time.
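
For a rough sense of the numbers (a standalone sketch; it assumes
jbd2's usual sizing, where the maximum transaction size is about a
quarter of the journal, which is why the small 128MB journal hits this
wait so easily):

#include <stdio.h>

int main(void)
{
	long journal_bytes  = 128L << 20;	/* mke2fs -J size=128 */
	long block_size     = 4096;		/* mke2fs -b 4096 */
	long journal_blocks = journal_bytes / block_size;
	long max_txn        = journal_blocks / 4;	/* ~j_max_transaction_buffers */

	printf("journal blocks            : %ld\n", journal_blocks);	/* 32768 */
	printf("max transaction buffers ~ : %ld\n", max_txn);		/* 8192 */
	printf("wait threshold (half)     : %ld\n", max_txn / 2);	/* 4096 */
	return 0;
}

With a 2GB journal the threshold grows 16x, which matches the
observation that enlarging the journal makes the stall go away.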


> Finally, I'm a bit nervous about setting the internal __WQ_ORDERED flag with max_active > 1.  What was that all about, anyway?

Yes, you are correct. I didn't use "__WQ_ORDERED" carefully; it had better not be used with max_active > 1. My purpose was to try to guarantee that the work items would be executed sequentially on each core.

Thanks
Davina Lu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Tasks stuck jbd2 for a long time
  2023-08-21  1:10                     ` Lu, Davina
@ 2023-08-21 18:38                       ` Theodore Ts'o
  2023-08-24  3:52                         ` Lu, Davina
  0 siblings, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2023-08-21 18:38 UTC (permalink / raw)
  To: Lu, Davina
  Cc: Bhatnagar, Rishabh, Jan Kara, jack, linux-ext4, linux-kernel,
	Park, SeongJae

On Mon, Aug 21, 2023 at 01:10:58AM +0000, Lu, Davina wrote:
> 
> > [2] https://lore.kernel.org/r/53153bdf0cce4675b09bc2ee6483409f@amazon.com
> 
> Thanks for pointed out, I almost forget I did this version 2.  How
> to replicate this issue : CPU is X86_64, 64 cores, 2.50GHZ, MEM is
> 256GB (it is VM though). Attached with one NVME device (no lvm, drbd
> etc) with IOPS 64000 and 16GiB. I can also replicate with 10000 IOPS
> 1000GiB NVME volume....

Thanks for the details.  This is something that I am interested in
potentially trying to merge, since for a sufficiently
conversion-heavy workload (assuming the conversion is happening across
multiple inodes, and not just a huge number of random writes into a
single fallocated file), limiting the number of kernel threads to one
CPU isn't always going to be the right thing.  The reason why we had
done it this way was because at the time, the only choices that we had
were between a single kernel thread, or spawning a kernel thread for
every single CPU --- which for a very high-core-count system, consumed
a huge amount of system resources.  This is no longer the case with
the new Concurrency Managed Workqueue (cmwq), but we never did the
experiment to make sure cmwq didn't have surprising gotchas.

> > Finally, I'm a bit nervous about setting the internal __WQ_ORDERED
> > flag with max_active > 1.  What was that all about, anyway?
> 
> Yes, you are correct. I didn't use "__WQ_ORDERED" carefully, it
> better not use with max_active > 1 . My purpose was try to guarantee
> the work queue can be sequentially implemented on each core.

I won't have time to look at this before the next merge window, but
what I'm hoping to look at is your patch at [2], with two changes:

a)  Drop the __WQ_ORDERED flag, since it is an internal flag.

b) Just pass in 0 for max_active instead of "num_active_cpus() > 1 ?
   num_active_cpus() : 1", for two reasons.  num_active_cpus() doesn't
   take into account CPU hotplug (for example, if you have a
   dynamically adjustable VM shape where the number of active CPUs
   might change over time).  And is there a reason why we need to set
   that limit at all?

Do you see any potential problem with these changes?
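
For concreteness, with both (a) and (b) applied, the allocation site
would look roughly like the sketch below (on top of the patch at [2],
not a tested change; passing 0 lets the workqueue core pick its
default max_active):

	EXT4_SB(sb)->rsv_conversion_wq =
		alloc_workqueue("ext4-rsv-conversion",
				WQ_MEM_RECLAIM | WQ_UNBOUND, 0);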

Thanks,

						- Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: Tasks stuck jbd2 for a long time
  2023-08-21 18:38                       ` Theodore Ts'o
@ 2023-08-24  3:52                         ` Lu, Davina
  0 siblings, 0 replies; 15+ messages in thread
From: Lu, Davina @ 2023-08-24  3:52 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Bhatnagar, Rishabh, Jan Kara, jack, linux-ext4, linux-kernel,
	Park, SeongJae


> Thanks for the details.  This is something that am interested in trying to potentially to merge, since for a sufficiently coversion-heavy workload (assuming the conversion is happening 
> across multiple inodes, and not just a huge number of random writes into a single fallocated file), limiting the number of kernel threads to one CPU isn't always going to be the right thing.  
>The reason why we had done this way was because at the time, the only choices that we had was between a single kernel thread, or spawning a kernel thread for every single CPU -- 
>which for a very high-core-count system, consumed a huge amount of system resources.  This is no longer the case with the new Concurrency Managed Workqueue (cmwq), but we never 
>did the experiment to make sure cmwq didn't have surprising gotchas.

Thank you for the detailed explanation. 


> I won't have time to look at this before the next merge window, but what I'm hoping to look at is your patch at [2], with two changes:
> a)  Drop the __WQ_ORDERED flag, since it is an internal flag.
> b) Just pass in 0 for max_active instead of "num_active_cpus() > 1 ?
>    num_active_cpus() : 1", for two reasons.  num_active_cpus() doesn't
>    take into account CPU hotplug (for example, if you have a
>    dynamically adjustable VM shape where the number of active CPUs
>    might change over time).  And is there a reason why we need to set
>    that limit at all?

> Do you see any potential problem with these changes?

Sorry for the late response; after our internal discussion, I can continue with this patch. These two points are easy to change, and I will also run some xfstests for ext4 and BMS on the RDS environment as a quick verification.  We can change num_active_cpus() to 0. Why I added it: just because during the fio test the max active number goes to ~50, and with that we don't see this issue; but it is not necessary. I will check Oleg's opinion later offline.

Thanks,
Davina
                                             

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-08-24  3:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-15 19:01 Tasks stuck jbd2 for a long time Bhatnagar, Rishabh
2023-08-16  2:28 ` Theodore Ts'o
2023-08-16  3:57   ` Bhatnagar, Rishabh
2023-08-16 14:53     ` Jan Kara
2023-08-16 18:32       ` Bhatnagar, Rishabh
2023-08-16 21:52         ` Jan Kara
2023-08-16 22:53           ` Bhatnagar, Rishabh
2023-08-17 10:49             ` Jan Kara
2023-08-17 18:59               ` Bhatnagar, Rishabh
2023-08-18  1:19                 ` Theodore Ts'o
2023-08-18  1:31                 ` Lu, Davina
2023-08-18  2:41                   ` Theodore Ts'o
2023-08-21  1:10                     ` Lu, Davina
2023-08-21 18:38                       ` Theodore Ts'o
2023-08-24  3:52                         ` Lu, Davina

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).