linux-rt-users.vger.kernel.org archive mirror
* sync system call failure
@ 2021-02-22  1:46 Dipen Patel
  2021-02-22 13:41 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Dipen Patel @ 2021-02-22  1:46 UTC (permalink / raw)
  To: linux-rt-users

Hi,

I encountered the below crash during the sync system call, possibly coming from the stress workers spawned by the --io option, which issue sync system calls. I believe the crash is caused by this upstream patch https://patchwork.kernel.org/project/linux-fsdevel/patch/20171212163830.GC3919388@devbig577.frc2.facebook.com/ which introduces a read-write semaphore. I am not sure how to mitigate this issue; any advice would be appreciated.

How to reproduce:
1. apt-get install stress
2. stress -c 8 --io 8 --vm 8 --hdd 4
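While step 2 is running, the hung-task reports and the detector's settings can be checked with something like the following (a minimal sketch, assuming the stock hung-task sysctls; each --io worker essentially loops on sync):

# follow the kernel log for hung-task reports during the run
dmesg -w | grep -i "blocked for more than"

# detector timeout (120 s by default) and panic behaviour
cat /proc/sys/kernel/hung_task_timeout_secs
cat /proc/sys/kernel/hung_task_panic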

Platform:
Nvidia Jetson AGX XAVIER, with 32GB eMMC 5.1 and 256 bit 32GB LPDDR4 RAM.

Kernel:
4.9.201-rt134

CPU:
8-Core ARM v8.2 64-Bit CPU, 8 MB L2 + 4 MB L3

Call stack:

[ 1813.814464] INFO: task stress blocked for more than 120 seconds.
[ 1813.814610]       Not tainted 4.9.201-rt134 #2
[ 1813.814613] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1813.814615] stress          D    0 11510  11504 0x00000008
[ 1813.814616] Call trace:
[ 1813.814617] [<ffffff8008085eac>] __switch_to+0x98/0xb8
[ 1813.814618] [<ffffff8008f23b2c>] __schedule+0x270/0x570
[ 1813.814619] [<ffffff8008f23e74>] schedule+0x48/0xe0
[ 1813.814620] [<ffffff8008f25384>] __rt_mutex_slowlock+0xc4/0x144
[ 1813.814622] [<ffffff8008f2578c>] rt_mutex_slowlock_locked+0xcc/0x1e4
[ 1813.814624] [<ffffff8008f25910>] rt_mutex_slowlock.constprop.22+0x6c/0xb0
[ 1813.814625] [<ffffff8008f25c70>] rt_mutex_lock_state+0x94/0xc0
[ 1813.814626] [<ffffff8008f25f48>] __down_write_common+0x38/0x140
[ 1813.814627] [<ffffff8008f26148>] __down_write+0x24/0x30
[ 1813.814628] [<ffffff8008f25198>] down_write+0x20/0x2c
[ 1813.814629] [<ffffff8008279a08>] sync_inodes_sb+0x98/0x220
[ 1813.814630] [<ffffff8008280350>] sync_inodes_one_sb+0x28/0x34
[ 1813.814631] [<ffffff80082497d4>] iterate_supers+0x114/0x118
[ 1813.814632] [<ffffff8008280738>] sys_sync+0x44/0xac
[ 1813.814633] [<ffffff8008083100>] el0_svc_naked+0x34/0x38

[  363.940264] Showing all locks held in the system:
[  363.946481] 6 locks held by kworker/u16:0/7:
[  363.950503]  #0:  ("writeback"){......}, at: [<ffffff80080d3ed0>] process_one_work+0x1c0/0x670
[  363.959692]  #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffff80080d3ed0>] process_one_work+0x1c0/0x670
[  363.969137]  #2:  (&type->s_umount_key#31){......}, at: [<ffffff800826b124>] trylock_super+0x24/0x70
[  363.978848]  #3:  (&sbi->s_journal_flag_rwsem){......}, at: [<ffffff80081e7688>] do_writepages+0x48/0xa0
[  363.988651]  #4:  (jbd2_handle){......}, at: [<ffffff8008369a80>] start_this_handle+0xf0/0x3b8
[  363.997576]  #5:  (&ei->i_data_sem){......}, at: [<ffffff8008315228>] ext4_map_blocks+0xd8/0x5a8
[  364.006530] 2 locks held by khungtaskd/740:
[  364.010699]  #0:  (rcu_read_lock){......}, at: [<ffffff800818f8e0>] watchdog+0xf0/0x510
[  364.018749]  #1:  (tasklist_lock){......}, at: [<ffffff8008117f64>] debug_show_all_locks+0x3c/0x1b8
[  364.027960] 1 lock held by in:imklog/5258:
[  364.032051]  #0:  (&f->f_pos_lock){......}, at: [<ffffff800828c120>] __fdget_pos+0x50/0x60
[  364.040366] 4 locks held by rs:main Q:Reg/5259:
[  364.044912]  #0:  (&f->f_pos_lock){......}, at: [<ffffff800828c120>] __fdget_pos+0x50/0x60
[  364.053311]  #1:  (rcu_read_lock){......}, at: [<ffffff800811578c>] cpuacct_charge+0x44/0xe8
[  364.061547]  #2:  (rcu_read_lock){......}, at: [<ffffff8008283b78>] dput.part.5+0x50/0x328
[  364.070289]  #3:  (&(&(&dentry->d_lockref.lock)->lock)->wait_lock){......}, at: [<ffffff8008fef254>] __schedule+0x84/0x6f0
[  364.081077] 2 locks held by agetty/6375:
[  364.085384]  #0:  (&tty->ldisc_sem){......}, at: [<ffffff8008ff2b94>] ldsem_down_read+0x2c/0x38
[  364.093567]  #1:  (&tty->atomic_write_lock){......}, at: [<ffffff80087399b8>] tty_write_lock+0x28/0x60
[  364.103206] 2 locks held by stress/12304:
[  364.107298]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.116317]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.125589] 2 locks held by stress/12308:
[  364.129266]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.138627]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.147821] 2 locks held by stress/12312:
[  364.152098]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.161392]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.170913] 2 locks held by stress/12314:
[  364.174761]  #0:  (sb_writers#5){......}, at: [<ffffff800826631c>] vfs_write+0x19c/0x1b0
[  364.183180]  #1:  (&sb->s_type->i_mutex_key#13){......}, at: [<ffffff800830e564>] ext4_file_write_iter+0x44/0x388
[  364.193580] 2 locks held by stress/12316:
[  364.197423]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.206786]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.215981] 2 locks held by stress/12320:
[  364.219829]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.229448]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.238717] 2 locks held by stress/12323:
[  364.242495]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.251680]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.261129] 2 locks held by stress/12326:
[  364.265237]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.274688]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.283796] 2 locks held by stress/12330:
[  364.287641]  #0:  (&type->s_umount_key#31){......}, at: [<ffffff800826b368>] iterate_supers+0x78/0x140
[  364.297261]  #1:  (&bdi->wb_switch_rwsem){......}, at: [<ffffff800829e630>] sync_inodes_sb+0x98/0x2a0
[  364.306720] 2 locks held by Xorg/14558:
[  364.310301]  #0:  (&tty->legacy_mutex){......}, at: [<ffffff8008747d84>] tty_lock_interruptible+0x4c/0xb0
[  364.319683]  #1:  (tasklist_lock){......}, at: [<ffffff800873f088>] tty_open+0x438/0x5b0

Best Regards,
Dipen Patel


* Re: sync system call failure
  2021-02-22  1:46 sync system call failure Dipen Patel
@ 2021-02-22 13:41 ` Sebastian Andrzej Siewior
  2021-02-23  0:02   ` Dipen Patel
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-22 13:41 UTC (permalink / raw)
  To: Dipen Patel; +Cc: linux-rt-users

On 2021-02-21 17:46:25 [-0800], Dipen Patel wrote:
> Hi,
Hi,

> I encountered the below crash during the sync system call, possibly coming
> from the stress workers spawned by the --io option, which issue
…

> Platform:
> Nvidia Jetson AGX XAVIER, with 32GB eMMC 5.1 and 256 bit 32GB LPDDR4 RAM.
> 
> Kernel:
> 4.9.201-rt134
…
> Call stack:
> 
> [ 1813.814464] INFO: task stress blocked for more than 120 seconds.
> [ 1813.814610]       Not tainted 4.9.201-rt134 #2

This is not a crash. It is simply information that a task was blocked
for quite some time (as it says).
The important information would be:
- does it recover
  - does the time in seconds always increase or does it also drop
  - does the message disappear if stress terminates.

- does it occur on v5.11-RT.
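For the first point, one quick check (just a sketch; 11510 is the stress PID from the hung-task report above, substitute the one you see) would be:

grep State /proc/11510/status   # 'D (disk sleep)' while blocked, back to 'S'/'R' once it recovers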

Sebastian


* Re: sync system call failure
  2021-02-22 13:41 ` Sebastian Andrzej Siewior
@ 2021-02-23  0:02   ` Dipen Patel
  0 siblings, 0 replies; 3+ messages in thread
From: Dipen Patel @ 2021-02-23  0:02 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users



On 2/22/21 5:41 AM, Sebastian Andrzej Siewior wrote:
> On 2021-02-21 17:46:25 [-0800], Dipen Patel wrote:
>> Hi,
> Hi,
> 
>> I encountered the below crash during the sync system call, possibly coming
>> from the stress workers spawned by the --io option, which issue
> …
> 
>> Platform:
>> Nvidia Jetson AGX XAVIER, with 32GB eMMC 5.1 and 256 bit 32GB LPDDR4 RAM.
>>
>> Kernel:
>> 4.9.201-rt134
> …
>> Call stack:
>>
>> [ 1813.814464] INFO: task stress blocked for more than 120 seconds.
>> [ 1813.814610]       Not tainted 4.9.201-rt134 #2
> 
> This is not a crash. It is simply information that a task was blocked
> for quite some time (as it says).
Correct, it's not a crash, but since I have hung_task_panic enabled, it
reboots the system. That is not the case for the non-RT kernel of the same
version, so I am guessing there is something in RT that contributes to this.

> The important information would be:
> - does it recover
Yes, but I have to disable hung_task_panic first, otherwise the panic restarts the system. The system stays responsive, so I am assuming there is no deadlock.
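For reference, the toggle I use so the box does not reboot while testing (a minimal sketch, assuming the stock hung-task sysctls):

echo 0 > /proc/sys/kernel/hung_task_panic    # report hung tasks only, do not panic/reboot
# equivalently: sysctl -w kernel.hung_task_panic=0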

>   - does the time in seconds always increase or does it also drop
It varies.

>   - does the message disappear if stress terminates.
Yes
> 
> - does it occur on v5.11-RT.
I have not tried; I do not have other RT kernel versions besides 4.9.xxx.
> 
> Sebastian
> 

