From: Qian Cai <cai@lca.pw>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
m.szyprowski@samsung.com
Subject: Re: [PATCH V3 3/3] blk-mq: centralise related handling into blk_mq_get_driver_tag
Date: Thu, 2 Jul 2020 00:37:21 -0400
Message-ID: <20200702043721.GA1087@lca.pw>
In-Reply-To: <20200630140357.2251174-4-ming.lei@redhat.com>
On Tue, Jun 30, 2020 at 10:03:57PM +0800, Ming Lei wrote:
> Move .nr_active update and request assignment into blk_mq_get_driver_tag(),
> all are good to do during getting driver tag.
>
> Meantime blk-flush related code is simplified and flush request needn't
> to update the request table manually any more.
>
> Cc: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reverting this commit on top of next-20200701 fixed an issue where
swapping was unable to make progress for hours; with the revert, the
same workload finishes in about five minutes.
# time ./random 0
- start: alloc_mmap
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
- mmap 10% memory: 13473664204
real 2m12.333s
user 0m1.131s
sys 4m15.689s
# git clone https://github.com/cailca/linux-mm
# cd linux-mm; make
# ./random 0 (just malloc and CoW to trigger swapping)
x86.config is also included there.
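For reference, below is a minimal, hypothetical sketch of the kind of
workload "./random 0" drives -- assuming it just maps about 10% of RAM
at a time and dirties every page until the working set overshoots
physical memory and forces pageout. The real test in the linux-mm repo
above is the authoritative version; everything here is illustrative.

/* Hypothetical reproducer sketch: map ~10% of RAM per iteration and
 * dirty every page, so ten iterations exceed physical memory and
 * force the kernel to swap. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/sysinfo.h>

int main(void)
{
	struct sysinfo si;

	if (sysinfo(&si))
		return 1;

	/* roughly 10% of total RAM per mapping */
	size_t chunk = (size_t)si.totalram * si.mem_unit / 10;

	for (int i = 0; i < 10; i++) {
		char *p = mmap(NULL, chunk, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return 1;
		printf("- mmap 10%% memory: %zu\n", chunk);
		memset(p, 1, chunk);	/* touch every page */
	}
	return 0;
}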
The server is:
HP Proliant BL660Gen9
Intel(R) Xeon(R) CPU E5-4650 v3 @ 2.10GHz
131072 MB memory, 1000 GB disk space
The swap device is on LVM:
06:00.0 RAID bus controller: Hewlett-Packard Company Smart Array Gen9 Controllers (rev 01)
# lvs
  LV   VG                     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home rhel_hpe-bl660cgen9-01 -wi-ao---- 875.89g
  root rhel_hpe-bl660cgen9-01 -wi-ao----  50.00g
  swap rhel_hpe-bl660cgen9-01 -wi-ao----   4.00g
With the commit, the system locks up completely (unable to log in,
etc.) during pageout. After a while, messages like the following show
up on the serial console:
[ 1250.372489][ T352] INFO: task xfsaild/dm-0:795 blocked for more than 122 seconds.
[ 1250.410651][ T352] Not tainted 5.8.0-rc3-next-20200701 #4
[ 1250.440777][ T352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1250.480718][ T352] xfsaild/dm-0 D26184 795 2 0x00004000
[ 1250.509775][ T352] Call Trace:
[ 1250.524445][ T352] __schedule+0x768/0x1ba0
[ 1250.544935][ T352] ? kcore_callback+0x1d/0x1d
[ 1250.566408][ T352] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 1250.594339][ T352] schedule+0xc1/0x2b0
[ 1250.612820][ T352] schedule_timeout+0x390/0x8f0
[ 1250.635556][ T352] ? usleep_range+0x120/0x120
[ 1250.656769][ T352] ? lock_downgrade+0x720/0x720
[ 1250.679531][ T352] ? rcu_read_unlock+0x50/0x50
[ 1250.701174][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1250.728811][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1250.752964][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1250.776427][ T352] wait_for_completion+0x15e/0x250
[ 1250.799632][ T352] ? wait_for_completion_interruptible+0x2f0/0x2f0
[ 1250.830311][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1250.858125][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1250.881934][ T352] __flush_work+0x42a/0x930
[ 1250.903256][ T352] ? queue_delayed_work_on+0x90/0x90
[ 1250.929838][ T352] ? init_pwq+0x340/0x340
[ 1250.951021][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1250.974344][ T352] xlog_cil_force_lsn+0x105/0x520
[ 1250.997409][ T352] ? xfs_log_commit_cil+0x27f0/0x27f0
[ 1251.022702][ T352] ? xfsaild_push+0x320/0x2170
[ 1251.044523][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1251.070639][ T352] ? do_raw_spin_lock+0x121/0x290
[ 1251.093099][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1251.116765][ T352] ? rwlock_bug.part.1+0x90/0x90
[ 1251.138491][ T352] ? xfsaild_push+0x320/0x2170
[ 1251.160021][ T352] xfs_log_force+0x283/0x8b0
[ 1251.181024][ T352] xfsaild_push+0x320/0x2170
[ 1251.201236][ T352] ? usleep_range+0x120/0x120
[ 1251.222285][ T352] ? lock_downgrade+0x720/0x720
[ 1251.244945][ T352] ? rcu_read_unlock+0x50/0x50
[ 1251.267111][ T352] ? xfs_trans_ail_cursor_first+0x180/0x180
[ 1251.294879][ T352] xfsaild+0x172/0x870
[ 1251.313497][ T352] ? rwlock_bug.part.1+0x90/0x90
[ 1251.338272][ T352] ? xfsaild_push+0x2170/0x2170
[ 1251.361313][ T352] ? __kthread_parkme+0x4d/0x1a0
[ 1251.384386][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1251.413272][ T352] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 1251.440768][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1251.464454][ T352] ? __kthread_parkme+0xcc/0x1a0
[ 1251.487684][ T352] ? xfsaild_push+0x2170/0x2170
[ 1251.511076][ T352] kthread+0x358/0x420
[ 1251.529882][ T352] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 1251.556883][ T352] ret_from_fork+0x22/0x30
[ 1251.577642][ T352] INFO: task systemd-logind:1255 blocked for more than 124 seconds.
[ 1251.614691][ T352] Not tainted 5.8.0-rc3-next-20200701 #4
[ 1251.643014][ T352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1251.682935][ T352] systemd-logind D21280 1255 1 0x00004320
[ 1251.711905][ T352] Call Trace:
[ 1251.726589][ T352] __schedule+0x768/0x1ba0
[ 1251.747076][ T352] ? kcore_callback+0x1d/0x1d
[ 1251.768197][ T352] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 1251.797042][ T352] schedule+0xc1/0x2b0
[ 1251.816384][ T352] schedule_timeout+0x390/0x8f0
[ 1251.839135][ T352] ? print_irqtrace_events+0x270/0x270
[ 1251.864164][ T352] ? usleep_range+0x120/0x120
[ 1251.886086][ T352] ? mark_held_locks+0xb0/0x110
[ 1251.908092][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1251.936499][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1251.961908][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1251.986191][ T352] wait_for_completion+0x15e/0x250
[ 1252.011073][ T352] ? wait_for_completion_interruptible+0x2f0/0x2f0
[ 1252.041028][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1252.068947][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1252.092838][ T352] __flush_work+0x42a/0x930
[ 1252.113605][ T352] ? queue_delayed_work_on+0x90/0x90
[ 1252.138605][ T352] ? mark_lock+0x147/0x1800
[ 1252.159065][ T352] ? init_pwq+0x340/0x340
[ 1252.178604][ T352] ? __lock_acquire+0x1920/0x4da0
[ 1252.202405][ T352] xlog_cil_force_lsn+0x105/0x520
[ 1252.225340][ T352] ? xfs_log_commit_cil+0x27f0/0x27f0
[ 1252.250481][ T352] ? xfs_iunpin_wait+0x25d/0x570
[ 1252.275021][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1252.301548][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1252.326485][ T352] ? xfs_iunpin_wait+0x25d/0x570
[ 1252.350295][ T352] xfs_log_force_lsn+0x237/0x590
[ 1252.372860][ T352] xfs_iunpin_wait+0x25d/0x570
[ 1252.395136][ T352] ? xfs_reclaim_inode+0x1d6/0x860
[ 1252.418578][ T352] ? xfs_inactive+0x350/0x350
[ 1252.439965][ T352] ? rcu_read_unlock+0x50/0x50
[ 1252.462896][ T352] ? var_wake_function+0x150/0x150
[ 1252.488702][ T352] ? xfs_reclaim_inode_grab+0xbf/0x100
[ 1252.513940][ T352] xfs_reclaim_inode+0x1d6/0x860
[ 1252.536360][ T352] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
[ 1252.562673][ T352] ? do_raw_spin_unlock+0x4f/0x250
[ 1252.587647][ T352] xfs_reclaim_inodes_ag+0x505/0xb00
[ 1252.612344][ T352] ? xfs_reclaim_inode+0x860/0x860
[ 1252.635670][ T352] ? mark_held_locks+0xb0/0x110
[ 1252.657425][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1252.685355][ T352] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 1252.712212][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1252.735256][ T352] ? try_to_wake_up+0x6d1/0xf40
[ 1252.757809][ T352] ? migrate_swap_stop+0xbf0/0xbf0
[ 1252.780846][ T352] ? do_raw_spin_unlock+0x4f/0x250
[ 1252.803908][ T352] xfs_reclaim_inodes_nr+0x93/0xd0
[ 1252.828191][ T352] ? xfs_reclaim_inodes+0x90/0x90
[ 1252.851060][ T352] ? list_lru_count_one+0x177/0x300
[ 1252.875228][ T352] super_cache_scan+0x2fd/0x430
[ 1252.897411][ T352] do_shrink_slab+0x317/0x990
[ 1252.919083][ T352] shrink_slab+0x3a8/0x4b0
[ 1252.939077][ T352] ? do_shrink_slab+0x990/0x990
[ 1252.962308][ T352] ? mem_cgroup_protected+0x228/0x470
[ 1252.988350][ T352] shrink_node+0x49c/0x17b0
[ 1253.010025][ T352] do_try_to_free_pages+0x348/0x1250
[ 1253.036410][ T352] ? shrink_node+0x17b0/0x17b0
[ 1253.058428][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1253.083861][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1253.108752][ T352] try_to_free_pages+0x244/0x690
[ 1253.131224][ T352] ? free_unref_page_prepare.part.42+0x160/0x170
[ 1253.161131][ T352] ? reclaim_pages+0x9a0/0x9a0
[ 1253.183202][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1253.211059][ T352] __alloc_pages_slowpath.constprop.73+0x7b8/0x2410
[ 1253.241737][ T352] ? warn_alloc+0x120/0x120
[ 1253.262497][ T352] ? __alloc_pages_nodemask+0x3b7/0x7c0
[ 1253.288529][ T352] ? lock_downgrade+0x720/0x720
[ 1253.310739][ T352] ? __isolate_free_page+0x570/0x570
[ 1253.336306][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1253.361726][ T352] __alloc_pages_nodemask+0x658/0x7c0
[ 1253.386922][ T352] ? __alloc_pages_slowpath.constprop.73+0x2410/0x2410
[ 1253.418811][ T352] allocate_slab+0x378/0x680
[ 1253.439933][ T352] new_slab+0x3c/0x60
[ 1253.458006][ T352] ___slab_alloc+0x49b/0x7b0
[ 1253.479756][ T352] ? getname_kernel+0x49/0x290
[ 1253.502868][ T352] ? getname_kernel+0x49/0x290
[ 1253.525982][ T352] ? __slab_alloc+0x35/0x70
[ 1253.547838][ T352] __slab_alloc+0x35/0x70
[ 1253.567937][ T352] ? getname_kernel+0x49/0x290
[ 1253.590116][ T352] kmem_cache_alloc+0x230/0x2a0
[ 1253.612752][ T352] getname_kernel+0x49/0x290
[ 1253.634286][ T352] kern_path+0xc/0x30
[ 1253.652442][ T352] unix_find_other+0xab/0x620
[ 1253.673847][ T352] ? unix_stream_recvmsg+0xc0/0xc0
[ 1253.697111][ T352] ? find_held_lock+0x33/0x1c0
[ 1253.719156][ T352] ? __might_fault+0xea/0x1a0
[ 1253.740036][ T352] unix_dgram_sendmsg+0x79f/0xe70
[ 1253.763719][ T352] ? unix_dgram_connect+0x650/0x650
[ 1253.787501][ T352] ? _copy_from_user+0xbe/0x100
[ 1253.810230][ T352] ? rw_copy_check_uvector+0x5e/0x380
[ 1253.834712][ T352] ? move_addr_to_kernel.part.25+0x28/0x40
[ 1253.861330][ T352] ? __x64_sys_shutdown+0x70/0x70
[ 1253.883957][ T352] ? dup_iter+0x240/0x240
[ 1253.903962][ T352] ? unix_dgram_connect+0x650/0x650
[ 1253.927796][ T352] ____sys_sendmsg+0x5f5/0x820
[ 1253.949832][ T352] ? sock_create_kern+0x10/0x10
[ 1253.972027][ T352] ? __copy_msghdr_from_user+0x3e0/0x3e0
[ 1253.999411][ T352] ? sock_setsockopt+0x37c/0x1f60
[ 1254.024662][ T352] ? sock_setsockopt+0x37c/0x1f60
[ 1254.048119][ T352] ___sys_sendmsg+0xe5/0x170
[ 1254.069009][ T352] ? sock_getsockopt+0x577/0x1ba3
[ 1254.092402][ T352] ? copy_msghdr_from_user+0xe0/0xe0
[ 1254.116532][ T352] ? sk_get_meminfo+0x4b0/0x4b0
[ 1254.139128][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1254.162917][ T352] ? ___bpf_prog_run+0x30c6/0x76a0
[ 1254.186532][ T352] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 1254.214390][ T352] ? lockdep_init_map_waits+0x267/0x7b0
[ 1254.239719][ T352] ? __bpf_prog_run64+0xc0/0xc0
[ 1254.261805][ T352] ? __sys_setsockopt+0x39a/0x430
[ 1254.285143][ T352] ? __fget_light+0x4c/0x220
[ 1254.306399][ T352] ? sockfd_lookup_light+0x17/0x140
[ 1254.330417][ T352] __sys_sendmsg+0xba/0x150
[ 1254.351957][ T352] ? __sys_sendmsg_sock+0xa0/0xa0
[ 1254.375087][ T352] ? seccomp_do_user_notification.isra.14+0xa60/0xa60
[ 1254.406754][ T352] ? sockfd_lookup_light+0x140/0x140
[ 1254.431071][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1254.459401][ T352] ? do_syscall_64+0x24/0x310
[ 1254.480731][ T352] do_syscall_64+0x5f/0x310
[ 1254.502045][ T352] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1254.530295][ T352] RIP: 0033:0x7fd5b4dbdad5
[ 1254.551521][ T352] Code: Bad RIP value.
[ 1254.570837][ T352] RSP: 002b:00007ffd4294cdb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1254.610502][ T352] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd5b4dbdad5
[ 1254.647167][ T352] RDX: 0000000000004000 RSI: 00007ffd4294cdf0 RDI: 0000000000000010
[ 1254.684282][ T352] RBP: 00007ffd4294cef0 R08: 000000000000006c R09: 0000000000017be2
[ 1254.721279][ T352] R10: 00007ffd4294cd90 R11: 0000000000000246 R12: 0000000000000010
[ 1254.758535][ T352] R13: 00007ffd4294cfc0 R14: 00007ffd4294ce30 R15: 0000000000000000
[ 1254.794642][ T352] INFO: task random:1774 blocked for more than 127 seconds.
[ 1254.827372][ T352] Not tainted 5.8.0-rc3-next-20200701 #4
[ 1254.854803][ T352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1254.893736][ T352] random D22488 1774 1765 0x00000000
[ 1254.921952][ T352] Call Trace:
[ 1254.936322][ T352] __schedule+0x768/0x1ba0
[ 1254.955756][ T352] ? kcore_callback+0x1d/0x1d
[ 1254.976489][ T352] schedule+0xc1/0x2b0
[ 1254.994120][ T352] schedule_preempt_disabled+0xc/0x20
[ 1255.019851][ T352] __mutex_lock+0xa1e/0x1390
[ 1255.042999][ T352] ? xfs_reclaim_inodes_ag+0x68c/0xb00
[ 1255.068241][ T352] ? mutex_lock_io_nested+0x1270/0x1270
[ 1255.092922][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1255.117517][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1255.141068][ T352] ? xfs_perag_get_tag+0x463/0x550
[ 1255.163710][ T352] ? __lock_acquire+0x1920/0x4da0
[ 1255.186068][ T352] ? xfs_reclaim_inodes_ag+0x68c/0xb00
[ 1255.210506][ T352] xfs_reclaim_inodes_ag+0x68c/0xb00
[ 1255.233939][ T352] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 1255.260536][ T352] ? xfs_reclaim_inode+0x860/0x860
[ 1255.283185][ T352] ? print_irqtrace_events+0x270/0x270
[ 1255.307491][ T352] ? find_held_lock+0x33/0x1c0
[ 1255.328782][ T352] ? xfs_ail_push_all+0x86/0xd0
[ 1255.350848][ T352] ? lock_downgrade+0x720/0x720
[ 1255.372513][ T352] ? rcu_read_unlock+0x50/0x50
[ 1255.393928][ T352] ? do_raw_spin_lock+0x121/0x290
[ 1255.416127][ T352] ? rwlock_bug.part.1+0x90/0x90
[ 1255.437997][ T352] ? do_raw_spin_unlock+0x4f/0x250
[ 1255.460633][ T352] xfs_reclaim_inodes_nr+0x93/0xd0
[ 1255.483981][ T352] ? xfs_reclaim_inodes+0x90/0x90
[ 1255.507264][ T352] ? list_lru_count_one+0x177/0x300
[ 1255.532061][ T352] super_cache_scan+0x2fd/0x430
[ 1255.556560][ T352] do_shrink_slab+0x317/0x990
[ 1255.579834][ T352] shrink_slab+0x141/0x4b0
[ 1255.599330][ T352] ? do_shrink_slab+0x990/0x990
[ 1255.620934][ T352] ? mem_cgroup_iter+0x3c3/0x800
[ 1255.642930][ T352] shrink_node+0x49c/0x17b0
[ 1255.662858][ T352] do_try_to_free_pages+0x348/0x1250
[ 1255.687403][ T352] ? shrink_node+0x17b0/0x17b0
[ 1255.710188][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1255.736054][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1255.760640][ T352] try_to_free_pages+0x244/0x690
[ 1255.784242][ T352] ? free_unref_page_prepare.part.42+0x160/0x170
[ 1255.813938][ T352] ? reclaim_pages+0x9a0/0x9a0
[ 1255.835127][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1255.861708][ T352] __alloc_pages_slowpath.constprop.73+0x7b8/0x2410
[ 1255.891312][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1255.918103][ T352] ? get_page_from_freelist+0x3b8/0x3930
[ 1255.943221][ T352] ? __lock_acquire+0xc57/0x4da0
[ 1255.965030][ T352] ? warn_alloc+0x120/0x120
[ 1255.984934][ T352] ? __isolate_free_page+0x570/0x570
[ 1256.008474][ T352] __alloc_pages_nodemask+0x658/0x7c0
[ 1256.032319][ T352] ? handle_mm_fault+0x1ec/0x32f0
[ 1256.056694][ T352] ? __alloc_pages_slowpath.constprop.73+0x2410/0x2410
[ 1256.089900][ T352] ? mark_held_locks+0xb0/0x110
[ 1256.111356][ T352] alloc_pages_vma+0xa5/0x4c0
[ 1256.132001][ T352] handle_mm_fault+0x1b1e/0x32f0
[ 1256.153941][ T352] ? copy_page_range+0x6b0/0x6b0
[ 1256.175764][ T352] do_user_addr_fault+0x2ba/0x90d
[ 1256.198277][ T352] exc_page_fault+0x56/0xa0
[ 1256.218256][ T352] ? asm_exc_page_fault+0x8/0x30
[ 1256.240143][ T352] asm_exc_page_fault+0x1e/0x30
[ 1256.261573][ T352] RIP: 0033:0x402a91
[ 1256.278694][ T352] Code: Bad RIP value.
[ 1256.296570][ T352] RSP: 002b:00007f2470abeec0 EFLAGS: 00010206
[ 1256.323772][ T352] RAX: 00007f2381af2000 RBX: 0000000000000000 RCX: 00007f3a69f4e8c7
[ 1256.360556][ T352] RDX: 00007f214d118000 RSI: 00000003231a6ccc RDI: 0000000000000000
[ 1256.396710][ T352] RBP: 00007f2470abeef0 R08: 00000000ffffffff R09: 0000000000000000
[ 1256.432457][ T352] R10: 0000000000000022 R11: 0000000000000246 R12: 00007ffd23f27bbe
[ 1256.468370][ T352] R13: 00007ffd23f27bbf R14: 0000000000000000 R15: 00007f2470abefc0
[ 1256.505273][ T352] INFO: task kworker/47:0:1815 blocked for more than 129 seconds.
[ 1256.541387][ T352] Not tainted 5.8.0-rc3-next-20200701 #4
[ 1256.571909][ T352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1256.614949][ T352] kworker/47:0 D27512 1815 2 0x00004000
[ 1256.642844][ T352] Workqueue: xfs-sync/dm-0 xfs_log_worker
[ 1256.668278][ T352] Call Trace:
[ 1256.682654][ T352] __schedule+0x768/0x1ba0
[ 1256.702137][ T352] ? kcore_callback+0x1d/0x1d
[ 1256.722824][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1256.749518][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1256.772548][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1256.794901][ T352] schedule+0xc1/0x2b0
[ 1256.812834][ T352] schedule_timeout+0x390/0x8f0
[ 1256.834068][ T352] ? print_irqtrace_events+0x270/0x270
[ 1256.858477][ T352] ? usleep_range+0x120/0x120
[ 1256.879398][ T352] ? mark_held_locks+0xb0/0x110
[ 1256.900894][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1256.927501][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1256.950506][ T352] ? trace_hardirqs_on+0x20/0x1b5
[ 1256.972875][ T352] wait_for_completion+0x15e/0x250
[ 1256.995628][ T352] ? wait_for_completion_interruptible+0x2f0/0x2f0
[ 1257.025057][ T352] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 1257.051673][ T352] ? _raw_spin_unlock_irq+0x1f/0x30
[ 1257.076481][ T352] __flush_work+0x42a/0x930
[ 1257.099817][ T352] ? queue_delayed_work_on+0x90/0x90
[ 1257.124652][ T352] ? init_pwq+0x340/0x340
[ 1257.143723][ T352] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 1257.170447][ T352] xlog_cil_force_lsn+0x105/0x520
[ 1257.193048][ T352] ? xfs_log_commit_cil+0x27f0/0x27f0
[ 1257.216868][ T352] ? xfs_log_worker+0x70/0x200
[ 1257.237854][ T352] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1257.262124][ T352] ? do_raw_spin_lock+0x121/0x290
[ 1257.284432][ T352] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 1257.307963][ T352] ? rwlock_bug.part.1+0x90/0x90
[ 1257.329862][ T352] ? xfs_log_worker+0x70/0x200
[ 1257.351537][ T352] xfs_log_force+0x283/0x8b0
[ 1257.372051][ T352] xfs_log_worker+0x70/0x200
[ 1257.392452][ T352] process_one_work+0x8ee/0x17c0
[ 1257.414642][ T352] ? lock_acquire+0x1ac/0xaf0
[ 1257.435337][ T352] ? pwq_dec_nr_in_flight+0x310/0x310
[ 1257.459227][ T352] ? do_raw_spin_lock+0x121/0x290
[ 1257.481653][ T352] worker_thread+0x82/0xb30
[ 1257.501638][ T352] ? __kthread_parkme+0xcc/0x1a0
[ 1257.523429][ T352] ? process_one_work+0x17c0/0x17c0
[ 1257.546492][ T352] kthread+0x358/0x420
[ 1257.564555][ T352] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 1257.593534][ T352] ret_from_fork+0x22/0x30
[ 1257.614307][ T352]
[ 1257.614307][ T352] Showing all locks held in the system:
[ 1257.653989][ T352] 2 locks held by systemd/1:
[ 1257.674581][ T352] #0: ffff88a033931658 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1257.719434][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1257.764101][ T352] 2 locks held by kworker/u97:9/270:
[ 1257.787648][ T352] #0: ffff88a035e6cd38 ((wq_completion)xfs-cil/dm-0){+.+.}-{0:0}, at: process_one_work+0x7c8/0x17c0
[ 1257.837506][ T352] #1: ffffc900077d7e00 ((work_completion)(&cil->xc_push_work)){+.+.}-{0:0}, at: process_one_work+0x7fc/0x17c0
[ 1257.892615][ T352] 1 lock held by khungtaskd/352:
[ 1257.915428][ T352] #0: ffffffff905887e0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.29+0x0/0x30
[ 1257.964163][ T352] 1 lock held by khugepaged/360:
[ 1257.986155][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.030582][ T352] 1 lock held by kswapd0/366:
[ 1258.051264][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 1258.094152][ T352] 1 lock held by kswapd1/367:
[ 1258.117184][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 1258.161954][ T352] 1 lock held by kswapd2/368:
[ 1258.182634][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 1258.224967][ T352] 1 lock held by kswapd3/369:
[ 1258.245638][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 1258.287356][ T352] 1 lock held by systemd-journal/896:
[ 1258.311276][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.357096][ T352] 1 lock held by systemd-udevd/934:
[ 1258.380380][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.425357][ T352] 1 lock held by gmain/1221:
[ 1258.445231][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.489822][ T352] 1 lock held by irqbalance/1163:
[ 1258.512203][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.556285][ T352] 1 lock held by sshd/1232:
[ 1258.576151][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.622963][ T352] 5 locks held by systemd-logind/1255:
[ 1258.650245][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.696041][ T352] #1: ffffffff90676850 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
[ 1258.738489][ T352] #2: ffff88a005f040e0 (&type->s_umount_key#28){++++}-{3:3}, at: trylock_super+0x11/0xb0
[ 1258.783155][ T352] #3: ffff88a00cd3d228 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
[ 1258.833254][ T352] #4: ffff8897c3b57828 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
[ 1258.880242][ T352] 2 locks held by agetty/1264:
[ 1258.901291][ T352] #0: ffff889ff8159378 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: __do_execve_file+0x2f2/0x1a50
[ 1258.947820][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1258.992410][ T352] 1 lock held by crond/1509:
[ 1259.012721][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.057395][ T352] 1 lock held by in:imjournal/1523:
[ 1259.080494][ T352] #0: ffff889fdeaee7c0 (&(&ip->i_mmaplock)->mr_lock){++++}-{3:3}, at: xfs_ilock+0xf3/0x370
[ 1259.127855][ T352] 3 locks held by restraintd/1520:
[ 1259.153828][ T352] #0: ffff88a00bf4ea30 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x9c/0xb0
[ 1259.194996][ T352] #1: ffff88a005f04430 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x371/0x4d0
[ 1259.234636][ T352] #2: ffff88838caed050 (&sb->s_type->i_mutex_key#12){++++}-{3:3}, at: xfs_ilock+0x27c/0x370
[ 1259.280335][ T352] 1 lock held by runtest.sh/1567:
[ 1259.302796][ T352] #0: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.347545][ T352] 2 locks held by random/1768:
[ 1259.369261][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.414349][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.458977][ T352] 2 locks held by random/1769:
[ 1259.480409][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.525545][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.571368][ T352] 2 locks held by random/1770:
[ 1259.592131][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.641941][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.689771][ T352] 2 locks held by random/1771:
[ 1259.710979][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.757004][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.801644][ T352] 2 locks held by random/1772:
[ 1259.822791][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.867692][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1259.912339][ T352] 2 locks held by random/1773:
[ 1259.933571][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1259.980366][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1260.026630][ T352] 5 locks held by random/1774:
[ 1260.048812][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1260.094328][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1260.138919][ T352] #2: ffffffff90676850 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xb8/0x4b0
[ 1260.184545][ T352] #3: ffff88a005f040e0 (&type->s_umount_key#28){++++}-{3:3}, at: trylock_super+0x11/0xb0
[ 1260.228764][ T352] #4: ffff88a00cd3d228 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x68c/0xb00
[ 1260.278792][ T352] 2 locks held by random/1775:
[ 1260.300785][ T352] #0: ffff888fbda3ce58 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x167/0x90d
[ 1260.348139][ T352] #1: ffffffff906d07c0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.64+0x0/0x30
[ 1260.393623][ T352] 1 lock held by awk/1785:
[ 1260.413577][ T352] #0: ffff8890186467c0 (&(&ip->i_mmaplock)->mr_lock){++++}-{3:3}, at: xfs_ilock+0xf3/0x370
[ 1260.458941][ T352] 2 locks held by kworker/47:0/1815:
[ 1260.482420][ T352] #0: ffff88a005d8b938 ((wq_completion)xfs-sync/dm-0){+.+.}-{0:0}, at: process_one_work+0x7c8/0x17c0
[ 1260.531638][ T352] #1: ffffc9000a64fe00 ((work_completion)(&(&log->l_work)->work)){+.+.}-{0:0}, at: process_one_work+0x7fc/0x17c0
[ 1260.585778][ T352]
[ 1260.595753][ T352] =============================================
[ 1260.595753][ T352]
> ---
> block/blk-flush.c | 17 ++++++-----------
> block/blk-mq-tag.h | 12 ------------
> block/blk-mq.c | 30 ++++++++++++++----------------
> block/blk.h | 5 -----
> 4 files changed, 20 insertions(+), 44 deletions(-)
>
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index 21108a550fbf..e756db088d84 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -236,12 +236,10 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
> error = fq->rq_status;
>
> hctx = flush_rq->mq_hctx;
> - if (!q->elevator) {
> - blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
> - flush_rq->tag = -1;
> - } else {
> - flush_rq->internal_tag = -1;
> - }
> + if (!q->elevator)
> + flush_rq->tag = BLK_MQ_NO_TAG;
> + else
> + flush_rq->internal_tag = BLK_MQ_NO_TAG;
>
> running = &fq->flush_queue[fq->flush_running_idx];
> BUG_ON(fq->flush_pending_idx == fq->flush_running_idx);
> @@ -315,13 +313,10 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
> flush_rq->mq_ctx = first_rq->mq_ctx;
> flush_rq->mq_hctx = first_rq->mq_hctx;
>
> - if (!q->elevator) {
> - fq->orig_rq = first_rq;
> + if (!q->elevator)
> flush_rq->tag = first_rq->tag;
> - blk_mq_tag_set_rq(flush_rq->mq_hctx, first_rq->tag, flush_rq);
> - } else {
> + else
> flush_rq->internal_tag = first_rq->internal_tag;
> - }
>
> flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
> flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 3945c7f5b944..b1acac518c4e 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -101,18 +101,6 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> return atomic_read(&hctx->nr_active) < depth;
> }
>
> -/*
> - * This helper should only be used for flush request to share tag
> - * with the request cloned from, and both the two requests can't be
> - * in flight at the same time. The caller has to make sure the tag
> - * can't be freed.
> - */
> -static inline void blk_mq_tag_set_rq(struct blk_mq_hw_ctx *hctx,
> - unsigned int tag, struct request *rq)
> -{
> - hctx->tags->rqs[tag] = rq;
> -}
> -
> static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
> unsigned int tag)
> {
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e7a2d2d44b51..a20647139c95 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -277,26 +277,20 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
> {
> struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
> struct request *rq = tags->static_rqs[tag];
> - req_flags_t rq_flags = 0;
>
> if (data->q->elevator) {
> rq->tag = BLK_MQ_NO_TAG;
> rq->internal_tag = tag;
> } else {
> - if (data->hctx->flags & BLK_MQ_F_TAG_SHARED) {
> - rq_flags = RQF_MQ_INFLIGHT;
> - atomic_inc(&data->hctx->nr_active);
> - }
> rq->tag = tag;
> rq->internal_tag = BLK_MQ_NO_TAG;
> - data->hctx->tags->rqs[rq->tag] = rq;
> }
>
> /* csd/requeue_work/fifo_time is initialized before use */
> rq->q = data->q;
> rq->mq_ctx = data->ctx;
> rq->mq_hctx = data->hctx;
> - rq->rq_flags = rq_flags;
> + rq->rq_flags = 0;
> rq->cmd_flags = data->cmd_flags;
> if (data->flags & BLK_MQ_REQ_PREEMPT)
> rq->rq_flags |= RQF_PREEMPT;
> @@ -1127,9 +1121,10 @@ static bool __blk_mq_get_driver_tag(struct request *rq)
> {
> struct sbitmap_queue *bt = &rq->mq_hctx->tags->bitmap_tags;
> unsigned int tag_offset = rq->mq_hctx->tags->nr_reserved_tags;
> - bool shared = blk_mq_tag_busy(rq->mq_hctx);
> int tag;
>
> + blk_mq_tag_busy(rq->mq_hctx);
> +
> if (blk_mq_tag_is_reserved(rq->mq_hctx->sched_tags, rq->internal_tag)) {
> bt = &rq->mq_hctx->tags->breserved_tags;
> tag_offset = 0;
> @@ -1142,19 +1137,22 @@ static bool __blk_mq_get_driver_tag(struct request *rq)
> return false;
>
> rq->tag = tag + tag_offset;
> - if (shared) {
> - rq->rq_flags |= RQF_MQ_INFLIGHT;
> - atomic_inc(&rq->mq_hctx->nr_active);
> - }
> - rq->mq_hctx->tags->rqs[rq->tag] = rq;
> return true;
> }
>
> static bool blk_mq_get_driver_tag(struct request *rq)
> {
> - if (rq->tag != BLK_MQ_NO_TAG)
> - return true;
> - return __blk_mq_get_driver_tag(rq);
> + struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
> +
> + if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_get_driver_tag(rq))
> + return false;
> +
> + if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
> + rq->rq_flags |= RQF_MQ_INFLIGHT;
> + atomic_inc(&hctx->nr_active);
> + }
> + hctx->tags->rqs[rq->tag] = rq;
> + return true;
> }
>
> static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
> diff --git a/block/blk.h b/block/blk.h
> index 41a50880c94e..0184a31fe4df 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -25,11 +25,6 @@ struct blk_flush_queue {
> struct list_head flush_data_in_flight;
> struct request *flush_rq;
>
> - /*
> - * flush_rq shares tag with this rq, both can't be active
> - * at the same time
> - */
> - struct request *orig_rq;
> struct lock_class_key key;
> spinlock_t mq_flush_lock;
> };
> --
> 2.25.2
>
Thread overview: 7+ messages
2020-06-30 14:03 [PATCH V3 0/3] blk-mq: driver tag related cleanup Ming Lei
2020-06-30 14:03 ` [PATCH V3 1/3] blk-mq: move blk_mq_get_driver_tag into blk-mq.c Ming Lei
2020-06-30 14:03 ` [PATCH V3 2/3] blk-mq: move blk_mq_put_driver_tag() " Ming Lei
2020-06-30 14:03 ` [PATCH V3 3/3] blk-mq: centralise related handling into blk_mq_get_driver_tag Ming Lei
2020-07-02 4:37 ` Qian Cai [this message]
2020-07-02 6:20 ` Ming Lei
2020-06-30 18:58 ` [PATCH V3 0/3] blk-mq: driver tag related cleanup Jens Axboe