* multi-threads libvmmalloc fork test hang
@ 2016-10-27 11:22 ` Xiong Zhou
  0 siblings, 0 replies; 14+ messages in thread
From: Xiong Zhou @ 2016-10-27 11:22 UTC (permalink / raw)
  To: linux-nvdimm, jack; +Cc: linux-fsdevel, linux-kernel


# description

The nvml test suite's vmmalloc_fork test hangs.

$ ps -eo stat,comm  | grep vmma
S+   vmmalloc_fork
Sl+  vmmalloc_fork
Z+   vmmalloc_fork <defunct>
Sl+  vmmalloc_fork
Z+   vmmalloc_fork <defunct>
Z+   vmmalloc_fork <defunct>
Sl+  vmmalloc_fork
Z+   vmmalloc_fork <defunct>
Z+   vmmalloc_fork <defunct>
Z+   vmmalloc_fork <defunct>

dmesg:

[  250.499097] INFO: task vmmalloc_fork:9805 blocked for more than 120 seconds.
[  250.530667]       Not tainted 4.9.09fe68ca+ #27
[  250.550901] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.585752] vmmalloc_fork   D[  250.598362]  ffffffff8171813c     0  9805   9765 0x00000080
[  250.623445]  ffff88075dc68f80[  250.636052]  0000000000000000 ffff88076058db00 ffff88017c5b0000 ffff880763b19340[  250.668510]  ffffc9000fe1bbb0 ffffffff8171813c ffffc9000fe1bc20 ffffc9000fe1bbe0[  250.704220]  ffffffff82248898 ffff88076058db00 ffffffff82248898Call Trace:
[  250.738382]  [<ffffffff8171813c>] ? __schedule+0x21c/0x6a0
[  250.763404]  [<ffffffff817185f6>] schedule+0x36/0x80
[  250.786177]  [<ffffffff81284471>] get_unlocked_mapping_entry+0xc1/0x120
[  250.815869]  [<ffffffff81283810>] ? iomap_dax_rw+0x110/0x110
[  250.841350]  [<ffffffff81284c0a>] grab_mapping_entry+0x4a/0x220
[  250.868442]  [<ffffffff812851e9>] iomap_dax_fault+0xa9/0x3b0
[  250.894437]  [<ffffffffa02b15fe>] xfs_filemap_fault+0xce/0xf0 [xfs]
[  250.922805]  [<ffffffff811d3159>] __do_fault+0x79/0x100
[  250.947035]  [<ffffffff811d7a2b>] do_fault+0x49b/0x690
[  250.970964]  [<ffffffffa02b146c>] ? xfs_filemap_pmd_fault+0x9c/0x160 [xfs]
[  251.001812]  [<ffffffff811d94ba>] handle_mm_fault+0x61a/0xa50
[  251.027736]  [<ffffffff8106c3da>] __do_page_fault+0x22a/0x4a0
[  251.053700]  [<ffffffff8106c680>] do_page_fault+0x30/0x80
[  251.077962]  [<ffffffff81003b55>] ? do_syscall_64+0x175/0x180
[  251.103835]  [<ffffffff8171e208>] page_fault+0x28/0x30


# kernel versions:

v4.6 pass in seconds
v4.7 hang
v4.9-rc1 hang
Linus tree to commit 9fe68ca hang

bisect points to 
 first bad commit: [ac401cc782429cc8560ce4840b1405d603740917] dax: New fault locking

v4.7 with these 3 commits reverted pass:
4d9a2c8 - Jan Kara, 6 months ago : dax: Remove i_mmap_lock protection
bc2466e - Jan Kara, 6 months ago : dax: Use radix tree entry lock to protect cow faults
ac401cc - Jan Kara, 6 months ago : dax: New fault locking

# nvml version:
https://github.com/pmem/nvml.git
to commit:
  feab4d6f65102139ce460890c898fcad09ce20ae

# How reproducible:
always

# Test steps:

<git clone and pmem0 setup>

$ cd nvml
$ make install -j64

$ cat > src/test/testconfig.sh <<EOF
PMEM_FS_DIR=/daxmnt
NON_PMEM_FS_DIR=/tmp
EOF

$ mkfs.xfs /dev/pmem0
$ mkdir -p /daxmnt/
$ mount -o dax /dev/pmem0 /daxmnt/

$ make -C src/test/vmmalloc_fork/ TEST_TIME=60m clean
$ make -C src/test/vmmalloc_fork/ TEST_TIME=60m check
$ umount /daxmnt

* Re: multi-threads libvmmalloc fork test hang
  2016-10-27 11:22 ` Xiong Zhou
@ 2016-10-27 13:37     ` Jan Kara
  -1 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2016-10-27 13:37 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: linux-nvdimm, jack, linux-fsdevel, linux-kernel

Thanks for the report. I'll try to reproduce it...

								Honza

On Thu 27-10-16 19:22:30, Xiong Zhou wrote:
> # description
> 
> nvml test suite vmmalloc_fork test hang.
> 
> $ ps -eo stat,comm  | grep vmma
> S+   vmmalloc_fork
> Sl+  vmmalloc_fork
> Z+   vmmalloc_fork <defunct>
> Sl+  vmmalloc_fork
> Z+   vmmalloc_fork <defunct>
> Z+   vmmalloc_fork <defunct>
> Sl+  vmmalloc_fork
> Z+   vmmalloc_fork <defunct>
> Z+   vmmalloc_fork <defunct>
> Z+   vmmalloc_fork <defunct>
> 
> dmesg:
> 
> [  250.499097] INFO: task vmmalloc_fork:9805 blocked for more than 120 seconds.
> [  250.530667]       Not tainted 4.9.09fe68ca+ #27
> [  250.550901] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  250.585752] vmmalloc_fork   D[  250.598362]  ffffffff8171813c     0  9805   9765 0x00000080
> [  250.623445]  ffff88075dc68f80[  250.636052]  0000000000000000 ffff88076058db00 ffff88017c5b0000 ffff880763b19340[  250.668510]  ffffc9000fe1bbb0 ffffffff8171813c ffffc9000fe1bc20 ffffc9000fe1bbe0[  250.704220]  ffffffff82248898 ffff88076058db00 ffffffff82248898Call Trace:
> [  250.738382]  [<ffffffff8171813c>] ? __schedule+0x21c/0x6a0
> [  250.763404]  [<ffffffff817185f6>] schedule+0x36/0x80
> [  250.786177]  [<ffffffff81284471>] get_unlocked_mapping_entry+0xc1/0x120
> [  250.815869]  [<ffffffff81283810>] ? iomap_dax_rw+0x110/0x110
> [  250.841350]  [<ffffffff81284c0a>] grab_mapping_entry+0x4a/0x220
> [  250.868442]  [<ffffffff812851e9>] iomap_dax_fault+0xa9/0x3b0
> [  250.894437]  [<ffffffffa02b15fe>] xfs_filemap_fault+0xce/0xf0 [xfs]
> [  250.922805]  [<ffffffff811d3159>] __do_fault+0x79/0x100
> [  250.947035]  [<ffffffff811d7a2b>] do_fault+0x49b/0x690
> [  250.970964]  [<ffffffffa02b146c>] ? xfs_filemap_pmd_fault+0x9c/0x160 [xfs]
> [  251.001812]  [<ffffffff811d94ba>] handle_mm_fault+0x61a/0xa50
> [  251.027736]  [<ffffffff8106c3da>] __do_page_fault+0x22a/0x4a0
> [  251.053700]  [<ffffffff8106c680>] do_page_fault+0x30/0x80
> [  251.077962]  [<ffffffff81003b55>] ? do_syscall_64+0x175/0x180
> [  251.103835]  [<ffffffff8171e208>] page_fault+0x28/0x30
> 
> 
> # kernel versions:
> 
> v4.6 pass in seconds
> v4.7 hang
> v4.9-rc1 hang
> Linus tree to commit 9fe68ca hang
> 
> bisect points to 
>  first bad commit: [ac401cc782429cc8560ce4840b1405d603740917] dax: New fault locking
> 
> v4.7 with these 3 commits reverted pass:
> 4d9a2c8 - Jan Kara, 6 months ago : dax: Remove i_mmap_lock protection
> bc2466e - Jan Kara, 6 months ago : dax: Use radix tree entry lock to protect cow faults
> ac401cc - Jan Kara, 6 months ago : dax: New fault locking
> 
> # nvml version:
> https://github.com/pmem/nvml.git
> to commit:
>   feab4d6f65102139ce460890c898fcad09ce20ae
> 
> # How reproducible:
> always
> 
> # Test steps:
> 
> <git clone and pmem0 setup>
> 
> $cd nvml
> $make install -j64
> 
> $cat > src/test/testconfig.sh <<EOF
> PMEM_FS_DIR=/daxmnt
> NON_PMEM_FS_DIR=/tmp
> EOF
> 
> $mkfs.xfs /dev/pmem0
> $mkdir -p /daxmnt/
> $mount -o dax /dev/pmem0 /daxmnt/
> 
> $make -C src/test/vmmalloc_fork/ TEST_TIME=60m clean
> $make -C src/test/vmmalloc_fork/ TEST_TIME=60m check
> $umount /daxmnt
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: multi-threads libvmmalloc fork test hang
  2016-10-27 13:37     ` Jan Kara
@ 2017-01-03 16:58         ` Ross Zwisler
  -1 siblings, 0 replies; 14+ messages in thread
From: Ross Zwisler @ 2017-01-03 16:58 UTC (permalink / raw)
  To: Jan Kara; +Cc: Xiong Zhou, linux-fsdevel, linux-nvdimm, linux-kernel

On Thu, Oct 27, 2016 at 03:37:20PM +0200, Jan Kara wrote:
> Thanks for the report. I'll try to reproduce it...
> 
> 								Honza
> 
> On Thu 27-10-16 19:22:30, Xiong Zhou wrote:
> > # description
> > 
> > nvml test suite vmmalloc_fork test hang.
> > 
> > $ ps -eo stat,comm  | grep vmma
> > S+   vmmalloc_fork
> > Sl+  vmmalloc_fork
> > Z+   vmmalloc_fork <defunct>
> > Sl+  vmmalloc_fork
> > Z+   vmmalloc_fork <defunct>
> > Z+   vmmalloc_fork <defunct>
> > Sl+  vmmalloc_fork
> > Z+   vmmalloc_fork <defunct>
> > Z+   vmmalloc_fork <defunct>
> > Z+   vmmalloc_fork <defunct>
> > 
> > dmesg:
> > 
> > [  250.499097] INFO: task vmmalloc_fork:9805 blocked for more than 120 seconds.
> > [  250.530667]       Not tainted 4.9.09fe68ca+ #27
> > [  250.550901] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  250.585752] vmmalloc_fork   D[  250.598362]  ffffffff8171813c     0  9805   9765 0x00000080
> > [  250.623445]  ffff88075dc68f80[  250.636052]  0000000000000000 ffff88076058db00 ffff88017c5b0000 ffff880763b19340[  250.668510]  ffffc9000fe1bbb0 ffffffff8171813c ffffc9000fe1bc20 ffffc9000fe1bbe0[  250.704220]  ffffffff82248898 ffff88076058db00 ffffffff82248898Call Trace:
> > [  250.738382]  [<ffffffff8171813c>] ? __schedule+0x21c/0x6a0
> > [  250.763404]  [<ffffffff817185f6>] schedule+0x36/0x80
> > [  250.786177]  [<ffffffff81284471>] get_unlocked_mapping_entry+0xc1/0x120
> > [  250.815869]  [<ffffffff81283810>] ? iomap_dax_rw+0x110/0x110
> > [  250.841350]  [<ffffffff81284c0a>] grab_mapping_entry+0x4a/0x220
> > [  250.868442]  [<ffffffff812851e9>] iomap_dax_fault+0xa9/0x3b0
> > [  250.894437]  [<ffffffffa02b15fe>] xfs_filemap_fault+0xce/0xf0 [xfs]
> > [  250.922805]  [<ffffffff811d3159>] __do_fault+0x79/0x100
> > [  250.947035]  [<ffffffff811d7a2b>] do_fault+0x49b/0x690
> > [  250.970964]  [<ffffffffa02b146c>] ? xfs_filemap_pmd_fault+0x9c/0x160 [xfs]
> > [  251.001812]  [<ffffffff811d94ba>] handle_mm_fault+0x61a/0xa50
> > [  251.027736]  [<ffffffff8106c3da>] __do_page_fault+0x22a/0x4a0
> > [  251.053700]  [<ffffffff8106c680>] do_page_fault+0x30/0x80
> > [  251.077962]  [<ffffffff81003b55>] ? do_syscall_64+0x175/0x180
> > [  251.103835]  [<ffffffff8171e208>] page_fault+0x28/0x30
> > 
> > 
> > # kernel versions:
> > 
> > v4.6 pass in seconds
> > v4.7 hang
> > v4.9-rc1 hang
> > Linus tree to commit 9fe68ca hang
> > 
> > bisect points to 
> >  first bad commit: [ac401cc782429cc8560ce4840b1405d603740917] dax: New fault locking
> > 
> > v4.7 with these 3 commits reverted pass:
> > 4d9a2c8 - Jan Kara, 6 months ago : dax: Remove i_mmap_lock protection
> > bc2466e - Jan Kara, 6 months ago : dax: Use radix tree entry lock to protect cow faults
> > ac401cc - Jan Kara, 6 months ago : dax: New fault locking
> > 
> > # nvml version:
> > https://github.com/pmem/nvml.git
> > to commit:
> >   feab4d6f65102139ce460890c898fcad09ce20ae
> > 
> > # How reproducible:
> > always
> > 
> > # Test steps:
> > 
> > <git clone and pmem0 setup>
> > 
> > $cd nvml
> > $make install -j64
> > 
> > $cat > src/test/testconfig.sh <<EOF
> > PMEM_FS_DIR=/daxmnt
> > NON_PMEM_FS_DIR=/tmp
> > EOF
> > 
> > $mkfs.xfs /dev/pmem0
> > $mkdir -p /daxmnt/
> > $mount -o dax /dev/pmem0 /daxmnt/
> > 
> > $make -C src/test/vmmalloc_fork/ TEST_TIME=60m clean
> > $make -C src/test/vmmalloc_fork/ TEST_TIME=60m check
> > $umount /daxmnt

As I mentioned in the other thread I was able to reproduce this issue with
v4.10-rc2, and am currently debugging.

* [PATCH] dax: fix deadlock with DAX 4k holes
  2016-10-27 11:22 ` Xiong Zhou
  (?)
@ 2017-01-03 21:36   ` Ross Zwisler
  -1 siblings, 0 replies; 14+ messages in thread
From: Ross Zwisler @ 2017-01-03 21:36 UTC (permalink / raw)
  To: Xiong Zhou, stable, linux-kernel
  Cc: Ross Zwisler, Andrew Morton, Christoph Hellwig, Dan Williams,
	Dave Chinner, Dave Hansen, Jan Kara, linux-mm, linux-nvdimm

Currently in DAX if we have three read faults on the same hole address we
can end up with the following:

Thread 0		Thread 1		Thread 2
--------		--------		--------
dax_iomap_fault
 grab_mapping_entry
  lock_slot
   <locks empty DAX entry>

  			dax_iomap_fault
			 grab_mapping_entry
			  get_unlocked_mapping_entry
			   <sleeps on empty DAX entry>

						dax_iomap_fault
						 grab_mapping_entry
						  get_unlocked_mapping_entry
						   <sleeps on empty DAX entry>
  dax_load_hole
   find_or_create_page
   ...
    page_cache_tree_insert
     dax_wake_mapping_entry_waiter
      <wakes one sleeper>
     __radix_tree_replace
      <swaps empty DAX entry with 4k zero page>

			<wakes>
			get_page
			lock_page
			...
			put_locked_mapping_entry
			unlock_page
			put_page

						<sleeps forever on the DAX
						 wait queue>

The crux of the problem is that once we insert a 4k zero page, all locking
from then on is done in terms of that 4k zero page and any additional
threads sleeping on the empty DAX entry will never be woken.  Fix this by
waking all sleepers when we replace the DAX radix tree entry with a 4k zero
page.  This will allow all sleeping threads to successfully transition from
locking based on the DAX empty entry to locking on the 4k zero page.

With the test case reported by Xiong this happens very regularly in my test
setup, with some runs resulting in 9+ threads in this deadlocked state.
With this fix I've been able to run that same test dozens of times in a
loop without issue.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Xiong Zhou <xzhou@redhat.com>
Fixes: commit ac401cc78242 ("dax: New fault locking")
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org # 4.7+
---

This issue exists as far back as v4.7, and I was easily able to reproduce it
with v4.7 using the same test.

Unfortunately this patch won't apply cleanly to the stable trees, but the
change is very simple and should be easy to replicate by hand.  Please ping
me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.

---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d0e4d10..b772a33 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
 				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
 			/* Wakeup waiters for exceptional entry lock */
 			dax_wake_mapping_entry_waiter(mapping, page->index, p,
-						      false);
+						      true);
 		}
 	}
 	__radix_tree_replace(&mapping->page_tree, node, slot, page,
-- 
2.7.4
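
For readers who want to see the wake-one versus wake-all difference outside
the kernel, here is a rough userspace analogy.  It is only a sketch, not the
kernel code: a pthread condition variable stands in for the DAX entry wait
queue, pthread_cond_signal()/pthread_cond_broadcast() stand in for the
false/true argument of dax_wake_mapping_entry_waiter(), and the names
run_scenario(), waiter(), entry_locked, sleeping and stuck are made up for
illustration.  A one-second timeout is used so the "sleeps forever" case can
be observed without actually hanging.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

#define NWAITERS 2	/* Thread 1 and Thread 2 in the diagram */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t waitq = PTHREAD_COND_INITIALIZER;	/* models the DAX wait queue */
static int entry_locked;	/* 1 while "Thread 0" holds the empty entry */
static int sleeping;		/* waiters that have gone to sleep on waitq */
static int stuck;		/* waiters nobody woke before the timeout */

/* Models a fault that finds the entry locked and sleeps on the wait queue. */
static void *waiter(void *arg)
{
	struct timespec deadline;

	(void)arg;
	pthread_mutex_lock(&lock);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 1;
	sleeping++;
	while (entry_locked) {
		if (pthread_cond_timedwait(&waitq, &lock, &deadline) == ETIMEDOUT) {
			stuck++;	/* in the kernel this thread would sleep forever */
			break;
		}
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

/*
 * Models "Thread 0" replacing the entry and waking sleepers.
 * wake_all == 0 wakes a single sleeper, wake_all != 0 wakes all of them.
 * Returns the number of waiters that were never woken.
 */
static int run_scenario(int wake_all)
{
	pthread_t tid[NWAITERS];
	int i, done = 0;

	entry_locked = 1;
	sleeping = 0;
	stuck = 0;
	for (i = 0; i < NWAITERS; i++)
		pthread_create(&tid[i], NULL, waiter, NULL);

	while (!done) {
		pthread_mutex_lock(&lock);
		if (sleeping == NWAITERS) {
			/* Every waiter is asleep: replace the entry and wake. */
			entry_locked = 0;
			if (wake_all)
				pthread_cond_broadcast(&waitq);
			else
				pthread_cond_signal(&waitq);
			done = 1;
		}
		pthread_mutex_unlock(&lock);
		if (!done)
			sched_yield();
	}

	for (i = 0; i < NWAITERS; i++)
		pthread_join(tid[i], NULL);
	return stuck;
}
```

Calling run_scenario(0) models the pre-patch behaviour, where one of the two
sleepers is left behind; run_scenario(1) models the patched behaviour, where
none are.  The analogy is loose: POSIX permits spurious wakeups and lets
pthread_cond_signal() wake more than one thread, so the count of stuck
waiters is what one would typically observe rather than a guarantee.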

* Re: [PATCH] dax: fix deadlock with DAX 4k holes
  2017-01-03 21:36   ` Ross Zwisler
  (?)
@ 2017-01-04  7:18     ` Jan Kara
  -1 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2017-01-04  7:18 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Jan Kara, linux-nvdimm, Dave Chinner, linux-kernel, stable,
	Christoph Hellwig, linux-mm, Dave Hansen, Andrew Morton

On Tue 03-01-17 14:36:05, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0		Thread 1		Thread 2
> --------		--------		--------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>   			dax_iomap_fault
> 			 grab_mapping_entry
> 			  get_unlocked_mapping_entry
> 			   <sleeps on empty DAX entry>
> 
> 						dax_iomap_fault
> 						 grab_mapping_entry
> 						  get_unlocked_mapping_entry
> 						   <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
> 			<wakes>
> 			get_page
> 			lock_page
> 			...
> 			put_locked_mapping_entry
> 			unlock_page
> 			put_page
> 
> 						<sleeps forever on the DAX
> 						 wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Xiong Zhou <xzhou@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org # 4.7+

Ah, very good catch. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

I wonder why I was not able to reproduce this... Probably the timing didn't
work out right on my test machine.

								Honza

> ---
> 
> This issue exists as far back as v4.7, and I was easily able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
>  				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>  			/* Wakeup waiters for exceptional entry lock */
>  			dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -						      false);
> +						      true);
>  		}
>  	}
>  	__radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] dax: fix deadlock with DAX 4k holes
  2017-01-03 21:36   ` Ross Zwisler
@ 2017-01-04 14:26     ` Xiong Zhou
  -1 siblings, 0 replies; 14+ messages in thread
From: Xiong Zhou @ 2017-01-04 14:26 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Xiong Zhou, stable, linux-kernel, Andrew Morton,
	Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Jan Kara, linux-mm, linux-nvdimm

On Tue, Jan 03, 2017 at 02:36:05PM -0700, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0		Thread 1		Thread 2
> --------		--------		--------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>   			dax_iomap_fault
> 			 grab_mapping_entry
> 			  get_unlocked_mapping_entry
> 			   <sleeps on empty DAX entry>
> 
> 						dax_iomap_fault
> 						 grab_mapping_entry
> 						  get_unlocked_mapping_entry
> 						   <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
> 			<wakes>
> 			get_page
> 			lock_page
> 			...
> 			put_locked_mapping_entry
> 			unlock_page
> 			put_page
> 
> 						<sleeps forever on the DAX
> 						 wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Xiong Zhou <xzhou@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org # 4.7+
> ---

I tested this patch: the reported hang no longer reproduces, and the
regression tests pass.

Great job!

> 
> This issue exists as far back as v4.7, and I was easily able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
>  				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>  			/* Wakeup waiters for exceptional entry lock */
>  			dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -						      false);
> +						      true);
>  		}
>  	}
>  	__radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-01-04 14:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
2016-10-27 11:22 multi-threads libvmmalloc fork test hang Xiong Zhou
2016-10-27 11:22 ` Xiong Zhou
     [not found] ` <20161027112230.wsumgs62fqdxt3sc-obHvtwIU90hQcClZ3XN9yxcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2016-10-27 13:37   ` Jan Kara
2016-10-27 13:37     ` Jan Kara
     [not found]     ` <20161027133720.GF19743-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-01-03 16:58       ` Ross Zwisler
2017-01-03 16:58         ` Ross Zwisler
2017-01-03 21:36 ` [PATCH] dax: fix deadlock with DAX 4k holes Ross Zwisler
2017-01-03 21:36   ` Ross Zwisler
2017-01-03 21:36   ` Ross Zwisler
2017-01-04  7:18   ` Jan Kara
2017-01-04  7:18     ` Jan Kara
2017-01-04  7:18     ` Jan Kara
2017-01-04 14:26   ` Xiong Zhou
2017-01-04 14:26     ` Xiong Zhou
