linux-kernel.vger.kernel.org archive mirror
* LTP rwtest01 blocks on DAX mountpoint
@ 2016-12-24 11:07 Xiong Zhou
  2016-12-30  9:33 ` Xiong Zhou
  0 siblings, 1 reply; 11+ messages in thread
From: Xiong Zhou @ 2016-12-24 11:07 UTC (permalink / raw)
  To: jack, linux-nvdimm, linux-fsdevel; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

Hi lists,

Since around the 20161129 tag, LTP rwtest01 blocks on a DAX mountpoint
on the linux-next tree, and now on Linus's tree as well.

Normally the rwtest01 subcase finishes in a few minutes; now it keeps
running for hours on a DAX mountpoint, on both ext4 and xfs. Ctrl+C
can interrupt it.

It is always reproducible and blocks the tests that follow.

It does not happen when mounting without the dax option.
It does not happen on v4.9.

Bisect points to:

commit 4b4bb46d00b386e1c972890dc5785a7966eaa9c0
Author: Jan Kara <jack@suse.cz>
Date:   Wed Dec 14 15:07:53 2016 -0800

    dax: clear dirty entry tags on cache flush


Reverting this commit on top of Linus's tree "fixes" this issue.

Reproducer:

sh-4.2# cat rwt
rwtest01 export LTPROOT; rwtest -N rwtest01 -c -q -i 60s  -f sync 10%25000:$TMPDIR/rw-sync-$$
sh-4.2# 
mkfs.xfs /dev/pmem0p1
mount -o dax /dev/pmem0p1 /daxmnt && \
/opt/ltp/runltp -q -d /daxmnt -f rwt -p -b /dev/pmem0p2 -B xfs
umount /daxmnt

Bisect log is attached.

Thanks,
Xiong

[-- Attachment #2: bisect2 --]
[-- Type: text/plain, Size: 2814 bytes --]

git bisect start
# bad: [50f6584e4c626b8fa39edb66f33fec27bab3996c] Merge tag 'leds_for_4.10_email_update' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
git bisect bad 50f6584e4c626b8fa39edb66f33fec27bab3996c
# good: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
git bisect good 69973b830859bc6529a7a0468ba0d80ee5117826
# good: [5266e70335dac35c35b5ca9cea4251c1389d4a68] Merge tag 'tty-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
git bisect good 5266e70335dac35c35b5ca9cea4251c1389d4a68
# bad: [6df8b74b1720db1133ace0861cb6721bfe57819a] Merge tag 'devicetree-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect bad 6df8b74b1720db1133ace0861cb6721bfe57819a
# good: [f4000cd99750065d5177555c0a805c97174d1b9f] Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
git bisect good f4000cd99750065d5177555c0a805c97174d1b9f
# bad: [e1e14ab8411df344a17687821f8f78f0a1e73cbb] radix tree test suite: delete unused rcupdate.c
git bisect bad e1e14ab8411df344a17687821f8f78f0a1e73cbb
# good: [f5b893c947151d424a4ab55ea3a8544b81974b31] scsi: qla4xxx: switch to pci_alloc_irq_vectors
git bisect good f5b893c947151d424a4ab55ea3a8544b81974b31
# good: [b9f98bd4034a3196ff068eb0fa376c5f41077480] Merge tag 'mmc-v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
git bisect good b9f98bd4034a3196ff068eb0fa376c5f41077480
# good: [4d1f0fb096aedea7bb5489af93498a82e467c480] kernel/watchdog: use nmi registers snapshot in hardlockup handler
git bisect good 4d1f0fb096aedea7bb5489af93498a82e467c480
# good: [5b56d49fc31dbb0487e14ead790fc81ca9fb2c99] mm: add locked parameter to get_user_pages_remote()
git bisect good 5b56d49fc31dbb0487e14ead790fc81ca9fb2c99
# bad: [cfa40bcfd6fed7010b1633bf127ed8571d3b607e] radix tree test suite: benchmark for iterator
git bisect bad cfa40bcfd6fed7010b1633bf127ed8571d3b607e
# good: [a41b70d6dfc28b9e1a17c2a9f3181c2b614bfd54] mm: use vmf->page during WP faults
git bisect good a41b70d6dfc28b9e1a17c2a9f3181c2b614bfd54
# bad: [4b4bb46d00b386e1c972890dc5785a7966eaa9c0] dax: clear dirty entry tags on cache flush
git bisect bad 4b4bb46d00b386e1c972890dc5785a7966eaa9c0
# good: [a19e25536ed3a20845f642ce531e10c27fb2add5] mm: change return values of finish_mkwrite_fault()
git bisect good a19e25536ed3a20845f642ce531e10c27fb2add5
# good: [a6abc2c0e77b16480f4d2c1eb7925e5287ae1526] dax: make cache flushing protected by entry lock
git bisect good a6abc2c0e77b16480f4d2c1eb7925e5287ae1526
# good: [2f89dc12a25ddf995b9acd7b6543fe892e3473d6] dax: protect PTE modification on WP fault by radix tree entry lock
git bisect good 2f89dc12a25ddf995b9acd7b6543fe892e3473d6
# first bad commit: [4b4bb46d00b386e1c972890dc5785a7966eaa9c0] dax: clear dirty entry tags on cache flush
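
For anyone re-checking a log like the one above: `git bisect replay <logfile>` replays a saved log inside a kernel checkout. As a lighter-weight sketch that does not need the tree at all, the shell helpers below pull the verdict and the good/bad tallies out of a saved log. The function names are made up for illustration and are not part of git.

```sh
#!/bin/sh
# Print the hash and subject from the "first bad commit" line of a
# saved `git bisect log` file. Hypothetical helper, not part of git.
first_bad_commit() {
    # The line of interest looks like:
    #   # first bad commit: [<hash>] <subject>
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)\] \(.*\)$/\1 \2/p' "$1"
}

# Count how many revisions were marked good and bad during the bisect.
bisect_counts() {
    good=$(grep -c '^git bisect good ' "$1")
    bad=$(grep -c '^git bisect bad ' "$1")
    echo "good=$good bad=$bad"
}
```

With the attachment saved under its name `bisect2`, usage would be `first_bad_commit bisect2` and `bisect_counts bisect2`.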


* Re: LTP rwtest01 blocks on DAX mountpoint
  2016-12-24 11:07 LTP rwtest01 blocks on DAX mountpoint Xiong Zhou
@ 2016-12-30  9:33 ` Xiong Zhou
  2017-01-02 10:05   ` Jan Kara
  2017-01-02 17:16   ` Jan Kara
  0 siblings, 2 replies; 11+ messages in thread
From: Xiong Zhou @ 2016-12-30  9:33 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: jack, linux-nvdimm, linux-fsdevel, linux-kernel

On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> Hi lists,
> 
> Since around the 20161129 tag, LTP rwtest01 blocks on a DAX mountpoint
> on the linux-next tree, and now on Linus's tree as well.
> 
> Normally the rwtest01 subcase finishes in a few minutes; now it keeps
> running for hours on a DAX mountpoint, on both ext4 and xfs. Ctrl+C
> can interrupt it.

The test program is waiting for a memcpy call to return.

From the sysrq output, the kernel code is not blocked anywhere;
it just won't return.


> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm


* Re: LTP rwtest01 blocks on DAX mountpoint
  2016-12-30  9:33 ` Xiong Zhou
@ 2017-01-02 10:05   ` Jan Kara
  2017-01-02 17:16   ` Jan Kara
  1 sibling, 0 replies; 11+ messages in thread
From: Jan Kara @ 2017-01-02 10:05 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: jack, linux-nvdimm, linux-fsdevel, linux-kernel

On Fri 30-12-16 17:33:53, Xiong Zhou wrote:
> On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> > Hi lists,
> > 
> > Since around the 20161129 tag, LTP rwtest01 blocks on a DAX mountpoint
> > on the linux-next tree, and now on Linus's tree as well.
> > 
> > Normally the rwtest01 subcase finishes in a few minutes; now it keeps
> > running for hours on a DAX mountpoint, on both ext4 and xfs. Ctrl+C
> > can interrupt it.
> 
> The test program is waiting for a memcpy call to return.
> 
> From the sysrq output, the kernel code is not blocked anywhere;
> it just won't return.

Thanks for the report. I'll try to reproduce this. Looks like we are forever
retrying the fault or something like that.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: LTP rwtest01 blocks on DAX mountpoint
  2016-12-30  9:33 ` Xiong Zhou
  2017-01-02 10:05   ` Jan Kara
@ 2017-01-02 17:16   ` Jan Kara
  2017-01-02 21:49     ` Ross Zwisler
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Kara @ 2017-01-02 17:16 UTC (permalink / raw)
  To: Xiong Zhou; +Cc: jack, linux-nvdimm, linux-fsdevel, linux-kernel

On Fri 30-12-16 17:33:53, Xiong Zhou wrote:
> On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> > Hi lists,
> > 
> > Since around the 20161129 tag, LTP rwtest01 blocks on a DAX mountpoint
> > on the linux-next tree, and now on Linus's tree as well.
> > 
> > Normally the rwtest01 subcase finishes in a few minutes; now it keeps
> > running for hours on a DAX mountpoint, on both ext4 and xfs. Ctrl+C
> > can interrupt it.
> 
> The test program is waiting for a memcpy call to return.
> 
> From the sysrq output, the kernel code is not blocked anywhere;
> it just won't return.

I was trying to reproduce this but for me rwtest01 completes just fine on
dax mountpoint (I've used your reproducer). So can you sample several
kernel stack traces to get a rough idea where the kernel is running?
Thanks!
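
One rough way to collect such samples is sketched below. It assumes root access and a kernel that exposes /proc/<pid>/stack (CONFIG_STACKTRACE). The function name, its arguments and the optional file override are illustrative only.

```sh
#!/bin/sh
# Sample the kernel stack of a process several times.
# Usage: sample_stacks PID [COUNT] [INTERVAL_SECONDS] [STACK_FILE]
# STACK_FILE defaults to /proc/PID/stack, which normally needs root
# and CONFIG_STACKTRACE; the override exists only for dry runs.
sample_stacks() {
    pid=$1
    count=${2:-5}
    interval=${3:-1}
    stack_file=${4:-/proc/$pid/stack}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count (pid $pid) ==="
        cat "$stack_file" || return 1
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

Usage would be along the lines of `sample_stacks <pid-of-the-stuck-test-process> 10 1`. Since the task appears to be running rather than sleeping, `echo l > /proc/sysrq-trigger` (backtrace of all active CPUs) may also be worth a few repeats.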

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-02 17:16   ` Jan Kara
@ 2017-01-02 21:49     ` Ross Zwisler
  2017-01-03  6:49       ` Xiong Zhou
  0 siblings, 1 reply; 11+ messages in thread
From: Ross Zwisler @ 2017-01-02 21:49 UTC (permalink / raw)
  To: Jan Kara; +Cc: Xiong Zhou, linux-fsdevel, linux-nvdimm, linux-kernel

On Mon, Jan 02, 2017 at 06:16:17PM +0100, Jan Kara wrote:
> On Fri 30-12-16 17:33:53, Xiong Zhou wrote:
> > On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> > > Hi lists,
> > > 
> > > Since around the 20161129 tag, LTP rwtest01 blocks on a DAX mountpoint
> > > on the linux-next tree, and now on Linus's tree as well.
> > > 
> > > Normally the rwtest01 subcase finishes in a few minutes; now it keeps
> > > running for hours on a DAX mountpoint, on both ext4 and xfs. Ctrl+C
> > > can interrupt it.
> > 
> > The test program is waiting for a memcpy call to return.
> > 
> > From the sysrq output, the kernel code is not blocked anywhere;
> > it just won't return.
> 
> I was trying to reproduce this but for me rwtest01 completes just fine on
> dax mountpoint (I've used your reproducer). So can you sample several
> kernel stack traces to get a rough idea where the kernel is running?
> Thanks!
> 
> 								Honza

I'm also unable to reproduce this issue.  I've tried with both the blamed
commit:

4b4bb46 (HEAD) dax: clear dirty entry tags on cache flush

and with v4.9-rc2.  Both pass the test in my setup.

Perhaps the variable is the size of your PMEM partitions?

# fdisk -l /dev/pmem0
Disk /dev/pmem0: 16 GiB, 17179869184 bytes, 33554432 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xfe50c900

Device       Boot    Start      End  Sectors Size Id Type
/dev/pmem0p1          4096 25165823 25161728  12G 83 Linux
/dev/pmem0p2      25165824 33550335  8384512   4G 83 Linux

What does your setup look like?
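
For comparing setups, the partition sizes in the fdisk output above follow directly from the sector counts, given 512-byte sectors. A small sketch of that arithmetic (the helper name is made up for illustration):

```sh
#!/bin/sh
# Convert a count of 512-byte sectors (as printed by fdisk) to MiB.
# 1 MiB = 1048576 bytes = 2048 sectors of 512 bytes.
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

sectors_to_mib 25161728   # /dev/pmem0p1: 12286 MiB, just under 12 GiB
sectors_to_mib 8384512    # /dev/pmem0p2: 4094 MiB, just under 4 GiB
```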

I'm using the current tip of the LTP tree:

8cc4165  waitid02: define _XOPEN_SOURCE 500

Thanks,
- Ross


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-02 21:49     ` Ross Zwisler
@ 2017-01-03  6:49       ` Xiong Zhou
  2017-01-03 16:57         ` Ross Zwisler
  0 siblings, 1 reply; 11+ messages in thread
From: Xiong Zhou @ 2017-01-03  6:49 UTC (permalink / raw)
  To: Ross Zwisler, Jan Kara, Xiong Zhou, linux-fsdevel, linux-nvdimm,
	linux-kernel

On Mon, Jan 02, 2017 at 02:49:41PM -0700, Ross Zwisler wrote:
> On Mon, Jan 02, 2017 at 06:16:17PM +0100, Jan Kara wrote:
> > On Fri 30-12-16 17:33:53, Xiong Zhou wrote:
> > > On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> > > > Hi lists,
snip
> > I was trying to reproduce this but for me rwtest01 completes just fine on
> > dax mountpoint (I've used your reproducer). So can you sample several
> > kernel stack traces to get a rough idea where the kernel is running?
> > Thanks!
> > 
> > 								Honza
> 
> I'm also unable to reproduce this issue.  I've tried with both the blamed
> commit:
> 4b4bb46 (HEAD) dax: clear dirty entry tags on cache flush
> and with v4.9-rc2.  Both pass the test in my setup.
> Perhaps the variable is the size of your PMEM partitions?
> # fdisk -l /dev/pmem0
> Disk /dev/pmem0: 16 GiB, 17179869184 bytes, 33554432 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disklabel type: dos
> Disk identifier: 0xfe50c900
> Device       Boot    Start      End  Sectors Size Id Type
> /dev/pmem0p1          4096 25165823 25161728  12G 83 Linux
> /dev/pmem0p2      25165824 33550335  8384512   4G 83 Linux
> 
> What does your setup look like?
> I'm using the current tip of the LTP tree:
> 8cc4165  waitid02: define _XOPEN_SOURCE 500
> Thanks,
> - Ross

Thanks all for looking into it.

It turns out the rc2-related updates fix this issue, as well as
an old issue I reported a while ago:
multi-threads libvmmalloc fork test hang
https://lists.01.org/pipermail/linux-nvdimm/2016-October/007602.html

I was able to reproduce these issues before rc2; now they
pass on the current Linus tree:
c8b4ec8 Merge tag 'fscrypt-for-stable'

Thanks,
Xiong

> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-03  6:49       ` Xiong Zhou
@ 2017-01-03 16:57         ` Ross Zwisler
  2017-01-04  1:21           ` Xiong Zhou
                             ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ross Zwisler @ 2017-01-03 16:57 UTC (permalink / raw)
  To: Xiong Zhou
  Cc: Ross Zwisler, Jan Kara, linux-fsdevel, linux-nvdimm, linux-kernel

On Tue, Jan 03, 2017 at 02:49:22PM +0800, Xiong Zhou wrote:
> Thanks all for looking into it.
> 
> It turns out the rc2-related updates fix this issue, as well as
> an old issue I reported a while ago:
> multi-threads libvmmalloc fork test hang
> https://lists.01.org/pipermail/linux-nvdimm/2016-October/007602.html
> 
> I was able to reproduce these issues before rc2; now they
> pass on the current Linus tree:
> c8b4ec8 Merge tag 'fscrypt-for-stable'

Hmm...I'm able to reproduce the other libvmmalloc issue with both v4.10-rc2
and with "c8b4ec8 Merge tag 'fscrypt-for-stable'".  I'm debugging that issue
today.

It's interesting that both tests started passing for you.  Did you change
something in your test setup?


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-03 16:57         ` Ross Zwisler
@ 2017-01-04  1:21           ` Xiong Zhou
  2017-01-04  1:49           ` Xiong Zhou
  2017-01-04  9:48           ` Xiong Zhou
  2 siblings, 0 replies; 11+ messages in thread
From: Xiong Zhou @ 2017-01-04  1:21 UTC (permalink / raw)
  To: Ross Zwisler, Jan Kara, linux-fsdevel, linux-nvdimm, linux-kernel

On Tue, Jan 03, 2017 at 09:57:10AM -0700, Ross Zwisler wrote:
> Hmm...I'm able to reproduce the other libvmmalloc issue with both v4.10-rc2
> and with "c8b4ec8 Merge tag 'fscrypt-for-stable'".  I'm debugging that issue
> today.
> 
> It's interesting that both tests started passing for you.  Did you change
> something in your test setup?

Nope.


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-03 16:57         ` Ross Zwisler
  2017-01-04  1:21           ` Xiong Zhou
@ 2017-01-04  1:49           ` Xiong Zhou
  2017-01-04  9:48           ` Xiong Zhou
  2 siblings, 0 replies; 11+ messages in thread
From: Xiong Zhou @ 2017-01-04  1:49 UTC (permalink / raw)
  To: Ross Zwisler, Jan Kara, linux-fsdevel, linux-nvdimm, linux-kernel

On Tue, Jan 03, 2017 at 09:57:10AM -0700, Ross Zwisler wrote:
> Hmm...I'm able to reproduce the other libvmmalloc issue with both v4.10-rc2
> and with "c8b4ec8 Merge tag 'fscrypt-for-stable'".  I'm debugging that issue
> today.
> 
> It's interesting that both tests started passing for you.  Did you change
> something in your test setup?

Er... after double-checking: I did reduce the nvml check test time on
Dec 30:

-make -C src check -j $NR_CPU
+make -C src check -j $NR_CPU TEST_TIME=1m

Other than this, nothing changed but the kernel code: same machine,
same cmdline, same configs.

I'm going to dig into this more, and test your patch.

Thanks for looking into this!


* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-03 16:57         ` Ross Zwisler
  2017-01-04  1:21           ` Xiong Zhou
  2017-01-04  1:49           ` Xiong Zhou
@ 2017-01-04  9:48           ` Xiong Zhou
  2017-01-04 16:40             ` Ross Zwisler
  2 siblings, 1 reply; 11+ messages in thread
From: Xiong Zhou @ 2017-01-04  9:48 UTC (permalink / raw)
  To: Ross Zwisler, Xiong Zhou, Jan Kara, linux-fsdevel, linux-nvdimm,
	linux-kernel

On Tue, Jan 03, 2017 at 09:57:10AM -0700, Ross Zwisler wrote:
> Hmm...I'm able to reproduce the other libvmmalloc issue with both v4.10-rc2
> and with "c8b4ec8 Merge tag 'fscrypt-for-stable'".  I'm debugging that issue
> today.
> 
> It's interesting that both tests started passing for you.  Did you change
> something in your test setup?

Hi,

Quick update:
  Ross's new patch fixed the vmmalloc_fork issue, not the rc2 update.
  Regression tests are still running; so far so good.

I'm able to reproduce the vmmalloc_fork issue on the rc2 kernel
	c8b4ec8 Merge tag 'fscrypt-for-stable'
with nvml at commit
	77c2a5a Merge pull request #1554 from krzycz/win-libvmem_rc

My previous statement that rc2 fixed the old vmmalloc_fork issue
was wrong, my mistake. I had changed my test setup.

Now, after some tests, Ross's patch
	[PATCH] dax: fix deadlock with DAX 4k holes
on top of Linus tree c8b4ec8 has fixed this vmmalloc_fork issue.
My DAX regression tests are still running and look good so far. I'll
update once they have finished.



* Re: LTP rwtest01 blocks on DAX mountpoint
  2017-01-04  9:48           ` Xiong Zhou
@ 2017-01-04 16:40             ` Ross Zwisler
  0 siblings, 0 replies; 11+ messages in thread
From: Ross Zwisler @ 2017-01-04 16:40 UTC (permalink / raw)
  To: Xiong Zhou
  Cc: Ross Zwisler, Jan Kara, linux-fsdevel, linux-nvdimm, linux-kernel

On Wed, Jan 04, 2017 at 05:48:34PM +0800, Xiong Zhou wrote:
> On Tue, Jan 03, 2017 at 09:57:10AM -0700, Ross Zwisler wrote:
> > On Tue, Jan 03, 2017 at 02:49:22PM +0800, Xiong Zhou wrote:
> > > On Mon, Jan 02, 2017 at 02:49:41PM -0700, Ross Zwisler wrote:
> > > > On Mon, Jan 02, 2017 at 06:16:17PM +0100, Jan Kara wrote:
> > > > > On Fri 30-12-16 17:33:53, Xiong Zhou wrote:
> > > > > > On Sat, Dec 24, 2016 at 07:07:14PM +0800, Xiong Zhou wrote:
> > > > > > > Hi lists,
> > > snip
> > > > > I was trying to reproduce this but for me rwtest01 completes just fine on
> > > > > dax mountpoint (I've used your reproducer). So can you sample several
> > > > > kernel stack traces to get a rough idea where the kernel is running?
> > > > > Thanks!
> > > > > 
> > > > > 								Honza
> > > > 
> > > > I'm also unable to reproduce this issue.  I've tried with both the blamed
> > > > commit:
> > > > 4b4bb46 (HEAD) dax: clear dirty entry tags on cache flush
> > > > and with v4.9-rc2.  Both pass the test in my setup.
> > > > Perhaps the variable is the size of your PMEM partitions?
> > > > # fdisk -l /dev/pmem0
> > > > Disk /dev/pmem0: 16 GiB, 17179869184 bytes, 33554432 sectors
> > > > Units: sectors of 1 * 512 = 512 bytes
> > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > Disklabel type: dos
> > > > Disk identifier: 0xfe50c900
> > > > Device       Boot    Start      End  Sectors Size Id Type
> > > > /dev/pmem0p1          4096 25165823 25161728  12G 83 Linux
> > > > /dev/pmem0p2      25165824 33550335  8384512   4G 83 Linux
> > > > 
> > > > What does your setup look like?
> > > > I'm using the current tip of the LTP tree:
> > > > 8cc4165  waitid02: define _XOPEN_SOURCE 500
> > > > Thanks,
> > > > - Ross
> > > 
> > > Thanks all for looking into it.
> > > 
> > > Turns out the rc2 relative updates fix this issue, so does
> > > an old issue i reported a while ago:
> > > multi-threads libvmmalloc fork test hang
> > > https://lists.01.org/pipermail/linux-nvdimm/2016-October/007602.html
> > > 
> > > I'm able to reproduce these issues before rc2, now it
> > > passes on current Linus tree:
> > > c8b4ec8 Merge tag 'fscrypt-for-stable'
> > 
> > Hmm...I'm able to reproduce the other libvmmalloc issue with both v4.10-rc2
> > and with "c8b4ec8 Merge tag 'fscrypt-for-stable'".  I'm debugging that issue
> > today.
> > 
> > It's interesting that both tests started passing for you.  Did you change
> > something in your test setup?
> 
> Hi,
> 
> Quick update:
>   Ross's new patch fixed the vmmalloc_fork issue, not the rc2 update.
>   Regression tests are still running; so far so good.
> 
> I'm able to reproduce the vmmalloc_fork issue on the rc2 kernel
> 	c8b4ec8 Merge tag 'fscrypt-for-stable'
> with nvml at commit
> 	77c2a5a Merge pull request #1554 from krzycz/win-libvmem_rc
> 
> My previous statement that rc2 fixed the old vmmalloc_fork issue
> was wrong, my mistake. I had changed my test setup.
> 
> Now, after some tests, Ross's patch
> 	[PATCH] dax: fix deadlock with DAX 4k holes
> on top of Linus tree c8b4ec8 has fixed this vmmalloc_fork issue.
> My DAX regression tests are still running and look good so far. I'll
> update once they have finished.

Cool, thanks for the update.  If you're still able to reproduce this second
issue after my patch we can dig in to the differences between your test setup
and mine so I can reproduce it & debug.

Thanks for the reports!

