* system hang on a syncfs test with nfs_export enabled
@ 2020-04-30  9:15 Chengguang Xu
  2020-04-30  9:48 ` Re: system " Chengguang Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Chengguang Xu @ 2020-04-30  9:15 UTC (permalink / raw)
  To: linux-unionfs; +Cc: fstests, amir73il, miklos, guaneryu, cgxu519

Hi,

I'm testing a new version of my syncfs improvement patch and found an
interesting problem when combining dirty data, godown and nfs_export.

I expected every test listed below to end in either PASS or FAIL. The hang in
Test2 looks strange, because in my opinion there is no strong connection
between nfs_export/index and dirty data.
Any ideas?


Test environment and steps:

Test1:
Compile the module with nfs_export enabled
Run xfstests generic/474   ==> PASS

Test2:
Compile the module with nfs_export enabled
Comment out the syncfs step in the test
Run xfstests generic/474   ==> Hang

Test3:
Compile the module with nfs_export disabled
Run xfstests generic/474   ==> PASS

Test4:
Compile the module with nfs_export disabled
Comment out the syncfs step in the test
Run xfstests generic/474   ==> FAIL
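
When repeating the four cases above, it may help to confirm which nfs_export default a given kernel build actually carries. The following is only a sketch. It assumes that "nfs_export enabled" refers to the Kconfig default CONFIG_OVERLAY_FS_NFS_EXPORT, which the thread does not state explicitly, and the helper name nfs_export_default is made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a kernel .config turns overlayfs nfs_export on by
# default. Assumption: "nfs_export enabled" in the test steps means
# CONFIG_OVERLAY_FS_NFS_EXPORT=y; the thread does not spell this out.

# nfs_export_default CONFIG_FILE -> prints "enabled" or "disabled"
nfs_export_default() {
    if grep -q '^CONFIG_OVERLAY_FS_NFS_EXPORT=y$' "$1"; then
        echo enabled
    else
        echo disabled
    fi
}

# Example (the path is illustrative and depends on the distribution):
#   nfs_export_default /boot/config-"$(uname -r)"
```

On a running system with the overlay module loaded, the current default should also be readable from /sys/module/overlay/parameters/nfs_export.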






* Re: system hang on a syncfs test with nfs_export enabled
  2020-04-30  9:15 system hang on a syncfs test with nfs_export enabled Chengguang Xu
@ 2020-04-30  9:48 ` Chengguang Xu
  2020-04-30 12:22   ` system " Amir Goldstein
  0 siblings, 1 reply; 5+ messages in thread
From: Chengguang Xu @ 2020-04-30  9:48 UTC (permalink / raw)
  To: cgxu519; +Cc: linux-unionfs, fstests, amir73il, miklos, guaneryu


Additional information:

Overlayfs version: latest next branch of miklos tree (5.7-rc2)
Underlying fs: xfs

Thanks,
cgxu






* Re: system hang on a syncfs test with nfs_export enabled
  2020-04-30  9:48 ` Re: system " Chengguang Xu
@ 2020-04-30 12:22   ` Amir Goldstein
  2020-05-02  4:10     ` Chengguang Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Amir Goldstein @ 2020-04-30 12:22 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: linux-unionfs, fstests, miklos, guaneryu

On Thu, Apr 30, 2020 at 12:48 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> Additional information:
>
> Overlayfs version: latest next branch of miklos tree (5.7-rc2)
> Underlying fs: xfs
>

Please also test against 5.7-rc2. Maybe we introduced a
regression in -next.

Please dump the stacks of blocked tasks with echo w > /proc/sysrq-trigger
to see where in the kernel the test hangs.
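
The stack dump requested above could be wrapped roughly as follows. This is a sketch, not a tested procedure: the function name dump_blocked_tasks and the SYSRQ_TRIGGER override are made up for illustration, and on a real machine the write has to be done as root.

```shell
#!/bin/sh
# Sketch: ask the kernel to log the stacks of blocked tasks via SysRq 'w'.
# SYSRQ_TRIGGER is a hypothetical override so the function can be pointed at
# a scratch file; by default it writes to the real trigger file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

dump_blocked_tasks() {
    # 'w' requests a dump of tasks in uninterruptible (blocked) state.
    echo w > "$SYSRQ_TRIGGER" || return 1
}

# Typical use while the test is hung (the output file name is illustrative):
#   dump_blocked_tasks && dmesg > blocked-tasks.txt
```

The dump goes to the kernel log, so it has to be collected with dmesg or from the console afterwards.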

I cannot think of anything in nfs_export/index that should affect
generic/474, but we will find out soon...

Thanks,
Amir.


* Re: system hang on a syncfs test with nfs_export enabled
  2020-04-30 12:22   ` system " Amir Goldstein
@ 2020-05-02  4:10     ` Chengguang Xu
  2020-05-02  9:17       ` Amir Goldstein
  0 siblings, 1 reply; 5+ messages in thread
From: Chengguang Xu @ 2020-05-02  4:10 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: linux-unionfs, fstests, miklos, guaneryu

 ---- On Thu, 2020-04-30 20:22:06 Amir Goldstein <amir73il@gmail.com> wrote ----
 > 
 > Please also test against 5.7-rc2. Maybe we introduced a
 > regression in -next.
 > 
 > Please dump the stacks of blocked tasks with echo w > /proc/sysrq-trigger
 > to see where in the kernel the test hangs.
 > 
 > I cannot think of anything in nfs_export/index that should affect
 > generic/474, but we will find out soon...
 > 

I'm on vacation this week, and the problem seems hard to reproduce on my laptop; maybe there were some config problems.
I'll do more analysis next week on my test machine.


Thanks,
cgxu





* Re: system hang on a syncfs test with nfs_export enabled
  2020-05-02  4:10     ` Chengguang Xu
@ 2020-05-02  9:17       ` Amir Goldstein
  0 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2020-05-02  9:17 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: linux-unionfs, fstests, miklos, guaneryu, Brian Foster,
	Christoph Hellwig, linux-xfs, Darrick J. Wong

+CC xfs folks

On Sat, May 2, 2020 at 7:10 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> I'm on vacation this week, and the problem seems hard to reproduce on my laptop; maybe there were some config problems.
> I'll do more analysis next week on my test machine.
>

Forgot to say - I also tried and failed to reproduce.

Looking under the lamppost, I suspect changes in xfs shutdown
in v5.7-rc1:

git log --oneline --grep shutdown v5.6.. -- fs/xfs
842a42d126b4 xfs: shutdown on failure to add page to log bio
5781464bd1ee xfs: move the ioerror check out of xlog_state_clean_iclog
12e6a0f449d5 xfs: remove the aborted parameter to xlog_state_done_syncing
a582f32fade2 xfs: simplify log shutdown checking in xfs_log_release_iclog
8a6271431339 xfs: fix unmount hang and memory leak on shutdown during quotaoff
13859c984301 xfs: cleanup xfs_log_unmount_write
b941c71947a0 xfs: mark XLOG_FORCED_SHUTDOWN as unlikely
6b789c337a59 xfs: fix iclog release error check race with shutdown

If you are able to reproduce, please try to reproduce with v5.6 as well.
It could be an interaction between the changes to xfs shutdown and
the way that in-kernel modules interact with xfs.

Looking at how widely the -overlay + xfs shutdown combination is tested,
I count only two generic tests that exercise it:
generic/474 and generic/461. The rest of the shutdown tests require
either local_device or metadata_journaling.
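
A count like the one above could be approximated with a small filter over the test directory. This is a sketch under assumptions: it presumes the tests declare their requirements through the xfstests helpers _require_scratch_shutdown, _require_local_device and _require_metadata_journaling, and the function name list_overlay_shutdown_tests is made up for illustration. It may not reproduce exactly the two tests named in the mail.

```shell
#!/bin/sh
# Sketch: list tests that shut down the scratch filesystem but do not ask
# for a local block device or metadata journaling.
# Assumption: requirements are declared via the _require_* helpers below.

# list_overlay_shutdown_tests TEST_DIR -> prints matching test names
list_overlay_shutdown_tests() {
    dir="$1"
    # Start from every file that requires scratch shutdown support ...
    grep -l '_require_scratch_shutdown' "$dir"/* 2>/dev/null |
    while read -r t; do
        # ... and drop those with the two extra requirements.
        if ! grep -q -e '_require_local_device' \
                     -e '_require_metadata_journaling' "$t"; then
            echo "${t##*/}"
        fi
    done
}

# Example, run from the top of an xfstests checkout:
#   list_overlay_shutdown_tests tests/generic
```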

I think that at least Darrick runs -overlay as part of validating
an xfs pull request to Linus, so there should be a fair amount of test
coverage for these two tests.

generic/461 seems to do something quite close to what you did when
commenting out syncfs in generic/474, but it is not in the 'quick' group, so it
may get less test coverage.
I wonder why it is not in quick, though. On my system it runs in 24s.
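
Group membership can be checked directly in the tree. The sketch below assumes the older xfstests layout in which tests/generic/group holds one line per test (the test number followed by its group names); newer trees declare groups inside each test instead. The helper names test_groups and in_group are made up for illustration.

```shell
#!/bin/sh
# Sketch: look up the groups of a test in an xfstests "group" file.
# Assumption: each line is "<test number> <group> <group> ...".

# test_groups GROUP_FILE TEST_NUMBER -> prints one group name per line
test_groups() {
    awk -v t="$2" '$1 == t { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# in_group GROUP_FILE TEST_NUMBER GROUP -> exit status 0 if the test is in GROUP
in_group() {
    test_groups "$1" "$2" | grep -qx "$3"
}

# Example, run from the top of an xfstests checkout:
#   in_group tests/generic/group 461 quick || echo "461 is not in quick"
```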

Thanks,
Amir.
