* A special case in write_flush which cause the umount busy
From: Wang, Alan 1. (NSB - CN/Hangzhou) @ 2018-02-11  6:55 UTC
  To: linux-nfs; +Cc: neilb

Hi,

I have a test case that mounts and unmounts a partition on the NFS server
side, and with low probability the umount fails as busy.

The Linux version is 3.10.64 with the patch "sunrpc/cache: make cache
flushing more reliable".
https://patchwork.kernel.org/patch/7410021/

After some analysis and many test runs, I find that when the failure occurs,
the times "then" and "now" are different, which leaves last_refresh far
beyond flush_time. So the cache entry is not considered expired and is not
cleaned at once.
The reference in cache_head is then not released, and mntput_no_expire is
not called to decrease the count. That causes the umount busy.

Below are the logs from my test.

kernel: [  292.767801] write_flush 1480 then = 249, now = 250
kernel: [  292.767817] cache_clean 451, cd name nfsd.fh expiry_time 790485253, cd flush_time 249, last_refresh 369, seconds_since_boot 250
kernel: [  292.767913] write_flush 1480 then = 249, now = 250
kernel: [  292.767928] cache_clean 451, cd name nfsd.export expiry_time 2049, cd flush_time 249, last_refresh 369, seconds_since_boot 250
kernel: [  292.773229] do_refcount_check 283 mycount 4
kernel: [  292.773245] do_umount 1344 retval -16

I think this happens when exportfs writes the current time to flush (the
"then" value), but by the time seconds_since_boot() is called in
write_flush the clock has moved on to the next second, so "now" is one
second after "then".
Because "then" is less than "now", flush_time is set directly to the
original "then", rather than to "cd->flush_time + 1".


I would like to change the condition as below. I'm not sure whether it is
OK or whether it affects other parts.

--------------------------------------------------------------------------
       then = get_expiry(&bp);
       now = seconds_since_boot();
       cd->nextcheck = now;
       /* Can only set flush_time to 1 second beyond "now", or
        * possibly 1 second beyond flushtime.  This is because
        * flush_time never goes backwards so it mustn't get too far
        * ahead of time.
        */
       if (then != now)
              printk("%s %d then = %d, now = %d\n", __func__, __LINE__, then, now);
-      if (then >= now) {
+      if (then >= now - 1) {
              /* Want to flush everything, so behave like cache_purge() */
              if (cd->flush_time >= now)
                     now = cd->flush_time + 1;
              then = now;
       }

       cd->flush_time = then;
--------------------------------------------------------------------------

Best Regards,
Alan Wang


Thread overview: 5+ messages
2018-02-11  6:55 A special case in write_flush which cause the umount busy Wang, Alan 1. (NSB - CN/Hangzhou)
2018-02-13 19:57 ` NeilBrown
2018-02-13 21:03   ` J. Bruce Fields
2018-02-14  1:15     ` [PATCH] SUNRPC: cache: ignore timestamp written to 'flush' file NeilBrown
2018-02-22  0:46       ` J. Bruce Fields
