fuse freeze and usb devices

From: taz.007 @ 2020-02-21 18:04 UTC (permalink / raw)
  To: linux-usb

Hello linux-usb,

I'm experiencing freezes of a FUSE userspace daemon. I'm not sure
whether this is actually a USB issue, so please point me to the correct
subsystem or mailing list if another one would be more appropriate.
Setup:
10 drives (ext3 or ext4) are mounted on the system.
7 of them are SATA drives in USB enclosures (USB 2 only).
2 of them are USB keys (USB 1 and USB 2).
1 of them is a regular SATA drive, directly connected.
I use mergerfs to gather all of them under a common mount point.
Scenario:
The machine (2C/4T) is under heavy CPU load, nearly fully used.
rsync is running in a loop (in order to reproduce the issue), copying
some files (several GB) from the mergerfs mount point to another drive
(a regular ext4-mounted drive that is not part of the pool).
Some background processes are doing "light" (~50KB/sec) IO on the same
mergerfs pool.
After a while, any access to the mergerfs mount point freezes.
This is because mergerfs itself is stuck in a syscall that never
returns (if I understand correctly).
However, I can still access the underlying mounted hard drives fine
(by doing an "ls", for example)!
And in this case, accessing the underlying hard drives via "ls" somehow
unfreezes the previously blocked syscall in the mergerfs daemon!
Using "ls" is not even necessary: running hdparm -tT directly on the
drives also unfreezes mergerfs.
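
The freeze-and-unfreeze behaviour described above could be watched for
with a small probe. The following is only a sketch under assumptions:
the function names, the 30 second timeout, and the mount paths in the
usage note are hypothetical and not taken from the actual setup. It
lists the pool directory in a background thread, treats a listing that
does not finish in time as a freeze, and then lists each underlying
mount, which is the kind of access that was observed to unblock
mergerfs.

```python
import os
import threading


def responds_within(path, timeout_s):
    """Return True if listing `path` completes within timeout_s seconds."""
    done = threading.Event()

    def worker():
        try:
            os.listdir(path)
        except OSError:
            pass  # an error is still a response, not a hang
        done.set()

    # Daemon thread: a listing that never returns must not keep
    # the probe itself from continuing.
    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout_s)


def nudge(underlying_paths):
    """List each underlying mount directly, bypassing the pool."""
    for path in underlying_paths:
        try:
            os.listdir(path)
        except OSError:
            pass


def check_and_nudge(pool, underlying_paths, timeout_s=30.0):
    """Probe the pool; on a timeout, touch the underlying mounts.

    Returns True if the pool answered in time, False if it looked frozen.
    """
    if responds_within(pool, timeout_s):
        return True
    nudge(underlying_paths)
    return False
```

With hypothetical paths, check_and_nudge("/mnt/pool", ["/mnt/disk1",
"/mnt/disk2"]) would return False whenever the pool did not answer
within the timeout. This only works around the symptom and logs
nothing about the cause; a listing served from cache may also not
reach the device at all.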

Now the link with USB:
When I tweak the value of /sys/block/sdX/device/max_sectors, the
behaviour changes.
With a value of 128 or 240, I'm unable to reproduce the issue.
With a value of 512, the issue reproduces after around 4-5 hours.
With a value of 1024, the issue reproduces after around 2 hours.
(Maybe those numbers are statistically insignificant and I'm just
unlucky.)
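
For reference, the max_sectors tweak can be scripted. This is a
minimal sketch, not the exact commands used for the tests above: the
helper names and the sysfs_root parameter are hypothetical (the latter
exists only so the code can be tried against a scratch directory), and
on a real system writing the attribute requires root.

```python
from pathlib import Path


def max_sectors_path(dev, sysfs_root="/sys"):
    """Path of the per-device max_sectors attribute, e.g. for dev='sdb'."""
    return Path(sysfs_root) / "block" / dev / "device" / "max_sectors"


def get_max_sectors(dev, sysfs_root="/sys"):
    """Read the current max_sectors value as an integer."""
    return int(max_sectors_path(dev, sysfs_root).read_text().strip())


def set_max_sectors(dev, value, sysfs_root="/sys"):
    """Write a new max_sectors value and return what reads back."""
    max_sectors_path(dev, sysfs_root).write_text(f"{value}\n")
    return get_max_sectors(dev, sysfs_root)
```

For example, set_max_sectors("sdb", 240) would apply one of the values
that did not reproduce the freeze, where "sdb" stands in for whichever
sdX device is being tested. The value is not persistent across
reboots or device re-plugs.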

There are no errors from the kernel, and the drives still seem to be 
working fine in fact.
I'm using Linux 5.5.3, but I also tried going back to 5.1.15, and the
issue is already present there.

For more detailed info on the mergerfs call stack, see the original
bug report thread:
https://github.com/trapexit/mergerfs/issues/708

Please don't forget to CC me as I'm not subscribed to the ML.
