On 2019/5/3 3:02 AM, Hendrik Friedel wrote:
> Hello,
>
> thanks for your replies. I appreciate it!
>>> I am using btrfs-progs v4.20.2 and debian stretch with
>>> 4.19.0-0.bpo.2-amd64 (I think, this is the latest Kernel available in
>>> stretch. Please correct if I am wrong.)
>>
>> What scheduler is being used for the drive?
>>
>> # cat /sys/block//queue/scheduler
> [mq-deadline] none
>
>> If it's none, then kernel version and scheduler aren't likely related
>> to what you're seeing.
>>
>> It's not immediately urgent, but I would still look for something
>> newer, just because the 4.19 series already has 37 upstream updates
>> released, each with dozens of fixes, easily there are over 1000 fixes
>> available in total. I'm not a Debian user but I think there's
>> stretch-backports that has newer kernels?
>> http://jensd.be/818/linux/install-a-newer-kernel-in-debian-9-stretch-stable
>>
>
> Unfortunately, backports provides 4.19 as the latest.
> I am now manually compiling 5.0. Last time I did that, I was less than
> half my current age :-)
>
>> We need the entire dmesg so we can see if there are any earlier
>> complaints by the drive or the link. Can you attach the entire dmesg
>> as a file?
> Done (also the two smartctl outputs).
>
>> Have you tried stopping the workload to see if the timeout disappears?
>
> Unfortunately not. I had the impression that the system did not react
> anymore. I CTRL-Ced and rebooted.
> I was copying all the stuff from my old drive to the new one. I should
> say, that the workload was high, but not exceptional. Just one or two
> copy jobs.

Then it's a deadlock, not a regular high-load timeout.

> Also, the btrfs drive was in advantage:
> 1) it had btrfs ;-) (the other ext4)
> 2) it did not need to search
> 3) it was connected via SATA (and not USB3 as the source)
>
> The drive does not seem to be an SMR drive (WD80EZAZ).
>
>> If it just disappears after some time, then the disk is too slow and
>> the load too heavy, which combined with btrfs' low concurrency design
>> leads to the problem.
>
> I was tempted to ask, whether this should be fixed. On the other hand, I
> am not even sure anything bad happened (except, well, the system -at
> least the copy- seemed to hang).

Definitely needs to be fixed.

With the full dmesg, it's now clear that this is a real deadlock.
Something is wrong with the free space cache, and it blocks the whole
filesystem from being committed.

If you still want to try btrfs, you could try the "nospace_cache" mount
option.
The free space cache of btrfs is just an optimization; you can turn it
off completely at the cost of a minor performance drop.

Thanks,
Qu

>
> By the way: I ran a scrub and a smartctl -t long. Both without errors.
>
> Greetings,
> Hendrik
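
P.S. On the nospace_cache suggestion above: the option is passed at mount
time, e.g. `mount -o nospace_cache <device> <mountpoint>`, with both
placeholders filled in for your setup. To check afterwards whether it is
active, something along these lines might help. It is only a rough sketch:
it assumes the kernel lists the option in /proc/mounts, and the helper
names, device and mount point in the sample are made up for illustration.

```python
def mount_options(proc_mounts_text, mount_point):
    """Return the list of mount options for mount_point, or None if absent."""
    found = None
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: later entries shadow earlier ones.
            found = fields[3].split(",")
    return found


def has_option(proc_mounts_text, mount_point, option):
    """True if mount_point is mounted and its option list contains option."""
    opts = mount_options(proc_mounts_text, mount_point)
    return opts is not None and option in opts


# Hypothetical sample in /proc/mounts format (not taken from the reporter's system).
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /mnt/data btrfs rw,relatime,nospace_cache,subvolid=5,subvol=/ 0 0\n"
)

if __name__ == "__main__":
    print(has_option(SAMPLE, "/mnt/data", "nospace_cache"))
```

On a live system you would pass in the contents of /proc/mounts instead of
SAMPLE, e.g. `open("/proc/mounts").read()`.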