On 06.04.21 09:06, Michal Kubecek wrote:
> On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote:
>>
>> Hi, Jiri
>>     Do you have a reproducer that can be shared here?
>>     With a reproducer, I can debug and test it myself too.
>
> I'm afraid we are not aware of a simple reproducer. As mentioned in the
> original discussion, the race window is extremely small and the other
> thread has to do quite a lot in the meantime, which is probably why, as
> far as I know, this was never observed on real hardware, only in
> virtualization environments. NFS may also be important as, IIUC, it can
> often issue an RPC request from a different CPU right after a data
> transfer. Perhaps you could cheat a bit and insert a random delay
> between the empty queue check and releasing q->seqlock to make it more
> likely to happen.
>
> Other than that, it's rather just "run this complex software in a Xen VM
> and wait".

Being the one who has managed to reproduce the issue, I can share my
setup; maybe you can set up something similar (we have seen the issue
with this kind of setup on two different machines).

I'm using a physical machine with 72 cpus and 48 GB of memory. It is
running Xen as virtualization platform. Xen dom0 is limited to 40 vcpus
and 32 GB of memory, and the dom0 vcpus are limited to run on the first
40 physical cpus (no idea whether that matters, though).

In a guest with 16 vcpus and 8 GB of memory I'm running 8 parallel
sysbench instances in a loop. Those instances are prepared via

  sysbench --file-test-mode=rndrd --test=fileio prepare

and then started in a do-while loop via:

  sysbench --test=fileio --file-test-mode=rndrw --rand-seed=0 --max-time=300 --max-requests=0 run

Each instance is using a dedicated NFS mount to run on. The NFS server
for the 8 mounts is running in dom0 of the same machine, and the data of
the NFS shares is located in a RAM disk (its size is a little bit above
16 GB). The shares are mounted in the guest with:

  mount -t nfs -o rw,proto=tcp,nolock,nfsvers=3,rsize=65536,wsize=65536,nosharetransport dom0:/ramdisk/share[1-8] /mnt[1-8]

The guest's vcpus are limited to run on physical cpus 40-55. On the same
physical cpus I have 16 small guests running eating up cpu time, each of
those guests pinned to one of the physical cpus 40-55.

That's basically it. All you need to do is to watch out for sysbench
reporting maximum latencies above one second or so (in my setup there
are latencies of several minutes at least once per hour of testing).

In case you'd like to have some more details about the setup, don't
hesitate to contact me directly. I can provide you with some scripts
and config runes if you want.


Juergen
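
P.S.: A minimal sketch of one such per-mount worker, assuming the
/mnt[1-8] mounts from above are already in place (the actual scripts I
use differ in details, and the mount point argument is just an example):

  #!/bin/sh
  # Sketch of a single sysbench fileio worker; start 8 of these in
  # parallel, one per NFS mount (/mnt1 .. /mnt8 as mounted above).
  MNT=$1                      # e.g. /mnt1 (hypothetical argument)
  cd "$MNT" || exit 1
  # one-time preparation of the test files for this mount
  sysbench --file-test-mode=rndrd --test=fileio prepare
  # endless loop of 300 second random read/write runs; watch the
  # reported maximum latencies for spikes above one second
  while true; do
      sysbench --test=fileio --file-test-mode=rndrw --rand-seed=0 \
               --max-time=300 --max-requests=0 run
  done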