On Thu, Aug 22, 2019 at 10:00 AM Keith Busch wrote:
>
> On Wed, Aug 21, 2019 at 7:34 PM Ming Lei wrote:
> > On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > > Here is the command to benchmark it:
> > >
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> > >
> >
> > I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> > each queue is served on 3 CPUs.
> >
> > IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> > context switch is increased a lot.
>
> Interestingly use_threaded_interrupts shows a marginal improvement on
> my machine with the same fio profile. It was only 5 NVMes, but they've
> one queue per-cpu on 112 cores.

I haven't investigated it yet.

BTW, my fio test is run against a single hw queue only, via
'taskset -c $cpu_list_of_the_queue', without applying the threaded
interrupt affinity patch. The NVMe device is Optane.

The same issue can still be reproduced after I force a 1:1 mapping by
passing 'possible_cpus=32' on the kernel command line.

It may be related to kernel options, so I am attaching the ones I used;
they are basically a subset of the RHEL8 kernel's.

Thanks,
Ming Lei
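
For anyone trying to repeat the single-queue run above, here is a minimal sketch of one way to get the CPU list for 'taskset -c'. It assumes the blk-mq sysfs layout /sys/block/<dev>/mq/<hctx>/cpu_list. The helper name queue_cpus, its optional sysfs-root argument, and the device and queue numbers are made up for illustration and are not taken from the test setup described in this mail.

```shell
# Hypothetical helper (not from the original mail): print the CPUs mapped to
# one blk-mq hardware queue in a form that 'taskset -c' accepts.
#   $1  block device name, e.g. nvme0n1     (assumed example)
#   $2  hardware queue index, e.g. 0        (assumed example)
#   $3  sysfs root, defaults to /sys        (lets the helper be tried on a copy)
queue_cpus() {
    qc_dev=$1
    qc_hctx=$2
    qc_root=${3:-/sys}
    # cpu_list is printed as "0, 32, 64"; drop the spaces to get "0,32,64".
    tr -d ' ' < "$qc_root/block/$qc_dev/mq/$qc_hctx/cpu_list"
}
```

It could then be used as 'taskset -c "$(queue_cpus nvme0n1 0)" fio ...', with the fio options from Long Li's command and --filename cut down to the one device under test.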