dm-thin: Several Questions on dm-thin performance.

* dm-thin: Several Questions on dm-thin performance.
@ 2019-11-22  3:14 JeffleXu
  2019-11-22 18:55 ` Joe Thornber
  0 siblings, 1 reply; 13+ messages in thread
From: JeffleXu @ 2019-11-22  3:14 UTC (permalink / raw)
  To: dm-devel

Hi guys,

I have several questions on dm-thin when I'm testing and evaluating IO 
performance of dm-thin. I would be grateful if someone could spend a 
little time on it.

The first question is what's the purpose of data cell? In 
thin_bio_map(), normal bio will be packed as a virtual cell and data 
cell. I can understand that virtual cell is used to prevent discard bio 
and non-discard bio targeting the same block from being processed at the 
same time. I find it was added in commit     
e8088073c9610af017fd47fddd104a2c3afb32e8 (dm thin: fix race between 
simultaneous io and discards to same block), but I'm still confused 
about the use of data cell.

The second question is the impact of virtual cell and data cell on IO 
performance. If $data_block_size is large for example 1G, in multithread 
fio test, most bio will be buffered in cell->bios list and then be 
processed by worker thread asynchronously, even when there's no discard 
bio. Thus the original parallel IO is processed by worker thread 
serially now. As the number of fio test threads increase, the single 
worker thread can easily get CPU 100%, and thus become the bottleneck of 
the performance since dm-thin workqueue is ordered unbound.

Using an nvme SSD and fio (direct=1, ioengine=libaio, iodepth=128, 
numjobs=4, rw=read, bs=4k), the bandwidth on bare nvme is 1589MiB/s. The 
bandwidth on thin device is only 1274MiB/s, while the four fio threads 
run at 200% CPU and the single worker thread is always runing at 100% 
CPU. perf of worker thread showes that process_bio() consumes 86% of the 
time.

Regards

Jeffle Xu

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread