* Accounting for partially filled blocks when modeling workloads
@ 2021-02-17  9:36 graf wasili
From: graf wasili @ 2021-02-17  9:36 UTC
  To: fio

Hello everybody,

First of all, I want to thank Jens and everybody who has put their time
into this great project :)

I recently started using fio to benchmark hard disks and possible
storage setups, and to learn more about how the layers involved work.
I have now run into a question.
I'm sorry if I'm missing something here - my knowledge of the topic is 
certainly lacking...  However, any help is appreciated.

On an ext4 file system two files cannot share the same block, so
reading or writing a lot of small files transfers considerably more
data from or to disk than the files actually contain.
I want to take this into account when modelling workloads with fio, but 
issuing a command like

> fio --name test --rw=randread --blocksize=4096 --size=1m --nrfiles=100 
> --directory=path/on/ext4-formated/hard-disk/

will make fio read only 800KiB, i.e. 200 blocks. Each file occupies
roughly 2.5 blocks (1MiB / 100 files is about 10485 bytes), so fio reads
two full blocks per file and the partially filled third block of each
file is simply ignored.
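
As a rough sanity check of these numbers, here is a small Python sketch. It assumes that fio splits --size evenly across the files and issues only whole-block reads within each file, which is consistent with the 800KiB observed. The function name whole_block_io is made up for illustration and is not part of fio.

```python
# Sketch: reproduce the 800 KiB / 200 blocks figure for the command above.
# Assumption: each file gets total_size // nrfiles bytes and fio reads
# only whole blocks from it.

def whole_block_io(total_size, nrfiles, block_size):
    """Return (file_size, blocks_read, bytes_read, blocks_occupied)."""
    file_size = total_size // nrfiles
    full_blocks = file_size // block_size        # blocks fio reads per file
    occupied = -(-file_size // block_size)       # ceiling division: blocks on disk
    blocks_read = full_blocks * nrfiles
    return file_size, blocks_read, blocks_read * block_size, occupied * nrfiles

file_size, blocks_read, bytes_read, blocks_occupied = whole_block_io(
    1024 * 1024, 100, 4096
)
print("bytes per file:    ", file_size)
print("blocks read:       ", blocks_read)
print("KiB read:          ", bytes_read // 1024)
print("blocks on disk:    ", blocks_occupied)
```

Under these assumptions 100 of the 300 occupied blocks are never touched by the job.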


Looking into a possible solution, I figured I could pass the option

> bssplit=2048/p:4096/(100-p)

with p, the percentage of I/Os that hit a partially filled block, given by

> p = 100 / ceil[(file size) / (block size)]

to simulate the overhead for reads/writes of partially occupied blocks 
and get a meaningful throughput number for this kind of scenario 
(assuming that in reality (file size) mod (block size) is uniformly 
distributed, so that a partial block holds half a block, 2048 bytes, on 
average).
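
To make the idea concrete, here is a minimal Python sketch that builds such an option string. As far as I understand the fio documentation, bssplit takes blocksize/percentage pairs separated by colons, so the share of half-block I/Os is expressed as a whole-number percentage here. The helper name bssplit_for is hypothetical, and the sketch assumes every file spans at least two blocks.

```python
import math

def bssplit_for(file_size, block_size):
    """Build a bssplit value that models one partial block per file.

    Assumption: the last block of each file is half full on average,
    so it is modelled as an I/O of block_size // 2 bytes.
    """
    blocks = math.ceil(file_size / block_size)   # blocks occupied per file
    if blocks < 2:
        raise ValueError("sketch assumes files spanning at least two blocks")
    partial_pct = round(100 / blocks)            # share of I/Os on the partial block
    return f"{block_size // 2}/{partial_pct}:{block_size}/{100 - partial_pct}"

# 10485-byte files on 4096-byte blocks occupy 3 blocks each.
print("bssplit=" + bssplit_for(10485, 4096))
```

The two percentages are derived from one rounded value so that they always add up to 100.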


Does this make sense? Are there maybe better solutions?

Thanks
gw

