* Compression options in fio
@ 2016-06-27 20:41 Srinath Krishna Ananthakrishnan
  2016-06-28  5:16 ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Srinath Krishna Ananthakrishnan @ 2016-06-27 20:41 UTC (permalink / raw)
  To: fio

Hi,

I'm fiddling around with some compression options in fio, but I can't
seem to get them working.

From the fio manual, I see that the following options are supported:

buffer_compress_percentage, buffer_compress_chunk, buffer_pattern

I have the following workload configuration:

[global]
randrepeat=1
ioengine=sync
iodepth=64
iodepth_batch=16
direct=1
runtime=5
time_based=1
numjobs=1
verify_fatal=1
verify_dump=1
filename=./my_file

[small_write]
rw=rw
blocksize=4k
size=100M
verify=crc32c-intel
buffer_compress_percentage=50
buffer_pattern="abcd"

With this, I don't see fio generating compressible data with the
pattern; it still reverts to generating random data. Is there
anything I'm missing?

Thanks,
Srinath


* Re: Compression options in fio
  2016-06-27 20:41 Compression options in fio Srinath Krishna Ananthakrishnan
@ 2016-06-28  5:16 ` Sitsofe Wheeler
  2016-06-30 17:58   ` Srinath Krishna Ananthakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2016-06-28  5:16 UTC (permalink / raw)
  To: Srinath Krishna Ananthakrishnan; +Cc: fio

On 27 June 2016 at 21:41, Srinath Krishna Ananthakrishnan <ska@datera.io> wrote:
>
> [small_write]
> rw=rw
> blocksize=4k
> size=100M
> verify=crc32c-intel
> buffer_compress_percentage=50
> buffer_pattern="abcd"
>
> With this, I don't see fio generating compressible data with the
> pattern; it still reverts to generating random data. Is there
> anything I'm missing?

You're forcing the pattern of data that must be used via
buffer_pattern, so aren't you taking away fio's ability to vary the
data? Do you get a different result if you don't specify
buffer_pattern?

-- 
Sitsofe | http://sucs.org/~sits/


* Re: Compression options in fio
  2016-06-28  5:16 ` Sitsofe Wheeler
@ 2016-06-30 17:58   ` Srinath Krishna Ananthakrishnan
  2016-06-30 18:35     ` Srinath Krishna Ananthakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Srinath Krishna Ananthakrishnan @ 2016-06-30 17:58 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

I don't want fio to vary the data and need it to be more deterministic.

After playing with it for quite some time, I have the following
(inconclusive) theory.

1. If no verify options are set, fio generates
buffer_compress_percentage worth of compressible data per block.
The compressible data is always zeroed out.
2. With verify options, fio generates buffer_compress_percentage worth
of compressible data (zeroes) for some blocks, but from time to time
there are a bunch of blocks that contain purely random data.

According to the manual, the verify options add metadata to the header
of every block for verification, but I'm not sure why some blocks turn
out to be completely random with this setting on. I tried multiple
verify hashes with the same result.

In either case, I can't get the buffer_pattern setting to work.
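
One way to test a theory like this without eyeballing hexdump output is
to measure, block by block, how much of the data is the expected fill.
What follows is only a minimal Python sketch, not anything fio
provides: the helper names pattern_fraction and summarize are made up
for illustration, and it assumes the fill pattern repeats back to back
inside a block (pass b"\x00" to look for zeroes).

```python
def pattern_fraction(block: bytes, pattern: bytes) -> float:
    """Fraction of `block` covered by back-to-back copies of `pattern`.

    The pattern may start at any byte offset, so every alignment is
    tried and the best one is reported.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    if not block:
        return 0.0
    n = len(pattern)
    best_hits = 0
    for offset in range(n):
        hits = sum(
            1
            for i in range(offset, len(block) - n + 1, n)
            if block[i:i + n] == pattern
        )
        best_hits = max(best_hits, hits)
    return best_hits * n / len(block)


def summarize(data: bytes, block_size: int, pattern: bytes) -> list:
    """Return one pattern fraction per `block_size` chunk of `data`."""
    return [
        pattern_fraction(data[i:i + block_size], pattern)
        for i in range(0, len(data), block_size)
    ]


# Synthetic demo data (an assumption, not real fio output): one 4k block
# that is half filler and half pattern, then one all-zero 4k block.
PATTERN = bytes.fromhex("deadbeef")
filler = bytes(range(256)) * 8            # 2048 bytes with no deadbeef in it
demo = filler + PATTERN * 512 + bytes(4096)

for index, fraction in enumerate(summarize(demo, 4096, PATTERN)):
    print(f"block {index}: {fraction:.0%} pattern")
```

To run it against a real file, read the bytes with
open("./my_file", "rb").read() and pass the job's blocksize (4096 for
blocksize=4k). Blocks that report a fraction near zero would be the
purely random ones described in point 2.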


Thanks,
Srinath



* Re: Compression options in fio
  2016-06-30 17:58   ` Srinath Krishna Ananthakrishnan
@ 2016-06-30 18:35     ` Srinath Krishna Ananthakrishnan
  2016-06-30 20:43       ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Srinath Krishna Ananthakrishnan @ 2016-06-30 18:35 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

Another idiosyncrasy I observed: with rw=write, the compression options
are not heeded. They seem to work only with rw=rw.

Thanks,
Srinath



* Re: Compression options in fio
  2016-06-30 18:35     ` Srinath Krishna Ananthakrishnan
@ 2016-06-30 20:43       ` Sitsofe Wheeler
  2016-06-30 23:23         ` Srinath Krishna Ananthakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2016-06-30 20:43 UTC (permalink / raw)
  To: Srinath Krishna Ananthakrishnan; +Cc: fio

Could you bottom post replies? It makes things a bit easier here...

> On Thu, Jun 30, 2016 at 10:58 AM, Srinath Krishna Ananthakrishnan
> <ska@datera.io> wrote:
>> I don't want fio to vary the data and need it to be more deterministic.

Don't forget that you can also use randseed to generate the same
random data between invocations.

>> After playing with it for quite some time, I have the following
>> (inconclusive) theory.
>>
>> 1. If no verify options are set, fio generates
>> buffer_compress_percentage worth of compressible data per block size.
>> Compressible data is always zeroed out.
>> 2. With verify options, fio generates buffer_compress_percentage worth
>> of compressible data (zeroes) for some blocks, but from time to time
>> there are a bunch of blocks that contain purely random data.

You can inspect the source code directly (e.g.
https://github.com/axboe/fio/blob/8c07860de982fabaaf41d44c22aa86aba2539b58/io_u.c#L2031
) to find out what's happening.

>> According to the manual, the verify options add metadata to the header
>> of every block for verification, but I'm not sure why some blocks turn
>> out to be completely random with this setting on. I tried multiple
>> verify hashes with the same result.
>>
>> In either case, I can't get the buffer_pattern setting to work.

I don't quite see the behaviour you describe. I used the following
with the latest git fio on x86_64:

[global]
randrepeat=1
ioengine=sync
iodepth=64
iodepth_batch=16
direct=1
numjobs=1
verify_fatal=1
verify_dump=1
filename=./my_file

[small_write]
rw=write
blocksize=4k
size=100M
verify=crc32c-intel
buffer_compress_percentage=50
buffer_pattern=0xdeadbeef

Using hexdump showed an initial header followed by random data up to
0x800 and from 0x801 to 0xfff was the pattern deadbeef (in binary). So
50% of the block was the header + random and the other 50% was the
buffer_pattern I specified. Bear in mind that if something were to
compress these blocks it would likely get better than 50% compression
because the repeating deadbeef pattern is itself highly
compressible...
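
To get a feel for how such a block compresses, here is a rough Python
sketch that builds a synthetic 4k block laid out like the one described
(random first half, repeating deadbeef second half) and runs it through
zlib. It is an approximation under stated assumptions, not fio's
generator: the verify header is left out, the pattern bytes are assumed
to appear in the order de ad be ef, and make_block is a hypothetical
helper name. The seeded PRNG only stands in for the idea behind
randseed; it does not reproduce fio's random stream.

```python
import random
import zlib

BLOCK_SIZE = 4096                      # matches blocksize=4k
PATTERN = bytes.fromhex("deadbeef")    # assumed on-disk byte order


def make_block(compress_percentage: int, seed: int = 0) -> bytes:
    """Build one block: random bytes first, then the repeating pattern.

    `compress_percentage` is the share of the block filled with the
    pattern, mirroring the intent of buffer_compress_percentage.
    """
    rng = random.Random(seed)          # same seed -> same "random" bytes
    pattern_len = BLOCK_SIZE * compress_percentage // 100
    random_len = BLOCK_SIZE - pattern_len
    random_part = bytes(rng.getrandbits(8) for _ in range(random_len))
    repeats = -(-pattern_len // len(PATTERN))   # ceiling division
    pattern_part = (PATTERN * repeats)[:pattern_len]
    return random_part + pattern_part


block = make_block(50)
ratio = len(zlib.compress(block)) / len(block)
print(f"compressed size is {ratio:.1%} of the original block")
```

The pattern half should shrink to almost nothing, so the compressed
size is dominated by the random half. A real compressing target may use
a different algorithm and chunk size, so treat the number as indicative
only.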

On 30 June 2016 at 19:35, Srinath Krishna Ananthakrishnan <ska@datera.io> wrote:
> Another idiosyncrasy I observed: with rw=write, the compression options
> are not heeded. They seem to work only with rw=rw.
>
> Thanks,
> Srinath

See the above job file where I used rw=write. Are you using a git
version of fio?

-- 
Sitsofe | http://sucs.org/~sits/


* Re: Compression options in fio
  2016-06-30 20:43       ` Sitsofe Wheeler
@ 2016-06-30 23:23         ` Srinath Krishna Ananthakrishnan
  0 siblings, 0 replies; 6+ messages in thread
From: Srinath Krishna Ananthakrishnan @ 2016-06-30 23:23 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

On Thu, Jun 30, 2016 at 1:43 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> Could you bottom post replies? It makes things a bit easier here...
>

Sorry about that.

>> On Thu, Jun 30, 2016 at 10:58 AM, Srinath Krishna Ananthakrishnan
>> <ska@datera.io> wrote:
>>> I don't want fio to vary the data and need it to be more deterministic.
>
> Don't forget that you can also use randseed to generate the same
> random data between invocations.
>
>>> After playing with it for quite some time, I have the following
>>> (inconclusive) theory.
>>>
>>> 1. If no verify options are set, fio generates
>>> buffer_compress_percentage worth of compressible data per block size.
>>> Compressible data is always zeroed out.
>>> 2. With verify options, fio generates buffer_compress_percentage worth
>>> of compressible data (zeroes) for some blocks, but from time to time
>>> there are a bunch of blocks that contain purely random data.
>
> You can inspect the source code directly (e.g.
> https://github.com/axboe/fio/blob/8c07860de982fabaaf41d44c22aa86aba2539b58/io_u.c#L2031
> ) to find out what's happening.
>
>>> According to the manual, the verify options add metadata to the header
>>> of every block for verification, but I'm not sure why some blocks turn
>>> out to be completely random with this setting on. I tried multiple
>>> verify hashes with the same result.
>>>
>>> In either case, I can't get the buffer_pattern setting to work.
>
> I don't quite see the behaviour you describe. I used the following
> with the latest git fio on x86_64:
> [global]
> randrepeat=1
> ioengine=sync
> iodepth=64
> iodepth_batch=16
> direct=1
> numjobs=1
> verify_fatal=1
> verify_dump=1
> filename=./my_file
>
> [small_write]
> rw=write
> blocksize=4k
> size=100M
> verify=crc32c-intel
> buffer_compress_percentage=50
> buffer_pattern=0xdeadbeef
>
> Using hexdump showed an initial header followed by random data up to
> 0x800 and from 0x801 to 0xfff was the pattern deadbeef (in binary). So
> 50% of the block was the header + random and the other 50% was the
> buffer_pattern I specified. Bear in mind that if something were to
> compress these blocks it would likely get better than 50% compression
> because the repeating deadbeef pattern is itself highly
> compressible...
>
> On 30 June 2016 at 19:35, Srinath Krishna Ananthakrishnan <ska@datera.io> wrote:
>> Another idiosyncrasy I observed: with rw=write, the compression options
>> are not heeded. They seem to work only with rw=rw.
>>
>> Thanks,
>> Srinath
>
> See the above job file where I used rw=write. Are you using a git
> version of fio?

I just tried the tip of the fio branch and it works. The fio
version I was using was 2.1.10, which is quite old. I believe these
issues have been addressed in more recent versions. Thanks,
Sitsofe.
