From: Brian Fallik <bfallik@bamboom.com>
To: Jens Axboe <JAxboe@fusionio.com>
Cc: Jeff Moyer <jmoyer@redhat.com>,
	"fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: fio file test patterns
Date: Wed, 31 Aug 2011 14:34:22 -0600
Message-ID: <CAOYTo+3JyNsxMf+UO5C3N=rrtkGR999XM9Guo5-0TXFqSuUr+Q@mail.gmail.com>
In-Reply-To: <4E5E91D9.4000702@fusionio.com>

Hi,

Hmm.  I'm still unable to generate random test data using fio.   My
new job control file is:
  [foo0]
  rate=,200k
  rw=write
  refill_buffers=1
  size=4m

  [foo1]
  rate=,200k
  rw=write
  refill_buffers=1
  size=4m
But:
  $ diff foo0.1.0 foo1.2.0
reports they're identical.  I also tried enabling data verification by
adding "verify=crc32c-intel" but that had no effect.  Am I missing
something obvious or is this potentially broken in fio 1.57?
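
For a finer-grained check than diff, the comparison can be done per block. The following is a minimal sketch, not part of fio: it hashes each file in fixed-size blocks and reports how many blocks are unique across all the files given. The 4096-byte block size is an assumption (it matches fio's default bs of 4k), and the helper names block_digests and duplicate_report are hypothetical.

```python
import hashlib
import sys
from collections import Counter

BLOCK_SIZE = 4096  # assumption: matches fio's default block size (bs=4k)


def block_digests(path, block_size=BLOCK_SIZE):
    """Return one SHA-1 hex digest per block_size chunk of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha1(block).hexdigest())
    return digests


def duplicate_report(paths, block_size=BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) across all given files."""
    counts = Counter()
    for path in paths:
        counts.update(block_digests(path, block_size))
    return sum(counts.values()), len(counts)


if __name__ == "__main__":
    # Usage: python blockdups.py foo0.1.0 foo1.2.0
    total, unique = duplicate_report(sys.argv[1:])
    print(f"{total} blocks, {unique} unique")
```

If the unique count is well below the total, the data repeats either within a file or between files, which is exactly what a deduplicating target could exploit.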

I also have another (maybe related?) question.  Apologies if this
belongs in a separate thread, but are there any notes explaining why
fio lays out the files before starting sequential writes?  The
workload I was hoping to simulate is sustained, sequential writes to
disk.  I'm trying to answer the question "How many simultaneous
200kBps writers can we support?"  Using my current jobs file, fio
starts by creating the files (e.g. "foo0: Laying out IO file(s) (1
file(s) / 4MB)") before it starts processing.  However, creating the
files in advance accounts for a chunk of performance that doesn't seem
to be measured by fio.  Am I misunderstanding how to configure fio or
its intended usage?
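
To put numbers on the workload described above, here is a rough back-of-the-envelope sketch. It assumes fio's "k" and "m" suffixes are powers of two (1024-based), and the writer count of 50 is purely hypothetical, chosen only to illustrate the arithmetic.

```python
KIB = 1024        # assumption: fio's "k" suffix means 1024 bytes
MIB = KIB * KIB   # assumption: fio's "m" suffix means 1024 * 1024 bytes


def job_runtime_seconds(size_bytes, rate_bytes_per_sec):
    """Time one rate-limited writer needs to write size_bytes."""
    return size_bytes / rate_bytes_per_sec


def aggregate_rate(writers, rate_bytes_per_sec):
    """Combined write rate of several identical rate-limited writers."""
    return writers * rate_bytes_per_sec


rate = 200 * KIB  # rate=,200k from the job file
size = 4 * MIB    # size=4m from the job file

# Each job finishes in about 20 seconds at this size and rate.
print(job_runtime_seconds(size, rate))    # 20.48

# Hypothetical: 50 such writers together, expressed in MiB/s.
print(aggregate_rate(50, rate) / MIB)     # 9.765625
```

With size=4m each writer runs for only about 20 seconds, so the time spent laying out the files beforehand may be large relative to the measured run.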

Thanks,
brian


On Wed, Aug 31, 2011 at 3:56 PM, Jens Axboe <jaxboe@fusionio.com> wrote:
> On 2011-08-31 12:30, Jeff Moyer wrote:
>> Brian Fallik <bfallik@bamboom.com> writes:
>>
>>> Hi,
>>>
>>> Apologies if this is documented somewhere else but I couldn't find it
>>> in the fio man page, example job files, or list archives.
>>>
>>> I'm exploring fio as a testing tool and it seems very well suited for
>>> my needs.  I'm currently running experiments with N sequential writers
>>> all writing at 200k.  The jobs file is very simple:
>>>     [global]
>>>     size=10m
>>>     directory=.
>>>
>>>     [foo1]
>>>     rw=write
>>>     rate=200k
>>>
>>>     [foo2]
>>>     ...
>>> fio creates various foo* files as part of its test but they all seem
>>> to contain the same content.  I would have expected fio to generate
>>> random data in each file to avoid potential optimizations like
>>> deduplication.   Am I missing the flag to generate random test
>>> patterns or is this behavior intentional?
>>
>>        refill_buffers
>>               If this option is given, fio will refill the IO buffers on every
>>               submit. The default is to only fill it at init time and reuse
>>               that data. Only makes sense if zero_buffers isn't specified,
>>               naturally. If data verification is enabled, refill_buffers is
>>               also automatically enabled.
>
> Yes. Fio does use random data by default, but to avoid slowing down
> too much, it also defaults to reusing the same random data all the time.
> If you set the above option, you get fully fresh random data for every
> write, thus fully defeating any de-dupe/compression attempts on the
> target.
>
> --
> Jens Axboe
>
>

