* [PATCH] fix rand_seed mismatches in verify phase
@ 2014-02-03 21:31 Puthikorn Voravootivat
2014-02-03 22:54 ` Grant Grundler
2014-02-05 17:15 ` Jens Axboe
0 siblings, 2 replies; 12+ messages in thread
From: Puthikorn Voravootivat @ 2014-02-03 21:31 UTC (permalink / raw)
To: Jens Axboe
Cc: FIO List, Gwendal Grignou, Grant Grundler, Puthikorn Voravootivat
In the verify phase, the rand_seed generated on replay does not match
the written rand_seed.
Multiple problems cause this:
1. In the verify phase, fio does not set io_u->rand_seed to compare with
   hdr->rand_seed.
2. In the randrw scenario, the fio log is stored in a red-black tree in
   "sorted by LBA" order. Thus, it is impossible to replay the written
   order, or rather to generate the seeds again in the same order.
3. In the write phase, the code currently generates rand_seed, writes the
   data, and then logs rand_seed. When queue depth > 1, the writes can
   complete in a different order than rand_seed was generated, so when
   replaying the log the generated rand_seed might not match what was
   written.
4. The verify_backlog option starts verification before all the data has
   been written, which breaks the rand_seed replay code under the current
   design.
Proposed fixes:
1. Use the existing verify_state to generate the verify header (assuming
   this was the original intention of verify_state), and add code to
   replay rand_seed in the verify phase.
2. If the verifysort option is not enabled, store the write log in a list
   instead of the red-black tree. Otherwise, don't attempt to verify the
   rand_seed in the header.
3. In the write phase, generate rand_seed, log rand_seed, then write the
   data, i.e. log IO transactions in the order generated, not the order
   completed.
4. Don't verify rand_seed when verify_backlog is enabled.
Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
---
backend.c | 13 +++++++++++++
io_u.c | 6 ------
iolog.c | 2 +-
verify.c | 14 ++++++++++++--
4 files changed, 26 insertions(+), 9 deletions(-)
diff --git a/backend.c b/backend.c
index 93e6632..bf9d066 100644
--- a/backend.c
+++ b/backend.c
@@ -697,6 +697,13 @@ static uint64_t do_io(struct thread_data *td)
*/
if (td->o.verify != VERIFY_NONE && io_u->ddir == DDIR_READ &&
((io_u->flags & IO_U_F_VER_LIST) || !td_rw(td))) {
+
+ if (!td->o.verify_pattern_bytes) {
+ io_u->rand_seed = __rand(&td->__verify_state);
+ if (sizeof(int) != sizeof(long *))
+ io_u->rand_seed *= __rand(&td->__verify_state);
+ }
+
if (td->o.verify_async)
io_u->end_io = verify_io_u_async;
else
@@ -707,6 +714,12 @@ static uint64_t do_io(struct thread_data *td)
else
td_set_runstate(td, TD_RUNNING);
+ if (td_write(td) && io_u->ddir == DDIR_WRITE &&
+ td->o.do_verify &&
+ td->o.verify != VERIFY_NONE &&
+ !td->o.experimental_verify)
+ log_io_piece(td, io_u);
+
ret = td_io_queue(td, io_u);
switch (ret) {
case FIO_Q_COMPLETED:
diff --git a/io_u.c b/io_u.c
index 518d884..f68b213 100644
--- a/io_u.c
+++ b/io_u.c
@@ -1623,12 +1623,6 @@ static void io_completed(struct thread_data *td, struct io_u *io_u,
utime_since_now(&td->start));
}
- if (td_write(td) && idx == DDIR_WRITE &&
- td->o.do_verify &&
- td->o.verify != VERIFY_NONE &&
- !td->o.experimental_verify)
- log_io_piece(td, io_u);
-
icd->bytes_done[idx] += bytes;
if (io_u->end_io) {
diff --git a/iolog.c b/iolog.c
index ec29971..017b235 100644
--- a/iolog.c
+++ b/iolog.c
@@ -209,7 +209,7 @@ void log_io_piece(struct thread_data *td, struct io_u *io_u)
* drop the old one, which we rely on the rb insert/lookup for
* handling.
*/
- if ((!td_random(td) || !td->o.overwrite) &&
+ if (((!td->o.verifysort) || !td_random(td) || !td->o.overwrite) &&
(file_randommap(td, ipo->file) || td->o.verify == VERIFY_NONE)) {
INIT_FLIST_HEAD(&ipo->list);
flist_add_tail(&ipo->list, &td->io_hist_list);
diff --git a/verify.c b/verify.c
index 568bae8..970b867 100644
--- a/verify.c
+++ b/verify.c
@@ -72,10 +72,10 @@ void fill_verify_pattern(struct thread_data *td, void *p, unsigned int len,
if (use_seed)
__fill_random_buf(p, len, seed);
else
- io_u->rand_seed = fill_random_buf(&td->buf_state, p, len);
+ io_u->rand_seed = fill_random_buf(&td->__verify_state, p, len);
return;
}
-
+
if (io_u->buf_filled_len >= len) {
dprint(FD_VERIFY, "using already filled verify pattern b=%d len=%u\n",
td->o.verify_pattern_bytes, len);
@@ -718,6 +718,10 @@ int verify_io_u(struct thread_data *td, struct io_u *io_u)
memswp(p, p + td->o.verify_offset, header_size);
hdr = p;
+ // Make rand_seed check pass when have verifysort or verify_backlog.
+ if (td->o.verifysort || (td->flags & TD_F_VER_BACKLOG))
+ io_u->rand_seed = hdr->rand_seed;
+
ret = verify_header(io_u, hdr);
switch (ret) {
case 0:
@@ -1056,6 +1060,12 @@ int get_next_verify(struct thread_data *td, struct io_u *io_u)
remove_trim_entry(td, ipo);
free(ipo);
dprint(FD_VERIFY, "get_next_verify: ret io_u %p\n", io_u);
+
+ if (!td->o.verify_pattern_bytes) {
+ io_u->rand_seed = __rand(&td->__verify_state);
+ if (sizeof(int) != sizeof(long *))
+ io_u->rand_seed *= __rand(&td->__verify_state);
+ }
return 0;
}
--
1.9.0.rc1.175.g0b1dcb5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-03 21:31 [PATCH] fix rand_seed mismatches in verify phase Puthikorn Voravootivat
@ 2014-02-03 22:54 ` Grant Grundler
2014-02-04 18:54 ` Puthikorn Voravootivat
2014-02-04 18:56 ` Puthikorn Voravootivat
2014-02-05 17:15 ` Jens Axboe
1 sibling, 2 replies; 12+ messages in thread
From: Grant Grundler @ 2014-02-03 22:54 UTC (permalink / raw)
To: Puthikorn Voravootivat
Cc: Jens Axboe, FIO List, Gwendal Grignou, Grant Grundler
On Mon, Feb 3, 2014 at 1:31 PM, Puthikorn Voravootivat
<puthik@chromium.org> wrote:
> In verify phase, the rand_seed generated on replay does not match
> the written rand_seed.
>
> Multiple problems are causing this:
> 1. In verify phase fio does not set io_u->rand_seed to compare with
> hdr->rand_seed
> 2. In randrw scenario, fio log is stored in red-black tree in "sorted by LBA"
> order. Thus, it is imposible to replay the written order, or rather
> generate the seeds again in the same order.
> 3. In write phase, the code currently will generate rand_seed, write data
> and log rand_seed. When queuedepth > 1, it's possible the writes complete
> in a different order than rand_seed was generated. Thus when replaying
> the log, the generated rand_seed might not match what was written.
> 4. verify_backlog option will start verification before all the data has been
> written and it make rand_seed replay code broken with current design.
>
> Proposed fixes:
> 1. Use of existing verify_state to generate verify header.
> (and assumes this was the original intention of verify_state). And also
> adds code to replay rand_seed in verify phase.
> 2. If verifysort option is not enabled, store the write log in a list instead
> of the red-black tree. Otherwise, don't attempt to verify the rand_seed
> in the header.
> 3. In write phase, generate rand_seed, log rand_seed, write data. I.e. log
> IO transactions in the order generated, not completed.
> 4. Don't verify rand_seed when verify_backlog is enabled.
We are still seeing failures with rw=read + verify_backlog=1. Investigating.
Please send any feedback if you have a chance to look at this anyway.
We'll fix the C++ comment. :)
cheers,
grant
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-03 22:54 ` Grant Grundler
@ 2014-02-04 18:54 ` Puthikorn Voravootivat
2014-02-04 19:21 ` Grant Grundler
2014-02-05 18:27 ` Jens Axboe
2014-02-04 18:56 ` Puthikorn Voravootivat
1 sibling, 2 replies; 12+ messages in thread
From: Puthikorn Voravootivat @ 2014-02-04 18:54 UTC (permalink / raw)
To: Grant Grundler
Cc: Puthikorn Voravootivat, Jens Axboe, FIO List, Gwendal Grignou
I tried to run the same test with fio 2.1.3 and saw the same error.
This error always occurs when rw=read or randread and verify_backlog=1
Here is an example job file and error message.
Job file:
[md5-sync-1-0-1-1-1]
filename=test213/fio/md5-sync-1-0-1-1-1
loops=1
direct=0
iodepth=1
ioengine=sync
verify=md5
size=1m
bs=4k
rw=read
verifysort=1
verify_backlog=1
Error message:
verify: bad magic header 869, wanted acca at file
test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-03 22:54 ` Grant Grundler
2014-02-04 18:54 ` Puthikorn Voravootivat
@ 2014-02-04 18:56 ` Puthikorn Voravootivat
1 sibling, 0 replies; 12+ messages in thread
From: Puthikorn Voravootivat @ 2014-02-04 18:56 UTC (permalink / raw)
To: Grant Grundler
Cc: Puthikorn Voravootivat, Jens Axboe, FIO List, Gwendal Grignou
*Resending in plain-text mode
I tried to run the same test with fio 2.1.3 and saw the same error.
This error always occurs when rw=read or randread and verify_backlog=1
Here is an example job file and error message.
Job file:
[md5-sync-1-0-1-1-1]
filename=test213/fio/md5-sync-1-0-1-1-1
loops=1
direct=0
iodepth=1
ioengine=sync
verify=md5
size=1m
bs=4k
rw=read
verifysort=1
verify_backlog=1
Error message:
verify: bad magic header 869, wanted acca at file
test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-04 18:54 ` Puthikorn Voravootivat
@ 2014-02-04 19:21 ` Grant Grundler
2014-02-05 18:27 ` Jens Axboe
1 sibling, 0 replies; 12+ messages in thread
From: Grant Grundler @ 2014-02-04 19:21 UTC (permalink / raw)
To: Puthikorn Voravootivat
Cc: Grant Grundler, Jens Axboe, FIO List, Gwendal Grignou
On Tue, Feb 4, 2014 at 10:54 AM, Puthikorn Voravootivat
<puthik@chromium.org> wrote:
> I tried to run the same test with fio 2.1.3 and saw the same error.
> This error always occurs when rw=read or randread and verify_backlog=1
>
> Here is an example job file and error message.
>
> Job file:
> [md5-sync-1-0-1-1-1]
> filename=test213/fio/md5-sync-1-0-1-1-1
> loops=1
> direct=0
> iodepth=1
> ioengine=sync
> verify=md5
> size=1m
> bs=4k
> rw=read
> verifysort=1
> verify_backlog=1
>
> Error message:
> verify: bad magic header 869, wanted acca at file
> test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
Jens,
To restate the obvious, this is an existing bug in the 2.1.3 release. Can
you please review this patch? I believe it resolves the remaining
issues directly related to rand_seed verification.
My gut feeling (after having spent most of yesterday trying to track
this down) is that there is a race condition in the use of struct io_u
between the verify_backlog and the "workload" thread. In particular, I
was looking at the use of IO_U_F_FREE_DEF to see if an io_u was being
marked free (put_io_u()) prematurely, before the verify_backlog was
done. But I haven't found that (yet?).
All the "high touch" payload verification methods pass, while the "low
touch" (meta) verification fails most (but not all) of the time.
thanks
grant
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-03 21:31 [PATCH] fix rand_seed mismatches in verify phase Puthikorn Voravootivat
2014-02-03 22:54 ` Grant Grundler
@ 2014-02-05 17:15 ` Jens Axboe
2014-02-05 18:16 ` Grant Grundler
1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2014-02-05 17:15 UTC (permalink / raw)
To: Puthikorn Voravootivat; +Cc: FIO List, Gwendal Grignou, Grant Grundler
On Mon, Feb 03 2014, Puthikorn Voravootivat wrote:
> In verify phase, the rand_seed generated on replay does not match
> the written rand_seed.
>
> Multiple problems are causing this:
> 1. In verify phase fio does not set io_u->rand_seed to compare with
> hdr->rand_seed.
> 2. In randrw scenarios, the fio log is stored in a red-black tree in
> "sorted by LBA" order. Thus, it is impossible to replay the written order,
> or rather generate the seeds again in the same order.
> 3. In write phase, the code currently will generate rand_seed, write data
> and log rand_seed. When queue depth > 1, it's possible the writes complete
> in a different order than rand_seed was generated. Thus when replaying
> the log, the generated rand_seed might not match what was written.
> 4. The verify_backlog option will start verification before all the data
> has been written, which breaks the rand_seed replay code in the current design.
>
> Proposed fixes:
> 1. Use the existing verify_state to generate the verify header (assuming
> this was the original intention of verify_state), and add code to replay
> rand_seed in the verify phase.
> 2. If verifysort option is not enabled, store the write log in a list instead
> of the red-black tree. Otherwise, don't attempt to verify the rand_seed
> in the header.
> 3. In write phase, generate rand_seed, log rand_seed, write data. I.e. log
> IO transactions in the order generated, not completed.
> 4. Don't verify rand_seed when verify_backlog is enabled.
Seems like a reasonable solution to it all. Thanks! I'll get this
applied.
--
Jens Axboe
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-05 17:15 ` Jens Axboe
@ 2014-02-05 18:16 ` Grant Grundler
0 siblings, 0 replies; 12+ messages in thread
From: Grant Grundler @ 2014-02-05 18:16 UTC (permalink / raw)
To: Jens Axboe
Cc: Puthikorn Voravootivat, FIO List, Gwendal Grignou, Grant Grundler
On Wed, Feb 5, 2014 at 9:15 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On Mon, Feb 03 2014, Puthikorn Voravootivat wrote:
>> In verify phase, the rand_seed generated on replay does not match
>> the written rand_seed.
>>
>> Multiple problems are causing this:
>> 1. In verify phase fio does not set io_u->rand_seed to compare with
>> hdr->rand_seed.
>> 2. In randrw scenarios, the fio log is stored in a red-black tree in
>> "sorted by LBA" order. Thus, it is impossible to replay the written order,
>> or rather generate the seeds again in the same order.
>> 3. In write phase, the code currently will generate rand_seed, write data
>> and log rand_seed. When queue depth > 1, it's possible the writes complete
>> in a different order than rand_seed was generated. Thus when replaying
>> the log, the generated rand_seed might not match what was written.
>> 4. The verify_backlog option will start verification before all the data
>> has been written, which breaks the rand_seed replay code in the current design.
>>
>> Proposed fixes:
>> 1. Use the existing verify_state to generate the verify header (assuming
>> this was the original intention of verify_state), and add code to replay
>> rand_seed in the verify phase.
>> 2. If verifysort option is not enabled, store the write log in a list instead
>> of the red-black tree. Otherwise, don't attempt to verify the rand_seed
>> in the header.
>> 3. In write phase, generate rand_seed, log rand_seed, write data. I.e. log
>> IO transactions in the order generated, not completed.
>> 4. Don't verify rand_seed when verify_backlog is enabled.
>
> Seems like a reasonable solution to it all. Thanks! I'll get this
> applied.
Awesome - I see that it's applied. Puthikorn has one more fix in
flight and I'm going to suggest something you'll probably reject in
another email.
cheers,
grant
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-04 18:54 ` Puthikorn Voravootivat
2014-02-04 19:21 ` Grant Grundler
@ 2014-02-05 18:27 ` Jens Axboe
2014-02-05 18:31 ` Grant Grundler
1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2014-02-05 18:27 UTC (permalink / raw)
To: Puthikorn Voravootivat; +Cc: Grant Grundler, FIO List, Gwendal Grignou
On Tue, Feb 04 2014, Puthikorn Voravootivat wrote:
> I tried to run the same test with fio 2.1.3 and saw the same error.
> This error always occurs when rw=read or randread and verify_backlog=1
>
> Here is an example job file and error message.
>
> Job file:
> [md5-sync-1-0-1-1-1]
> filename=test213/fio/md5-sync-1-0-1-1-1
> loops=1
> direct=0
> iodepth=1
> ioengine=sync
> verify=md5
> size=1m
> bs=4k
> rw=read
> verifysort=1
> verify_backlog=1
>
> Error message:
> verify: bad magic header 869, wanted acca at file
> test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
This is indeed breakage between 2.1.2 and 2.1.3. I ran a bisect on it,
which reveals:
20876c53b5d32f2da9049af5e7fb102133946981 is the first bad commit
commit 20876c53b5d32f2da9049af5e7fb102133946981
Author: Juan Casse <jcasse@chromium.org>
Date: Mon Sep 16 09:22:10 2013 -0700
Add condition to stop issuing io in do_io().
Problem:
When specifying iodepth > 1 while doing random async io
without a random map (and using the default random generator), the
main while loop in do_io() will issue extra io in the amount of io
allowed to accumulate in the queue. For example, a job with size=24k
and bs=8k will perform 4 io instead of the expected 3.
Reason:
The above behavior occurs because the while loop in do_io()
will continue until the amount of "completed io" >= size, but io
in the queue are not accounted for because they have not been
completed by io_u_queued_complete().
Exceptions:
The above behavior does not manifest itself when:
using random_generator=lfsr because lfsr knows when to stop
generating, and get_io_u() inside the while loop returns NULL,
breaking out of the loop.
using random map because random map ensures that all blocks
are hit, and after they are all hit, no more are generated.
Proposed Solution:
Stop while loop based on bytes issued instead of bytes completed.
Signed-off-by: Juan Casse <jcasse@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
--
Jens Axboe
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-05 18:27 ` Jens Axboe
@ 2014-02-05 18:31 ` Grant Grundler
2014-02-05 19:30 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2014-02-05 18:31 UTC (permalink / raw)
To: Jens Axboe
Cc: Puthikorn Voravootivat, Grant Grundler, FIO List, Gwendal Grignou
On Wed, Feb 5, 2014 at 10:27 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On Tue, Feb 04 2014, Puthikorn Voravootivat wrote:
>> I tried to run the same test with fio 2.1.3 and saw the same error.
>> This error always occurs when rw=read or randread and verify_backlog=1
>>
>> Here is an example job file and error message.
>>
>> Job file:
>> [md5-sync-1-0-1-1-1]
>> filename=test213/fio/md5-sync-1-0-1-1-1
>> loops=1
>> direct=0
>> iodepth=1
>> ioengine=sync
>> verify=md5
>> size=1m
>> bs=4k
>> rw=read
>> verifysort=1
>> verify_backlog=1
>>
>> Error message:
>> verify: bad magic header 869, wanted acca at file
>> test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
>
> This is indeed breakage between 2.1.2 and 2.1.3. I ran a bisect on it,
> which reveals:
>
> 20876c53b5d32f2da9049af5e7fb102133946981 is the first bad commit
> commit 20876c53b5d32f2da9049af5e7fb102133946981
> Author: Juan Casse <jcasse@chromium.org>
> Date: Mon Sep 16 09:22:10 2013 -0700
>
> Add condition to stop issuing io in do_io().
I think Puthik has a fix related to this but I haven't reviewed his
next patch yet. We've been talking about this behavior though.
I'll take another look at this patch to see what it's changing.
thanks,
grant
>
> Problem:
> When specifying iodepth > 1 while doing random async io
> without a random map (and using the default random generator), the
> main while loop in do_io() will issue extra io in the amount of io
> allowed to accumulate in the queue. For example, a job with size=24k
> and bs=8k will perform 4 io instead of the expected 3.
>
> Reason:
> The above behavior occurs because the while loop in do_io()
> will continue until the amount of "completed io" >= size, but io
> in the queue are not accounted for because they have not been
> completed by io_u_queued_complete().
>
> Exceptions:
> The above behavior does not manifest itself when:
> using random_generator=lfsr because lfsr knows when to stop
> generating, and get_io_u() inside the while loop returns NULL,
> breaking out of the loop.
> using random map because random map ensures that all blocks
> are hit, and after they are all hit, no more are generated.
>
> Proposed Solution:
> Stop while loop based on bytes issued instead of bytes completed.
>
> Signed-off-by: Juan Casse <jcasse@chromium.org>
> Reviewed-by: Grant Grundler <grundler@chromium.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
>
> --
> Jens Axboe
>
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-05 18:31 ` Grant Grundler
@ 2014-02-05 19:30 ` Jens Axboe
2014-02-05 21:31 ` Grant Grundler
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2014-02-05 19:30 UTC (permalink / raw)
To: Grant Grundler; +Cc: Puthikorn Voravootivat, FIO List, Gwendal Grignou
On Wed, Feb 05 2014, Grant Grundler wrote:
> On Wed, Feb 5, 2014 at 10:27 AM, Jens Axboe <axboe@kernel.dk> wrote:
> > On Tue, Feb 04 2014, Puthikorn Voravootivat wrote:
> >> I tried to run the same test with fio 2.1.3 and saw the same error.
> >> This error always occurs when rw=read or randread and verify_backlog=1
> >>
> >> Here is an example job file and error message.
> >>
> >> Job file:
> >> [md5-sync-1-0-1-1-1]
> >> filename=test213/fio/md5-sync-1-0-1-1-1
> >> loops=1
> >> direct=0
> >> iodepth=1
> >> ioengine=sync
> >> verify=md5
> >> size=1m
> >> bs=4k
> >> rw=read
> >> verifysort=1
> >> verify_backlog=1
> >>
> >> Error message:
> >> verify: bad magic header 869, wanted acca at file
> >> test213/fio/md5-sync-1-0-1-1-1 offset 585728, length 2290190
> >
> > This is indeed breakage between 2.1.2 and 2.1.3. I ran a bisect on it,
> > it reveals:
> >
> > 20876c53b5d32f2da9049af5e7fb102133946981 is the first bad commit
> > commit 20876c53b5d32f2da9049af5e7fb102133946981
> > Author: Juan Casse <jcasse@chromium.org>
> > Date: Mon Sep 16 09:22:10 2013 -0700
> >
> > Add condition to stop issuing io in do_io().
>
> I think Puthik has a fix related to this but I haven't reviewed his
> next patch yet. We've been talking about this behavior though.
>
> I'll take another look at this patch to see what it's changing.
The below should take care of it. Juan's fix didn't account for verify
exceeding size=.
diff --git a/backend.c b/backend.c
index bf9d066e012a..3ac72e771cb9 100644
--- a/backend.c
+++ b/backend.c
@@ -642,7 +642,7 @@ static uint64_t do_io(struct thread_data *td)
uint64_t bytes_done[DDIR_RWDIR_CNT] = { 0, 0, 0 };
unsigned int i;
int ret = 0;
- uint64_t bytes_issued = 0;
+ uint64_t total_bytes, bytes_issued = 0;
if (in_ramp_time(td))
td_set_runstate(td, TD_RAMP);
@@ -651,6 +651,10 @@ static uint64_t do_io(struct thread_data *td)
lat_target_init(td);
+ total_bytes = td->o.size;
+ if (td->o.verify != VERIFY_NONE && td_write(td))
+ total_bytes += td->o.size;
+
while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) ||
(!flist_empty(&td->trim_list)) || !io_bytes_exceeded(td) ||
td->o.time_based) {
@@ -678,7 +682,7 @@ static uint64_t do_io(struct thread_data *td)
if (flow_threshold_exceeded(td))
continue;
- if (bytes_issued >= (uint64_t) td->o.size)
+ if (bytes_issued >= total_bytes)
break;
io_u = get_io_u(td);
--
Jens Axboe
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-05 19:30 ` Jens Axboe
@ 2014-02-05 21:31 ` Grant Grundler
2014-02-05 21:38 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2014-02-05 21:31 UTC (permalink / raw)
To: Jens Axboe
Cc: Grant Grundler, Puthikorn Voravootivat, FIO List, Gwendal Grignou
On Wed, Feb 5, 2014 at 11:30 AM, Jens Axboe <axboe@kernel.dk> wrote:
...
> The below should take care of it. Juan's fix didn't account for verify
> exceeding size=.
Ah...Juan didn't test with that option and Puthikorn did.
> diff --git a/backend.c b/backend.c
> index bf9d066e012a..3ac72e771cb9 100644
> --- a/backend.c
> +++ b/backend.c
> @@ -642,7 +642,7 @@ static uint64_t do_io(struct thread_data *td)
> uint64_t bytes_done[DDIR_RWDIR_CNT] = { 0, 0, 0 };
> unsigned int i;
> int ret = 0;
> - uint64_t bytes_issued = 0;
> + uint64_t total_bytes, bytes_issued = 0;
>
> if (in_ramp_time(td))
> td_set_runstate(td, TD_RAMP);
> @@ -651,6 +651,10 @@ static uint64_t do_io(struct thread_data *td)
>
> lat_target_init(td);
>
> + total_bytes = td->o.size;
> + if (td->o.verify != VERIFY_NONE && td_write(td))
> + total_bytes += td->o.size;
> +
The amount of verify IO depends on what percentage is write traffic.
Is each thread strictly reader or writer?
I didn't think so or we wouldn't need "td->this_io_bytes[]" for
DDIR_WRITE, DDIR_READ, and DDIR_TRIM at the same time.
> while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) ||
> (!flist_empty(&td->trim_list)) || !io_bytes_exceeded(td) ||
I've gotten a bit hung up on what io_bytes_exceeded() uses
(td->this_io_bytes[]), bytes_done[], and bytes_issued. Seems like we
have three different mechanisms doing essentially the same thing. Can
I rip two of them out and just count "issued IO" for WRITE/READ/TRIM?
"io_bytes_exceeded()" is the only place I would expect this logic to
live in this loop. Having additional "break" in the middle of the loop
implies we've broken the original design.
cheers,
grant
> td->o.time_based) {
> @@ -678,7 +682,7 @@ static uint64_t do_io(struct thread_data *td)
> if (flow_threshold_exceeded(td))
> continue;
>
> - if (bytes_issued >= (uint64_t) td->o.size)
> + if (bytes_issued >= total_bytes)
> break;
>
> io_u = get_io_u(td);
>
> --
> Jens Axboe
>
* Re: [PATCH] fix rand_seed mismatches in verify phase
2014-02-05 21:31 ` Grant Grundler
@ 2014-02-05 21:38 ` Jens Axboe
0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2014-02-05 21:38 UTC (permalink / raw)
To: Grant Grundler; +Cc: Puthikorn Voravootivat, FIO List, Gwendal Grignou
On Wed, Feb 05 2014, Grant Grundler wrote:
> On Wed, Feb 5, 2014 at 11:30 AM, Jens Axboe <axboe@kernel.dk> wrote:
> ...
> > The below should take care of it. Juan's fix didn't account for verify
> > exceeding size=.
>
> Ah...Juan didn't test with that option and Puthikorn did.
>
> > diff --git a/backend.c b/backend.c
> > index bf9d066e012a..3ac72e771cb9 100644
> > --- a/backend.c
> > +++ b/backend.c
> > @@ -642,7 +642,7 @@ static uint64_t do_io(struct thread_data *td)
> > uint64_t bytes_done[DDIR_RWDIR_CNT] = { 0, 0, 0 };
> > unsigned int i;
> > int ret = 0;
> > - uint64_t bytes_issued = 0;
> > + uint64_t total_bytes, bytes_issued = 0;
> >
> > if (in_ramp_time(td))
> > td_set_runstate(td, TD_RAMP);
> > @@ -651,6 +651,10 @@ static uint64_t do_io(struct thread_data *td)
> >
> > lat_target_init(td);
> >
> > + total_bytes = td->o.size;
> > + if (td->o.verify != VERIFY_NONE && td_write(td))
> > + total_bytes += td->o.size;
> > +
>
> The amount of verify IO depends on what percentage is write traffic.
> Is each thread strictly reader or writer?
It does indeed. Note I updated this to depend on verify_backlog as well,
since that is the important bit. Each thread can have any mix of
read/write/trim.
> I didn't think so or we wouldn't need "td->this_io_bytes[]" for
> DDIR_WRITE, DDIR_READ, and DDIR_TRIM at the same time.
>
> > while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) ||
> > (!flist_empty(&td->trim_list)) || !io_bytes_exceeded(td) ||
>
> I've gotten a bit hung up on what io_bytes_exceeded() uses (
> td->this_io_bytes[]), bytes_done[], and bytes_issued. Seems like we
> have three different mechanisms to do essentially the same thing. Can
> I rip two of them out and just count "issued IO" for WRITE/READ/TRIM?
this_io_bytes[] is per loop.
>
> "io_bytes_exceeded()" is the only place I would expect this logic to
> live in this loop. Having additional "break" in the middle of the loop
> implies we've broken the original design.
The fix is a bit of a hack, but the original commit was a hack too. It
would be a lot better to track this explicitly. For now it's just an
upper cap; normally we expect other conditions to terminate the loop.
So I'd welcome a cleanup in this area for sure.
--
Jens Axboe
end of thread, other threads:[~2014-02-05 21:38 UTC | newest]
Thread overview: 12+ messages
2014-02-03 21:31 [PATCH] fix rand_seed mismatches in verify phase Puthikorn Voravootivat
2014-02-03 22:54 ` Grant Grundler
2014-02-04 18:54 ` Puthikorn Voravootivat
2014-02-04 19:21 ` Grant Grundler
2014-02-05 18:27 ` Jens Axboe
2014-02-05 18:31 ` Grant Grundler
2014-02-05 19:30 ` Jens Axboe
2014-02-05 21:31 ` Grant Grundler
2014-02-05 21:38 ` Jens Axboe
2014-02-04 18:56 ` Puthikorn Voravootivat
2014-02-05 17:15 ` Jens Axboe
2014-02-05 18:16 ` Grant Grundler