* Meta verification regression starting with fio 2.1.5
@ 2014-05-06 18:25 Stoo Davies
  2014-05-06 21:59 ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Stoo Davies @ 2014-05-06 18:25 UTC (permalink / raw)
  To: fio

I'm doing some powerfail recovery testing on a storage array over iSCSI.
Host is RHEL 6.4 kernel 2.6.32-358.el6.x86_64.

With fio 2.1.2 -> 2.1.4 the job file below rides through the disks going 
away, and continues I/O after they come back, without reporting any errors.
With fio 2.1.5 -> 2.1.8 when the disks come back fio immediately reports 
a meta verification error.

I captured a trace with a Finisar analyzer, and can see that after the 
disks come back and the host logs back in, a read is issued for an LBA 
which was never written to.
Since I don't see verification errors outside of the powerfail testing, 
I suspect fio isn't correctly handling failed writes during the time the 
disks are unavailable.

The trace file is rather large, but I can make it available if you need 
to see it.

[whee]
bs=8k
thread=4
time_based=1
runtime=864000
readwrite=randrw
direct=1
iodepth=128
ioengine=libaio
size=100%
verify=meta
do_verify=1
verify_fatal=1
verify_dump=1
verify_backlog=8192
buffer_compress_percentage=95
ignore_error=ENODEV:EIO,ENODEV:EIO,ENODEV:EIO
filename=/dev/mapper/lun0
.
.
filename=/dev/mapper/lun9



* Re: Meta verification regression starting with fio 2.1.5
  2014-05-06 18:25 Meta verification regression starting with fio 2.1.5 Stoo Davies
@ 2014-05-06 21:59 ` Jens Axboe
  2014-05-07  0:35   ` Stoo Davies
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2014-05-06 21:59 UTC (permalink / raw)
  To: Stoo Davies, fio

On 05/06/2014 12:25 PM, Stoo Davies wrote:
> I'm doing some powerfail recovery testing on a storage array over iSCSI.
> Host is RHEL 6.4 kernel 2.6.32-358.el6.x86_64.
> 
> With fio 2.1.2 -> 2.1.4 the job file below rides through the disks going
> away, and continues I/O after they come back, without reporting any errors.
> With fio 2.1.5 -> 2.1.8 when the disks come back fio immediately reports
> a meta verification error.
> 
> I captured a trace with a Finisar analyzer, and can see that after the
> disks come back and the host logs back in, a read is issued for an LBA
> which was never written to.
> Since I don't see verification errors outside of the powerfail testing,
> I suspect fio isn't correctly handling failed writes during the time the
> disks are unavailable.
> 
> The trace file is rather large, but I can make it available if you need
> to see it.
> 
> [whee]
> bs=8k
> thread=4
> time_based=1
> runtime=864000
> readwrite=randrw
> direct=1
> iodepth=128
> ioengine=libaio
> size=100%
> verify=meta
> do_verify=1
> verify_fatal=1
> verify_dump=1
> verify_backlog=8192
> buffer_compress_percentage=95
> ignore_error=ENODEV:EIO,ENODEV:EIO,ENODEV:EIO
> filename=/dev/mapper/lun0
> .
> .
> filename=/dev/mapper/lun9

2.1.5 did indeed change when the IO was logged for verification, so that
does explain why it fails for you now. That's a problem.

Can you try with this patch? I'm not going to commit it yet, I want to
carefully audit all paths to ensure we also unlog or trim an io_piece,
if we don't fully complete it.

-- 
Jens Axboe


[-- Attachment #2: unlog.patch --]
[-- Type: text/x-patch, Size: 2301 bytes --]

diff --git a/backend.c b/backend.c
index 9deef284e36b..e6a47716094f 100644
--- a/backend.c
+++ b/backend.c
@@ -780,6 +780,7 @@ static uint64_t do_io(struct thread_data *td)
 		case FIO_Q_COMPLETED:
 			if (io_u->error) {
 				ret = -io_u->error;
+				unlog_io_piece(td, io_u);
 				clear_io_u(td, io_u);
 			} else if (io_u->resid) {
 				int bytes = io_u->xfer_buflen - io_u->resid;
@@ -830,6 +831,7 @@ sync_done:
 			bytes_issued += io_u->xfer_buflen;
 			break;
 		case FIO_Q_BUSY:
+			unlog_io_piece(td, io_u);
 			requeue_io_u(td, &io_u);
 			ret2 = td_io_commit(td);
 			if (ret2 < 0)
diff --git a/io_u.c b/io_u.c
index 4b0b5a7a11bd..e132fd9d2d98 100644
--- a/io_u.c
+++ b/io_u.c
@@ -1622,8 +1622,15 @@ static void io_completed(struct thread_data *td, struct io_u *io_u,
 	 * Mark IO ok to verify
 	 */
 	if (io_u->ipo) {
-		io_u->ipo->flags &= ~IP_F_IN_FLIGHT;
-		write_barrier();
+		/*
+		 * Remove errored entry from the verification list
+		 */
+		if (io_u->error)
+			unlog_io_piece(td, io_u);
+		else {
+			io_u->ipo->flags &= ~IP_F_IN_FLIGHT;
+			write_barrier();
+		}
 	}
 
 	td_io_u_unlock(td);
diff --git a/iolog.c b/iolog.c
index f49895929c34..33ec07afed4e 100644
--- a/iolog.c
+++ b/iolog.c
@@ -268,6 +268,23 @@ restart:
 	td->io_hist_len++;
 }
 
+void unlog_io_piece(struct thread_data *td, struct io_u *io_u)
+{
+	struct io_piece *ipo = io_u->ipo;
+
+	if (!ipo)
+		return;
+
+	if (ipo->flags & IP_F_ONRB)
+		rb_erase(&ipo->rb_node, &td->io_hist_tree);
+	else if (ipo->flags & IP_F_ONLIST)
+		flist_del(&ipo->list);
+
+	free(ipo);
+	io_u->ipo = NULL;
+	td->io_hist_len--;
+}
+
 void write_iolog_close(struct thread_data *td)
 {
 	fflush(td->iolog_f);
diff --git a/iolog.h b/iolog.h
index 50d09e26bfbe..8b1d5880da74 100644
--- a/iolog.h
+++ b/iolog.h
@@ -110,6 +110,7 @@ extern void log_io_u(struct thread_data *, struct io_u *);
 extern void log_file(struct thread_data *, struct fio_file *, enum file_log_act);
 extern int __must_check init_iolog(struct thread_data *td);
 extern void log_io_piece(struct thread_data *, struct io_u *);
+extern void unlog_io_piece(struct thread_data *, struct io_u *);
 extern void queue_io_piece(struct thread_data *, struct io_piece *);
 extern void prune_io_piece_log(struct thread_data *);
 extern void write_iolog_close(struct thread_data *);
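
As a reading aid for the patch above, here is a minimal, self-contained
sketch of the same idea: a write is logged for verification when it is
issued, and the entry is removed again if the write later fails, so the
verifier never reads back a block that was not written. This is not fio's
code. Every name in it (struct piece, struct history, log_piece,
unlog_piece, complete_write, demo_powerfail) is hypothetical, and a plain
doubly linked list stands in for fio's rbtree/flist history.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an io_piece: one logged write awaiting verify. */
struct piece {
	uint64_t offset;
	uint64_t len;
	int in_flight;			/* set until the write completes OK */
	struct piece *prev, *next;
};

/* Hypothetical stand-in for the per-thread verification history. */
struct history {
	struct piece *head;
	size_t count;
};

/* Log a write at issue time, before its outcome is known. */
static struct piece *log_piece(struct history *h, uint64_t offset, uint64_t len)
{
	struct piece *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;

	p->offset = offset;
	p->len = len;
	p->in_flight = 1;
	p->next = h->head;
	if (h->head)
		h->head->prev = p;
	h->head = p;
	h->count++;
	return p;
}

/* Remove a logged write, then clear the caller's pointer to it. */
static void unlog_piece(struct history *h, struct piece **pp)
{
	struct piece *p = *pp;

	if (!p)
		return;

	if (p->prev)
		p->prev->next = p->next;
	else
		h->head = p->next;
	if (p->next)
		p->next->prev = p->prev;

	free(p);
	*pp = NULL;
	h->count--;
}

/* On completion: drop errored writes, mark good ones ready to verify. */
static void complete_write(struct history *h, struct piece **pp, int error)
{
	if (!*pp)
		return;

	if (error)
		unlog_piece(h, pp);
	else
		(*pp)->in_flight = 0;
}

/* Issue two writes, fail the second, report how many entries remain. */
size_t demo_powerfail(void)
{
	struct history h = { NULL, 0 };
	struct piece *a = log_piece(&h, 0, 8192);
	struct piece *b = log_piece(&h, 8192, 8192);
	size_t left;

	complete_write(&h, &a, 0);	/* succeeds: stays logged */
	complete_write(&h, &b, 5);	/* fails (nonzero error): unlogged */

	left = h.count;
	unlog_piece(&h, &a);		/* tidy up the surviving entry */
	return left;
}
```

In this toy model only the write that completed without error is left in
the history, which is the property the powerfail test depends on.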


* Re: Meta verification regression starting with fio 2.1.5
  2014-05-06 21:59 ` Jens Axboe
@ 2014-05-07  0:35   ` Stoo Davies
  2014-05-07  1:10     ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Stoo Davies @ 2014-05-07  0:35 UTC (permalink / raw)
  To: Jens Axboe, fio

On 05/06/2014 02:59 PM, Jens Axboe wrote:
> 2.1.5 did indeed change when the IO was logged for verification, so that
> does explain why it fails for you now. That's a problem.
>
> Can you try with this patch? I'm not going to commit it yet, I want to
> carefully audit all paths to ensure we also unlog or trim an io_piece,
> if we don't fully complete it.
>
10 loops through the powerfail test and all the hosts are still happy.
Looks good to me, thanks for the fast response.

Stoo




* Re: Meta verification regression starting with fio 2.1.5
  2014-05-07  0:35   ` Stoo Davies
@ 2014-05-07  1:10     ` Jens Axboe
  0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2014-05-07  1:10 UTC (permalink / raw)
  To: Stoo Davies, fio

On 2014-05-06 18:35, Stoo Davies wrote:
> On 05/06/2014 02:59 PM, Jens Axboe wrote:
>> 2.1.5 did indeed change when the IO was logged for verification, so that
>> does explain why it fails for you now. That's a problem.
>>
>> Can you try with this patch? I'm not going to commit it yet, I want to
>> carefully audit all paths to ensure we also unlog or trim an io_piece,
>> if we don't fully complete it.
>>
> 10 loops through the powerfail test and all the hosts are still happy.
> Looks good to me, thanks for the fast response.

Thanks a lot for reporting it, I have now committed a patch that is 
basically what I sent you, but catches the partial completion case as well.
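
The committed change itself is not shown in this thread. As a rough
illustration only of what handling a partial completion could look like,
the sketch below shrinks a logged write to the bytes that were actually
transferred, so a later verify pass covers no more than what reached the
device. The names struct logged_write and trim_logged_write are
hypothetical and are not taken from fio.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of a write that was logged for later verification. */
struct logged_write {
	uint64_t offset;
	uint64_t len;
};

/*
 * Trim a logged write after a short completion.
 * 'requested' is the number of bytes submitted, 'resid' the number of
 * bytes left untransferred. Returns 1 if the entry still covers data
 * worth verifying, 0 if nothing was written and the caller should drop
 * the entry entirely.
 */
int trim_logged_write(struct logged_write *w, uint64_t requested,
		      uint64_t resid)
{
	if (resid >= requested) {
		w->len = 0;
		return 0;
	}

	w->len = requested - resid;
	return 1;
}
```

For example, an 8k write that completes with 4k of residual would be
trimmed to a 4k entry, while one with 8k of residual would be dropped.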

-- 
Jens Axboe




