* fio offset with ba
@ 2017-10-20 19:08 Jeff Furlong
  2017-10-20 21:01 ` Sitsofe Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Furlong @ 2017-10-20 19:08 UTC
  To: fio

Hi All,
I don't quite follow the logic in the function that calculates the start offset.  The offset parameter recently gained support for a percentage.  Suppose we set it to 50% and want the IO's to start at 50% of device capacity, block aligned to 8KB.

# fio -version
fio-3.1-60-g71aa

# blockdev --getsize64 /dev/nvme1n1
3200631791616

# fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --runtime=1s --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k

# cat test_job_iops.1.log
0, 1, 0, 4096, 1600315895808
0, 1, 0, 4096, 1600315899904
0, 1, 0, 4096, 1600315904000
0, 1, 0, 4096, 1600315908096

So we can see the device has 3200631791616 bytes, 50% of which is 1600315895808 bytes, which happens to be 4KB aligned, but not 8KB aligned.  Even though we set the --ba=8k parameter, the offset LBA as logged in the iops.1.log shows 4KB alignment.  Does --ba work for all IO's or only random IO's?  If all, does get_start_offset() control the raw offset value?  I don't see why the min(ba, bs) is used in the calculation, but perhaps I am missing something.  Thanks.

Regards,
Jeff



* Re: fio offset with ba
  2017-10-20 19:08 fio offset with ba Jeff Furlong
@ 2017-10-20 21:01 ` Sitsofe Wheeler
  2017-10-21  0:13   ` Jeff Furlong
  0 siblings, 1 reply; 10+ messages in thread
From: Sitsofe Wheeler @ 2017-10-20 21:01 UTC
  To: Jeff Furlong; +Cc: fio

Hi,

On 20 October 2017 at 20:08, Jeff Furlong <jeff.furlong@wdc.com> wrote:
>
> I don't quite follow the logic in the calculate offset function.  The offset parameter recently allows a percentage.  Suppose we set it to 50% and want to block align the IO's starting at 50% of device capacity, then block aligned to 8KB.
>
> # fio -version
> fio-3.1-60-g71aa
>
> # blockdev --getsize64 /dev/nvme1n1
> 3200631791616
>
> # fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --runtime=1s --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k
>
> # cat test_job_iops.1.log
> 0, 1, 0, 4096, 1600315895808
> 0, 1, 0, 4096, 1600315899904
> 0, 1, 0, 4096, 1600315904000
> 0, 1, 0, 4096, 1600315908096
>
> So we can see the device has 3200631791616 bytes, 50% of which is 1600315895808 bytes, which happens to be 4KB aligned, but not 8KB aligned.  Even though we set the --ba=8k parameter, the offset LBA as logged in the iops.1.log shows

Hmm I see the same problem with this job:
fio --name=test_job --ioengine=null --rw=read --iodepth=1 --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io

[...]
io       15013 fill_io_u: io_u 0x236ad80: off=1600315895808/len=4096/ddir=0io       15013 /test_job.0.0io       15013

I think your guess about only impacting random I/O is probably right because
fio --name=test_job --randrepeat=0 --ioengine=null --rw=randread --iodepth=1 --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io

picks offsets that are 8k aligned.

> 4KB alignment.  Does --ba work for all IO's or only random IO's?  If all, does get_start_offset() control the raw offset value?  I don't see why the min(ba, bs) is used in the calculation, but perhaps I am missing something.  Thanks.

Where is min(ba, bs) done - do you mean the bits around
https://github.com/axboe/fio/commit/89978a6b26f81bdbd63228e2e2a86f604ee46c56#diff-4abbf037246dd2e450dc3f6a2ac77180R845?
I agree you probably want to take the maximum of all the block
alignments but what if one of the smaller ones is not a multiple of
the largest one?

Would you like to propose a patch?

-- 
Sitsofe | http://sucs.org/~sits/


* RE: fio offset with ba
  2017-10-20 21:01 ` Sitsofe Wheeler
@ 2017-10-21  0:13   ` Jeff Furlong
  2017-10-21  4:21     ` Sitsofe Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Furlong @ 2017-10-21  0:13 UTC
  To: Sitsofe Wheeler; +Cc: fio

Yes, the bits around line 845 show the min() being used.  It seems get_start_offset() is used for the first IO and subsequent IO's, which makes things difficult.  I believe the correct fix here would be to apply the ba specific to the IO type, since the ba parameter allows separate read/write/trim alignments.  So if io_u->ddir==DDIR_READ then we could just align to o->ba[DDIR_READ].  But I don't see how we would access io_u at this point; do we even know the potential io_u at this time?  It could be a mixed read/write/trim workload.

Lacking that, brute force would suggest:

		if (fio_option_is_set(o, ba)) {	
			align_bs = (unsigned long long) o->ba[DDIR_READ];
			if(o->ba[DDIR_READ] != o->ba[DDIR_WRITE])
				align_bs = (unsigned long long) o->ba[DDIR_READ] * (unsigned long long) o->ba[DDIR_WRITE];
			if(align_bs != (unsigned long long) o->ba[DDIR_TRIM])
				align_bs = align_bs * (unsigned long long) o->ba[DDIR_TRIM];

But I see another problem in that o->ba[DDIR_READ] is not set for sequential workloads.  In fixup_options() we have:

	if (!o->ba[DDIR_READ] || !td_random(td))
		o->ba[DDIR_READ] = o->min_bs[DDIR_READ];
	if (!o->ba[DDIR_WRITE] || !td_random(td))
		o->ba[DDIR_WRITE] = o->min_bs[DDIR_WRITE];
	if (!o->ba[DDIR_TRIM] || !td_random(td))
		o->ba[DDIR_TRIM] = o->min_bs[DDIR_TRIM];

I don't follow that code, other than that it sets ba to min_bs when the workload is sequential.  If we remove that code and use the change above, the starting LBA is aligned to the ba in the case of:

fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --number_ios=8 --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k

# cat test_job_iops.1.log
0, 1, 0, 4096, 1600315899904

It seems to be a hack, so I didn't create a patch for it.  Would like to better understand what I'm missing before breaking something.

Thanks.

Regards,
Jeff




* Re: fio offset with ba
  2017-10-21  0:13   ` Jeff Furlong
@ 2017-10-21  4:21     ` Sitsofe Wheeler
  2017-10-23 23:56       ` Jeff Furlong
  0 siblings, 1 reply; 10+ messages in thread
From: Sitsofe Wheeler @ 2017-10-21  4:21 UTC
  To: Jeff Furlong; +Cc: fio, Jens Axboe

Hmm,

I see a few choices for trying to solve alignment of the offset and
forced alignment of sequential blocks:
* Introduce an "offset_align" option solely for aligning the "initial"
offset and stop trying to use blockalign in the offset. Document it
well to stop confusion with blockalign.
* Document blockalign as ONLY working for random I/O or
* Do blockalign for sequential I/O on a per I/O basis (potentially
creating gappy I/O). This is also the request from
https://github.com/axboe/fio/issues/341 . Technically this has a
big overlap with zoned I/O and the existing "generated offset" used in
rw option but perhaps it's just different enough to be justified?

Thoughts?


-- 
Sitsofe | http://sucs.org/~sits/



* RE: fio offset with ba
  2017-10-21  4:21     ` Sitsofe Wheeler
@ 2017-10-23 23:56       ` Jeff Furlong
  2017-10-24  6:21         ` Sitsofe Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Furlong @ 2017-10-23 23:56 UTC
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe

>Hmm,
>
>I see a few choices for trying to solve alignment of the offset and forced alignment of sequential blocks:
>* Introduce an "offset_align" option solely for aligning the "initial"
>offset and stop trying to use blockalign in the offset. Document it well to stop confusion with blockalign.
>* Document blockalign as ONLY working for random I/O or
>* Do blockalign for sequential I/O on a per I/O basis (potentially creating gappy I/O). This is also the request from
>https://github.com/axboe/fio/issues/341 . Technically this has a big overlap with zoned I/O and the existing "generated offset" used in rw option but perhaps it's just different enough to be justified?
>
>Thoughts?

I tried the first approach.  The second approach reflects current behavior, since blockalign already only takes effect for random I/O.  The third approach seems problematic because of the gappy I/O, unless you really want to skip some LBAs (possible, just not my intention for this workload).

# fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --number_ios=8 --offset=50% --log_offset=1 --write_iops_log=test_job --offset_align=8k

# cat test_job_iops.1.log 
0, 1, 0, 4096, 1600315899904
0, 1, 0, 4096, 1600315904000
0, 1, 0, 4096, 1600315908096
0, 1, 0, 4096, 1600315912192
0, 1, 0, 4096, 1600315916288
0, 1, 0, 4096, 1600315920384
0, 1, 0, 4096, 1600315924480
0, 1, 0, 4096, 1600315928576



diff --git a/HOWTO b/HOWTO
index 22a5849..c611fae 100644
--- a/HOWTO
+++ b/HOWTO
@@ -1128,13 +1128,19 @@ I/O type
 .. option:: offset=int
 
 	Start I/O at the provided offset in the file, given as either a fixed size in
-	bytes or a percentage. If a percentage is given, the next ``blockalign``-ed
+	bytes or a percentage. If a percentage is given, the next ``bs`` ``blockalign``-ed
 	offset will be used. Data before the given offset will not be touched. This
 	effectively caps the file size at `real_size - offset`. Can be combined with
 	:option:`size` to constrain the start and end range of the I/O workload.
 	A percentage can be specified by a number between 1 and 100 followed by '%',
 	for example, ``offset=20%`` to specify 20%.
 
+.. option:: offset_align=int
+
+	If a percentage offset is given, the provided offset is ``blockalign``-ed to
+	the ``blocksize``. This value will align the initial I/O to a new alignment.
+	Applies to sequential workloads only.
+
 .. option:: offset_increment=int
 
 	If this is provided, then the real offset becomes `offset + offset_increment
diff --git a/cconv.c b/cconv.c
index f809fd5..dc3c4e6 100644
--- a/cconv.c
+++ b/cconv.c
@@ -105,6 +105,7 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->file_size_low = le64_to_cpu(top->file_size_low);
 	o->file_size_high = le64_to_cpu(top->file_size_high);
 	o->start_offset = le64_to_cpu(top->start_offset);
+	o->start_offset_align = le64_to_cpu(top->start_offset_align);
 	o->start_offset_percent = le32_to_cpu(top->start_offset_percent);
 
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
@@ -548,6 +549,7 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 	top->file_size_low = __cpu_to_le64(o->file_size_low);
 	top->file_size_high = __cpu_to_le64(o->file_size_high);
 	top->start_offset = __cpu_to_le64(o->start_offset);
+	top->start_offset_align = __cpu_to_le64(o->start_offset_align);
 	top->start_offset_percent = __cpu_to_le32(o->start_offset_percent);
 	top->trim_backlog = __cpu_to_le64(o->trim_backlog);
 	top->offset_increment = __cpu_to_le64(o->offset_increment);
diff --git a/filesetup.c b/filesetup.c
index 7a602d4..97535e7 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -869,10 +869,15 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
 
 	if (o->start_offset_percent > 0) {
 		/*
+		 * if offset_align is provided, set initial offset
+		 */
+		if (fio_option_is_set(o, start_offset_align)) {
+			align_bs = o->start_offset_align;
+		/*
 		 * if blockalign is provided, find the min across read, write,
 		 * and trim
 		 */
-		if (fio_option_is_set(o, ba)) {
+		} else if (fio_option_is_set(o, ba)) {
 			align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
 			align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);
 		} else {
diff --git a/fio.1 b/fio.1
index 7787ef2..098ca2e 100644
--- a/fio.1
+++ b/fio.1
@@ -913,13 +913,18 @@ should be associated with them.
 .TP
 .BI offset \fR=\fPint
 Start I/O at the provided offset in the file, given as either a fixed size in
-bytes or a percentage. If a percentage is given, the next \fBblockalign\fR\-ed
+bytes or a percentage. If a percentage is given, the next \fBbs\fR \fBblockalign\fR\-ed
 offset will be used. Data before the given offset will not be touched. This
 effectively caps the file size at `real_size \- offset'. Can be combined with
 \fBsize\fR to constrain the start and end range of the I/O workload.
 A percentage can be specified by a number between 1 and 100 followed by '%',
 for example, `offset=20%' to specify 20%.
 .TP
+.BI offset_align \fR=\fPint
+If a percentage offset is given, the provided offset is \fBblockalign\fR\-ed to
+the \fBblocksize\fR. This value will align the initial I/O to a new alignment.
+Applies to sequential workloads only.
+.TP
 .BI offset_increment \fR=\fPint
 If this is provided, then the real offset becomes `\fBoffset\fR + \fBoffset_increment\fR
 * thread_number', where the thread number is a counter that starts at 0 and
diff --git a/options.c b/options.c
index ddcc4e5..a7821d7 100644
--- a/options.c
+++ b/options.c
@@ -2019,6 +2019,17 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.group	= FIO_OPT_G_INVALID,
 	},
 	{
+		.name	= "offset_align",
+		.lname	= "IO offset alignment",
+		.type	= FIO_OPT_STR_VAL,
+		.off1	= offsetof(struct thread_options, start_offset_align),
+		.help	= "Start IO from this offset alignment",
+		.def	= "0",
+		.interval = 1024 * 1024,
+		.category = FIO_OPT_C_IO,
+		.group	= FIO_OPT_G_INVALID,
+	},
+	{
 		.name	= "offset_increment",
 		.lname	= "IO offset increment",
 		.type	= FIO_OPT_STR_VAL,
diff --git a/thread_options.h b/thread_options.h
index 1813cdc..5a037bf 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -78,6 +78,7 @@ struct thread_options {
 	unsigned long long file_size_low;
 	unsigned long long file_size_high;
 	unsigned long long start_offset;
+	unsigned long long start_offset_align;
 
 	unsigned int bs[DDIR_RWDIR_CNT];
 	unsigned int ba[DDIR_RWDIR_CNT];
@@ -355,6 +356,7 @@ struct thread_options_pack {
 	uint64_t file_size_low;
 	uint64_t file_size_high;
 	uint64_t start_offset;
+	uint64_t start_offset_align;
 
 	uint32_t bs[DDIR_RWDIR_CNT];
 	uint32_t ba[DDIR_RWDIR_CNT];



Regards,
Jeff


* Re: fio offset with ba
  2017-10-23 23:56       ` Jeff Furlong
@ 2017-10-24  6:21         ` Sitsofe Wheeler
  2017-10-26  0:27           ` Jeff Furlong
  0 siblings, 1 reply; 10+ messages in thread
From: Sitsofe Wheeler @ 2017-10-24  6:21 UTC
  To: Jeff Furlong; +Cc: fio, Jens Axboe

On 24 October 2017 at 00:56, Jeff Furlong <jeff.furlong@wdc.com> wrote:
>
> diff --git a/HOWTO b/HOWTO
> index 22a5849..c611fae 100644
> --- a/HOWTO
> +++ b/HOWTO
> @@ -1128,13 +1128,19 @@ I/O type
>  .. option:: offset=int
>
>         Start I/O at the provided offset in the file, given as either a fixed size in
> -       bytes or a percentage. If a percentage is given, the next ``blockalign``-ed
> +       bytes or a percentage. If a percentage is given, the next ``bs`` ``blockalign``-ed

I'd stop referencing blockalign here to avoid confusion and say something like:

When a percentage is given, the generated offset will be aligned up to
the minimum ``blocksize`` or to the value of ``offset_align``.

>         offset will be used. Data before the given offset will not be touched. This
>         effectively caps the file size at `real_size - offset`. Can be combined with
>         :option:`size` to constrain the start and end range of the I/O workload.
>         A percentage can be specified by a number between 1 and 100 followed by '%',
>         for example, ``offset=20%`` to specify 20%.
>
> +.. option:: offset_align=int
> +
> +       If a percentage offset is given, the provided offset is ``blockalign``-ed to
> +       the ``blocksize``. This value will align the initial I/O to a new alignment.
> +       Applies to sequential workloads only.

Actually it should apply to all workloads because a random
workload won't be able to generate an offset that is less than the
aligned offset. I'd go with something like the following:

If set to a non-zero number, the value generated by a percentage
``offset`` is aligned upwards to this value. Defaults to 0 meaning
that a percentage offset is aligned to the minimum block size.

> --- a/filesetup.c
> +++ b/filesetup.c
> @@ -869,10 +869,15 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
>
>         if (o->start_offset_percent > 0) {
>                 /*
> +                * if offset_align is provided, set initial offset
> +                */
> +               if (fio_option_is_set(o, start_offset_align)) {
> +                       align_bs = o->start_offset_align;
> +               /*
>                  * if blockalign is provided, find the min across read, write,
>                  * and trim
>                  */
> -               if (fio_option_is_set(o, ba)) {
> +               } else if (fio_option_is_set(o, ba)) {
>                         align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
>                         align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);

Why check ba here? I don't think we get anything from trying to align
the offset to blockalign given its behaviour.

>                 } else {
> diff --git a/fio.1 b/fio.1
> index 7787ef2..098ca2e 100644
> --- a/fio.1
> +++ b/fio.1
> @@ -913,13 +913,18 @@ should be associated with them.
>  .TP
>  .BI offset \fR=\fPint
>  Start I/O at the provided offset in the file, given as either a fixed size in
> -bytes or a percentage. If a percentage is given, the next \fBblockalign\fR\-ed
> +bytes or a percentage. If a percentage is given, the next \fBbs\fR \fBblockalign\fR\-ed
>  offset will be used. Data before the given offset will not be touched. This
>  effectively caps the file size at `real_size \- offset'. Can be combined with
>  \fBsize\fR to constrain the start and end range of the I/O workload.
>  A percentage can be specified by a number between 1 and 100 followed by '%',
>  for example, `offset=20%' to specify 20%.

See above for documentation suggestions.

>  .TP
> +.BI offset_align \fR=\fPint
> +If a percentage offset is given, the provided offset is \fBblockalign\fR\-ed to
> +the \fBblocksize\fR. This value will align the initial I/O to a new alignment.
> +Applies to sequential workloads only.
> +.TP
>  .BI offset_increment \fR=\fPint
>  If this is provided, then the real offset becomes `\fBoffset\fR + \fBoffset_increment\fR
>  * thread_number', where the thread number is a counter that starts at 0 and
> diff --git a/options.c b/options.c
> index ddcc4e5..a7821d7 100644
> --- a/options.c
> +++ b/options.c
> @@ -2019,6 +2019,17 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
>                 .group  = FIO_OPT_G_INVALID,
>         },
>         {
> +               .name   = "offset_align",
> +               .lname  = "IO offset alignment",
> +               .type   = FIO_OPT_STR_VAL,

Isn't FIO_OPT_INT more appropriate?

> +               .off1   = offsetof(struct thread_options, start_offset_align),
> +               .help   = "Start IO from this offset alignment",
> +               .def    = "0",
> +               .interval = 1024 * 1024,

Actually here's a question - what does .interval do?

> +               .category = FIO_OPT_C_IO,
> +               .group  = FIO_OPT_G_INVALID,
> +       },
> +       {
>                 .name   = "offset_increment",
>                 .lname  = "IO offset increment",
>                 .type   = FIO_OPT_STR_VAL,
> diff --git a/thread_options.h b/thread_options.h
> index 1813cdc..5a037bf 100644
> --- a/thread_options.h
> +++ b/thread_options.h
> @@ -78,6 +78,7 @@ struct thread_options {
>         unsigned long long file_size_low;
>         unsigned long long file_size_high;
>         unsigned long long start_offset;
> +       unsigned long long start_offset_align;
>
>         unsigned int bs[DDIR_RWDIR_CNT];
>         unsigned int ba[DDIR_RWDIR_CNT];
> @@ -355,6 +356,7 @@ struct thread_options_pack {
>         uint64_t file_size_low;
>         uint64_t file_size_high;
>         uint64_t start_offset;
> +       uint64_t start_offset_align;
>
>         uint32_t bs[DDIR_RWDIR_CNT];
>         uint32_t ba[DDIR_RWDIR_CNT];
>
>
>
> Regards,
> Jeff

Thanks for trying this out Jeff!

-- 
Sitsofe | http://sucs.org/~sits/



* RE: fio offset with ba
  2017-10-24  6:21         ` Sitsofe Wheeler
@ 2017-10-26  0:27           ` Jeff Furlong
  2017-10-26  4:24             ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Furlong @ 2017-10-26  0:27 UTC
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe

>I'd stop referencing blockalign here to avoid confusion
Done.

>Actually it should apply to all workloads because a random workload won't be able to generate an offset that is less than the aligned offset.
offset_align only applies when offset is given as a percentage (not as a byte offset).  Agreed that get_start_offset() is invoked for the starting LBA whether the IO is sequential or random.  Applied the updates.

>Why check ba here? I don't think we get anything from trying to align the offset to blockalign given its behaviour.
Done.

>Isn't FIO_OPT_INT more appropriate?
Good point, same as blockalign.

>Actually here's a question - what does .interval do?
Looks like it is used to override the default value of 1 in gopt_new_int():
	interval = 1.0;
	if (o->interval)
		interval = o->interval;
	i->spin = gtk_spin_button_new_with_range(o->minval, maxval, interval);
But your larger question may be why some options have an interval (e.g. offset) and some do not (e.g. ioengine).

Here is a v2 patch:



diff --git a/HOWTO b/HOWTO
index 22a5849..e7142c5 100644
--- a/HOWTO
+++ b/HOWTO
@@ -1128,13 +1128,20 @@ I/O type
 .. option:: offset=int
 
 	Start I/O at the provided offset in the file, given as either a fixed size in
-	bytes or a percentage. If a percentage is given, the next ``blockalign``-ed
-	offset will be used. Data before the given offset will not be touched. This
+	bytes or a percentage. If a percentage is given, the generated offset will be
+	aligned to the minimum ``blocksize`` or to the value of ``offset_align`` if
+	provided. Data before the given offset will not be touched. This
 	effectively caps the file size at `real_size - offset`. Can be combined with
 	:option:`size` to constrain the start and end range of the I/O workload.
 	A percentage can be specified by a number between 1 and 100 followed by '%',
 	for example, ``offset=20%`` to specify 20%.
 
+.. option:: offset_align=int
+
+	If set to non-zero value, the byte offset generated by a percentage ``offset``
+	is aligned upwards to this value. Defaults to 0 meaning that a percentage
+	offset is aligned to the minimum block size.
+
 .. option:: offset_increment=int
 
 	If this is provided, then the real offset becomes `offset + offset_increment
diff --git a/cconv.c b/cconv.c
index f809fd5..dc3c4e6 100644
--- a/cconv.c
+++ b/cconv.c
@@ -105,6 +105,7 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->file_size_low = le64_to_cpu(top->file_size_low);
 	o->file_size_high = le64_to_cpu(top->file_size_high);
 	o->start_offset = le64_to_cpu(top->start_offset);
+	o->start_offset_align = le64_to_cpu(top->start_offset_align);
 	o->start_offset_percent = le32_to_cpu(top->start_offset_percent);
 
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
@@ -548,6 +549,7 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 	top->file_size_low = __cpu_to_le64(o->file_size_low);
 	top->file_size_high = __cpu_to_le64(o->file_size_high);
 	top->start_offset = __cpu_to_le64(o->start_offset);
+	top->start_offset_align = __cpu_to_le64(o->start_offset_align);
 	top->start_offset_percent = __cpu_to_le32(o->start_offset_percent);
 	top->trim_backlog = __cpu_to_le64(o->trim_backlog);
 	top->offset_increment = __cpu_to_le64(o->offset_increment);
diff --git a/filesetup.c b/filesetup.c
index 7a602d4..5d7ea5c 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -869,12 +869,10 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
 
 	if (o->start_offset_percent > 0) {
 		/*
-		 * if blockalign is provided, find the min across read, write,
-		 * and trim
+		 * if offset_align is provided, set initial offset
 		 */
-		if (fio_option_is_set(o, ba)) {
-			align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
-			align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);
+		if (fio_option_is_set(o, start_offset_align)) {
+			align_bs = o->start_offset_align;
 		} else {
 			/* else take the minimum block size */
 			align_bs = td_min_bs(td);
diff --git a/fio.1 b/fio.1
index 7787ef2..96d8f11 100644
--- a/fio.1
+++ b/fio.1
@@ -913,13 +913,19 @@ should be associated with them.
 .TP
 .BI offset \fR=\fPint
 Start I/O at the provided offset in the file, given as either a fixed size in
-bytes or a percentage. If a percentage is given, the next \fBblockalign\fR\-ed
-offset will be used. Data before the given offset will not be touched. This
+bytes or a percentage. If a percentage is given, the generated offset will be
+aligned to the minimum \fBblocksize\fR or to the value of \fBoffset_align\fR if
+provided. Data before the given offset will not be touched. This
 effectively caps the file size at `real_size \- offset'. Can be combined with
 \fBsize\fR to constrain the start and end range of the I/O workload.
 A percentage can be specified by a number between 1 and 100 followed by '%',
 for example, `offset=20%' to specify 20%.
 .TP
+.BI offset_align \fR=\fPint
+If set to non-zero value, the byte offset generated by a percentage \fBoffset\fR
+is aligned upwards to this value. Defaults to 0 meaning that a percentage
+offset is aligned to the minimum block size.
+.TP
 .BI offset_increment \fR=\fPint
 If this is provided, then the real offset becomes `\fBoffset\fR + \fBoffset_increment\fR
 * thread_number', where the thread number is a counter that starts at 0 and
diff --git a/options.c b/options.c
index ddcc4e5..e88dc2c 100644
--- a/options.c
+++ b/options.c
@@ -2019,6 +2019,17 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.group	= FIO_OPT_G_INVALID,
 	},
 	{
+		.name	= "offset_align",
+		.lname	= "IO offset alignment",
+		.type	= FIO_OPT_INT,
+		.off1	= offsetof(struct thread_options, start_offset_align),
+		.help	= "Start IO from this offset alignment",
+		.def	= "0",
+		.interval = 1024 * 1024,
+		.category = FIO_OPT_C_IO,
+		.group	= FIO_OPT_G_INVALID,
+	},
+	{
 		.name	= "offset_increment",
 		.lname	= "IO offset increment",
 		.type	= FIO_OPT_STR_VAL,
diff --git a/thread_options.h b/thread_options.h
index 1813cdc..5a037bf 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -78,6 +78,7 @@ struct thread_options {
 	unsigned long long file_size_low;
 	unsigned long long file_size_high;
 	unsigned long long start_offset;
+	unsigned long long start_offset_align;
 
 	unsigned int bs[DDIR_RWDIR_CNT];
 	unsigned int ba[DDIR_RWDIR_CNT];
@@ -355,6 +356,7 @@ struct thread_options_pack {
 	uint64_t file_size_low;
 	uint64_t file_size_high;
 	uint64_t start_offset;
+	uint64_t start_offset_align;
 
 	uint32_t bs[DDIR_RWDIR_CNT];
 	uint32_t ba[DDIR_RWDIR_CNT];



Regards,
Jeff



* Re: fio offset with ba
  2017-10-26  0:27           ` Jeff Furlong
@ 2017-10-26  4:24             ` Jens Axboe
  2017-10-26  6:27               ` Sitsofe Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2017-10-26  4:24 UTC (permalink / raw)
  To: Jeff Furlong, Sitsofe Wheeler; +Cc: fio

On 10/25/2017 05:27 PM, Jeff Furlong wrote:
>> Actually here's a question - what does .interval do?
> Looks like it is used to override the default value of 1 in gopt_new_int():
> 	interval = 1.0;
> 	if (o->interval)
> 		interval = o->interval;
> 	i->spin = gtk_spin_button_new_with_range(o->minval, maxval, interval);
>
> But your larger question may be why do some options have interval
> (e.g. offset) and some not have interval (e.g. ioengine).

It's for the GUI, where you want to specify in which increments a value
is adjusted when you use the up and down arrows to change it. It's only
applicable to numerical values; it would not make sense to apply it to
something like ioengine or similar.

> diff --git a/filesetup.c b/filesetup.c
> index 7a602d4..5d7ea5c 100644
> --- a/filesetup.c
> +++ b/filesetup.c
> @@ -869,12 +869,10 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
>  
>  	if (o->start_offset_percent > 0) {
>  		/*
> -		 * if blockalign is provided, find the min across read, write,
> -		 * and trim
> +		 * if offset_align is provided, set initial offset
>  		 */
> -		if (fio_option_is_set(o, ba)) {
> -			align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
> -			align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);
> +		if (fio_option_is_set(o, start_offset_align)) {
> +			align_bs = o->start_offset_align;

I'm curious why this drops the 'ba' part?

-- 
Jens Axboe




* Re: fio offset with ba
  2017-10-26  4:24             ` Jens Axboe
@ 2017-10-26  6:27               ` Sitsofe Wheeler
  2017-10-26 14:25                 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Sitsofe Wheeler @ 2017-10-26  6:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff Furlong, fio

On 26 October 2017 at 05:24, Jens Axboe <axboe@kernel.dk> wrote:
> On 10/25/2017 05:27 PM, Jeff Furlong wrote:
>>> Actually here's a question - what does .interval do?
>> Looks like it is used to override the default value of 1 in gopt_new_int():
>>       interval = 1.0;
>>       if (o->interval)
>>               interval = o->interval;
>>       i->spin = gtk_spin_button_new_with_range(o->minval, maxval, interval);
>>
>> But your larger question may be why do some options have interval
>> (e.g. offset) and some not have interval (e.g. ioengine).
>
> It's for the GUI, where you want to specify in which increments a value
> is adjusted when you use the up and down arrows to change it. It's only
> applicable to numerical values; it would not make sense to apply it to
> something like ioengine or similar.

You were right, Jeff - it was a larger question. Jens, thanks for
clearing up its purpose - I missed that it was used by the GUI.

>> diff --git a/filesetup.c b/filesetup.c
>> index 7a602d4..5d7ea5c 100644
>> --- a/filesetup.c
>> +++ b/filesetup.c
>> @@ -869,12 +869,10 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
>>
>>       if (o->start_offset_percent > 0) {
>>               /*
>> -              * if blockalign is provided, find the min across read, write,
>> -              * and trim
>> +              * if offset_align is provided, set initial offset
>>                */
>> -             if (fio_option_is_set(o, ba)) {
>> -                     align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
>> -                     align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);
>> +             if (fio_option_is_set(o, start_offset_align)) {
>> +                     align_bs = o->start_offset_align;
>
> I'm curious why this drops the 'ba' part?

blockalign actually only impacts random I/O (see
https://github.com/axboe/fio/issues/341 for someone requesting it to
affect sequential I/O but that's a separate issue). Since each random
I/O will "blockalign itself" we gain nothing from trying to align the
offset generated by offset percentage to it.
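
To make the arithmetic concrete, here is a minimal sketch of aligning a
percentage offset upwards, using the device size from the first message in
this thread. The helpers align_up() and percent_offset() are hypothetical
names for illustration only; they are not fio functions, and this is not
the code in get_start_offset().

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of fio): round offset up to the next
 * multiple of align. An align of 0 is treated as "no alignment".
 */
static uint64_t align_up(uint64_t offset, uint64_t align)
{
	if (align == 0)
		return offset;
	return (offset + align - 1) / align * align;
}

/*
 * Hypothetical helper (not part of fio): byte offset at the given
 * percentage of size, aligned upwards to align. Assumes size * percent
 * does not overflow 64 bits, which holds for the sizes discussed here.
 */
static uint64_t percent_offset(uint64_t size, unsigned int percent,
			       uint64_t align)
{
	return align_up(size * percent / 100, align);
}
```

With size = 3200631791616 and percent = 50, the raw offset is
1600315895808, which is a multiple of 4096 but not of 8192. Passing
align = 8192 rounds it up to 1600315899904, the next 8KB boundary.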

-- 
Sitsofe | http://sucs.org/~sits/



* Re: fio offset with ba
  2017-10-26  6:27               ` Sitsofe Wheeler
@ 2017-10-26 14:25                 ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2017-10-26 14:25 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: Jeff Furlong, fio

On 10/25/2017 11:27 PM, Sitsofe Wheeler wrote:
> On 26 October 2017 at 05:24, Jens Axboe <axboe@kernel.dk> wrote:
>> On 10/25/2017 05:27 PM, Jeff Furlong wrote:
>>>> Actually here's a question - what does .interval do?
>>> Looks like it is used to override the default value of 1 in gopt_new_int():
>>>       interval = 1.0;
>>>       if (o->interval)
>>>               interval = o->interval;
>>>       i->spin = gtk_spin_button_new_with_range(o->minval, maxval, interval);
>>>
>>> But your larger question may be why do some options have interval
>>> (e.g. offset) and some not have interval (e.g. ioengine).
>>
>> It's for the GUI, where you want to specify in which increments a value
>> is adjusted when you use the up and down arrows to change it. It's only
>> applicable to numerical values; it would not make sense to apply it to
>> something like ioengine or similar.
> 
> You were right, Jeff - it was a larger question. Jens, thanks for
> clearing up its purpose - I missed that it was used by the GUI.
> 
>>> diff --git a/filesetup.c b/filesetup.c
>>> index 7a602d4..5d7ea5c 100644
>>> --- a/filesetup.c
>>> +++ b/filesetup.c
>>> @@ -869,12 +869,10 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
>>>
>>>       if (o->start_offset_percent > 0) {
>>>               /*
>>> -              * if blockalign is provided, find the min across read, write,
>>> -              * and trim
>>> +              * if offset_align is provided, set initial offset
>>>                */
>>> -             if (fio_option_is_set(o, ba)) {
>>> -                     align_bs = (unsigned long long) min(o->ba[DDIR_READ], o->ba[DDIR_WRITE]);
>>> -                     align_bs = min((unsigned long long) o->ba[DDIR_TRIM], align_bs);
>>> +             if (fio_option_is_set(o, start_offset_align)) {
>>> +                     align_bs = o->start_offset_align;
>>
>> I'm curious why this drops the 'ba' part?
> 
> blockalign actually only impacts random I/O (see
> https://github.com/axboe/fio/issues/341 for someone requesting it to
> affect sequential I/O but that's a separate issue). Since each random
> I/O will "blockalign itself" we gain nothing from trying to align the
> offset generated by offset percentage to it.

Gotcha. That's my only comment on the patch, if folks are happy with it,
we can slide it in.

-- 
Jens Axboe




Thread overview: 10+ messages
2017-10-20 19:08 fio offset with ba Jeff Furlong
2017-10-20 21:01 ` Sitsofe Wheeler
2017-10-21  0:13   ` Jeff Furlong
2017-10-21  4:21     ` Sitsofe Wheeler
2017-10-23 23:56       ` Jeff Furlong
2017-10-24  6:21         ` Sitsofe Wheeler
2017-10-26  0:27           ` Jeff Furlong
2017-10-26  4:24             ` Jens Axboe
2017-10-26  6:27               ` Sitsofe Wheeler
2017-10-26 14:25                 ` Jens Axboe
