All of lore.kernel.org
* Using FIO to benchmark Gluster
@ 2021-11-24 16:01 Seyed Mohammad Fakhraie
  2021-11-24 16:24 ` Dmitry Antipov
  0 siblings, 1 reply; 6+ messages in thread
From: Seyed Mohammad Fakhraie @ 2021-11-24 16:01 UTC (permalink / raw)
  To: fio

Hello people,
I am trying to benchmark Gluster with FIO. I am getting the following error:

4vm-1G-randrw: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B,
(T) 4096B-4096B, ioengine=gfapi, iodepth=4
...
fio-3.12
Starting 4 processes
failed fio extend file 4vm-1G-randrw.1.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.3.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.0.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.2.0 to 18446744073709551615
Jobs: 4 (f=0): [f(4)][-.-%][eta 00m:00s]

Run status group 0 (all jobs)

Any help for debugging this issue is much appreciated. My FIO version
is "fio-3.28-104-gf7c3f".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using FIO to benchmark Gluster
  2021-11-24 16:01 Using FIO to benchmark Gluster Seyed Mohammad Fakhraie
@ 2021-11-24 16:24 ` Dmitry Antipov
       [not found]   ` <CACzyWQzih6gCQ4vYGA9Hwj-Zq4_d+YKC1j=secP8W9nr7bZCgg@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Antipov @ 2021-11-24 16:24 UTC (permalink / raw)
  To: Seyed Mohammad Fakhraie, fio

On 11/24/21 19:01, Seyed Mohammad Fakhraie wrote:

> I am trying to benchmark Gluster with FIO. I am getting the following error:

Please provide full command line and an output of 'gluster volume info'.

Dmitry


* Re: Using FIO to benchmark Gluster
       [not found]   ` <CACzyWQzih6gCQ4vYGA9Hwj-Zq4_d+YKC1j=secP8W9nr7bZCgg@mail.gmail.com>
@ 2021-11-25 15:18     ` Dmitry Antipov
       [not found]       ` <CACzyWQwCvsCkh=Ui_cr5Hh8jqZGEkX4B1YwujEnR6nYutjPveA@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Antipov @ 2021-11-25 15:18 UTC (permalink / raw)
  To: Seyed Mohammad Fakhraie; +Cc: fio

On 11/25/21 17:53, Seyed Mohammad Fakhraie wrote:

> * The output of 'gluster volume info' is as follows:
> 
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 75946a3e-f670-4f58-a61e-c3c61e3d977d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 5 x (2 + 1) = 15
> Transport-type: tcp
> Bricks:
> Brick1: storage-node0:/data/brick0/gv0
> Brick2: storage-node1:/data/brick0/gv0
> Brick3: storage-node2:/data/arbit0/gv0 (arbiter)
> Brick4: storage-node3:/data/brick0/gv0
> Brick5: storage-node4:/data/brick0/gv0
> Brick6: storage-node0:/data/arbit0/gv0 (arbiter)
> Brick7: storage-node1:/data/brick1/gv0
> Brick8: storage-node2:/data/brick0/gv0
> Brick9: storage-node3:/data/arbit0/gv0 (arbiter)
> Brick10: storage-node4:/data/brick1/gv0
> Brick11: storage-node0:/data/brick1/gv0
> Brick12: storage-node1:/data/arbit0/gv0 (arbiter)
> Brick13: storage-node2:/data/brick1/gv0
> Brick14: storage-node3:/data/brick1/gv0
> Brick15: storage-node4:/data/arbit0/gv0 (arbiter)
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet

To track down this issue, I would recommend trying to reproduce
it on a very basic replicated volume first. It is important
to have all of the bricks on the same host, e.g.:

Volume Name: test0
Type: Replicate
Volume ID: a71f90a1-4136-4c87-bfc5-18b1da477864
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node0:/pool/0                ;; same host
Brick2: node0:/pool/1                ;; same host
Brick3: node0:/pool/2                ;; same host
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Also limit your 'fio' workload to 'numjobs=1'.

Finally, you forgot to mention the GlusterFS version you're using.

Dmitry


* Re: Using FIO to benchmark Gluster
       [not found]       ` <CACzyWQwCvsCkh=Ui_cr5Hh8jqZGEkX4B1YwujEnR6nYutjPveA@mail.gmail.com>
@ 2021-12-20 14:38         ` Dmitry Antipov
       [not found]           ` <CACzyWQwe2SYR_W=XyuCk=THBZ-0frnh5u1gtjALfnZBpu6EerA@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Antipov @ 2021-12-20 14:38 UTC (permalink / raw)
  To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/20/21 06:41, Seyed Mohammad Fakhraie wrote:

> I created a replica-3 setup on a single machine. Its description can
> be seen below:
> 
> Volume Name: gv1
> Type: Replicate
> Volume ID: 14fce60f-78e1-4021-9939-81abcd71f761
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: storage-node-8:/data/brick1/gv1
> Brick2: storage-node-8:/data/brick2/gv1
> Brick3: storage-node-8:/data/brick3/gv1
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> 
> I'm using Debian 10 with Gluster 5.5.
> After running the test using gfapi, the same error showed itself during
> randread:
> 
> failed fio extend file 1vm-1G-rand-read-3rep to 18446744073709551615
> 
> The job file:
> [global]
> ioengine=gfapi
> volume=gv1
> brick=storage-node-8
> direct=1
> create_on_open=1
> ramp_time=1m
> iodepth=1
> numjobs=1
> openfiles=1
> bs=4k
> [1vm-1G-rand-read-3rep]
> filesize=1G
> time_based
> runtime=3min
> rw=randread
> write_bw_log=4k-read-3replica-ro.results
> write_iops_log=4k-read-3replica-ro.results
> write_lat_log=4k-read-3replica-ro.results

Reproduced, I'll take a look. To make sure we're seeing the same issue, could you also check the brick
logs (usually /var/log/glusterfs/bricks/pool-[whatever].log) for something similar to the following:

[2021-12-20 14:27:18.193027 +0000] E [MSGID: 113038] [posix-inode-fd-ops.c:5318:posix_ftruncate] 0-test0-posix: ftruncate failed on fd=0x13e5838 (-1 [Invalid argument]
[2021-12-20 14:27:18.193163 +0000] E [MSGID: 115063] [server-rpc-fops_v2.c:1214:server4_ftruncate_cbk] 0-test0-server: TRUNCATE info [{frame=50}, {FTRUNCATE_fd_no=0}, 
{uuid_utoa=1287dc8e-2896-4580-a14b-ef84ea9aad84}, {client=CTX_ID:3acbbb24-88af-48cd-9da0-62922a7855a3-GRAPH_ID:0-PID:671237-HOST:fedora-PC_NAME:test0-client-0-RECON_NO:-0}, {error-xlator=test0-posix}, 
{errno=22}, {error=Invalid argument}]

Dmitry


* Re: Using FIO to benchmark Gluster
       [not found]           ` <CACzyWQwe2SYR_W=XyuCk=THBZ-0frnh5u1gtjALfnZBpu6EerA@mail.gmail.com>
@ 2021-12-21  9:34             ` Dmitry Antipov
       [not found]               ` <CACzyWQy7MXp619uJhJ=uDWqabT_xwt=6bu=Z5103twPJJMz1=A@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Antipov @ 2021-12-21  9:34 UTC (permalink / raw)
  To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/20/21 19:11, Seyed Mohammad Fakhraie wrote:

> Hello Dmitry, thanks for reaching out.
> By "reproduced", do you mean that you're getting the same error during
> randread workloads using gfapi? This is the error that I'm getting
> on all the setups that I've tested (3x Replica & 2x Replica with
> Arbiter) during randread + gfapi:
> 
> failed fio extend file 4vm-1G-randrw.1.0 to 18446744073709551615

Definitely. What I'm seeing is:

# cat test0.fio
[global]
ioengine=gfapi
volume=test0
brick=localhost
direct=1
create_on_open=1
ramp_time=1m
iodepth=1
numjobs=1
openfiles=1
bs=4k
[1vm-1G-rand-read-3rep]
filesize=1G
time_based
runtime=3min
rw=randread
write_bw_log=4k-read-3replica-ro.results
write_iops_log=4k-read-3replica-ro.results
write_lat_log=4k-read-3replica-ro.results

# fio --debug=file test0.fio
fio: set debug option file
file     155116 dup files: 0
file     155116 add file 1vm-1G-rand-read-3rep.0.0
file     155116 resize file array to 2 files
file     155116 file 0x7fb75dbfa110 "1vm-1G-rand-read-3rep.0.0" added at 0
1vm-1G-rand-read-3rep: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=gfapi, iodepth=1
fio-3.28
Starting 1 process
file     155116 setup files
file     155116 get file size for 0x7fb75dbfa110/0/1vm-1G-rand-read-3rep.0.0
file     155116 get file size 1vm-1G-rand-read-3rep.0.0
file     155126 fio setup 0x1962ea0
file     155126 trying file 1vm-1G-rand-read-3rep.0.0 280
file     155126 fio file 1vm-1G-rand-read-3rep.0.0 open mode rw td rw read
file     155126 fio extend file 1vm-1G-rand-read-3rep.0.0 from 0 to 18446744073709551615
failed fio extend file 1vm-1G-rand-read-3rep.0.0 to 18446744073709551615
file     155126 fio 0x1962ea0 created 1vm-1G-rand-read-3rep.0.0
file     155126 error 1 on open of 1vm-1G-rand-read-3rep.0.0
file     155126 get_next_file_rr: (nil)
file     155126 get_next_file: NULL
file     155126 close files

This looks like a weird error in job file parsing. I don't get the logic
behind get_file_sizes(), but the job file definitely requests 1G and not
18446744073709551615, which is -1ULL.

Dmitry


* Re: Using FIO to benchmark Gluster
       [not found]               ` <CACzyWQy7MXp619uJhJ=uDWqabT_xwt=6bu=Z5103twPJJMz1=A@mail.gmail.com>
@ 2021-12-21 11:35                 ` Dmitry Antipov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitry Antipov @ 2021-12-21 11:35 UTC (permalink / raw)
  To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/21/21 13:21, Seyed Mohammad Fakhraie wrote:

> Thanks Dmitry. As you pointed out, 18446744073709551615 is an
> interesting number. I'm just thinking out loud here (I haven't
> looked at the code yet): could it be caused by an overflow?

Looking through the fio sources, my guess is that -1 internally means
"not decided (yet)". Passing -1 via gfapi tells the GlusterFS brick
process to create a file of (size_t)-1 bytes, which is 2^64 - 1 and
not supported by GlusterFS. Most likely this value is too large for
the underlying filesystem as well (for example, the native XFS limit
is 2^63 - 1 bytes).

I would suggest considering simpler scenarios. For example, the
following seems to work as expected, and it's quite similar to your
job file:

fio --ioengine=gfapi --volume=test0 --brick=localhost --direct=1 \
     --create_on_open=1 --iodepth=1 --numjobs=1 --bs=4k --size=1G \
     --time_based --runtime=180 --rw=randread --name=test0

Dmitry

