* Using FIO to benchmark Gluster
@ 2021-11-24 16:01 Seyed Mohammad Fakhraie
From: Seyed Mohammad Fakhraie @ 2021-11-24 16:01 UTC
To: fio

Hello people,

I am trying to benchmark Gluster with FIO, and I am getting the following error:

4vm-1G-randrw: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=gfapi, iodepth=4
...
fio-3.12
Starting 4 processes
failed fio extend file 4vm-1G-randrw.1.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.3.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.0.0 to 18446744073709551615
failed fio extend file 4vm-1G-randrw.2.0 to 18446744073709551615
Jobs: 4 (f=0): [f(4)][-.-%][eta 00m:00s]

Run status group 0 (all jobs)

Any help debugging this issue is much appreciated. My FIO version is "fio-3.28-104-gf7c3f".
* Re: Using FIO to benchmark Gluster
From: Dmitry Antipov @ 2021-11-24 16:24 UTC
To: Seyed Mohammad Fakhraie, fio

On 11/24/21 19:01, Seyed Mohammad Fakhraie wrote:

> I am trying to benchmark Gluster with FIO. I am getting the following error:

Please provide the full command line and the output of 'gluster volume info'.

Dmitry
* Re: Using FIO to benchmark Gluster
From: Dmitry Antipov @ 2021-11-25 15:18 UTC
To: Seyed Mohammad Fakhraie; +Cc: fio

On 11/25/21 17:53, Seyed Mohammad Fakhraie wrote:

> The output of 'gluster volume info' is as follows:
>
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 75946a3e-f670-4f58-a61e-c3c61e3d977d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 5 x (2 + 1) = 15
> Transport-type: tcp
> Bricks:
> Brick1: storage-node0:/data/brick0/gv0
> Brick2: storage-node1:/data/brick0/gv0
> Brick3: storage-node2:/data/arbit0/gv0 (arbiter)
> Brick4: storage-node3:/data/brick0/gv0
> Brick5: storage-node4:/data/brick0/gv0
> Brick6: storage-node0:/data/arbit0/gv0 (arbiter)
> Brick7: storage-node1:/data/brick1/gv0
> Brick8: storage-node2:/data/brick0/gv0
> Brick9: storage-node3:/data/arbit0/gv0 (arbiter)
> Brick10: storage-node4:/data/brick1/gv0
> Brick11: storage-node0:/data/brick1/gv0
> Brick12: storage-node1:/data/arbit0/gv0 (arbiter)
> Brick13: storage-node2:/data/brick1/gv0
> Brick14: storage-node3:/data/brick1/gv0
> Brick15: storage-node4:/data/arbit0/gv0 (arbiter)
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet

To track down the issue, I would recommend trying to reproduce it on a very basic replicated volume first. It is important to have all of the bricks on the same host, e.g.:

Volume Name: test0
Type: Replicate
Volume ID: a71f90a1-4136-4c87-bfc5-18b1da477864
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node0:/pool/0   ;; same host
Brick2: node0:/pool/1   ;; same host
Brick3: node0:/pool/2   ;; same host
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Also limit your 'fio' workload to 'numjobs=1'. Finally, you forgot to mention the GlusterFS version you're using.

Dmitry
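[Editor's note: a single-host test volume like the one above can be created with a few commands. This is a sketch only; the hostname and brick paths are illustrative, and 'force' is required because GlusterFS otherwise refuses to place all replicas on one host.]

    # Assumes glusterd is running on node0 and the /pool/* directories exist;
    # adjust names and paths to your environment.
    gluster volume create test0 replica 3 \
        node0:/pool/0 node0:/pool/1 node0:/pool/2 force
    gluster volume start test0
    gluster volume info test0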
* Re: Using FIO to benchmark Gluster
From: Dmitry Antipov @ 2021-12-20 14:38 UTC
To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/20/21 06:41, Seyed Mohammad Fakhraie wrote:

> I created a replica-3 setup on a single machine. Its description can
> be seen below:
>
> Volume Name: gv1
> Type: Replicate
> Volume ID: 14fce60f-78e1-4021-9939-81abcd71f761
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: storage-node-8:/data/brick1/gv1
> Brick2: storage-node-8:/data/brick2/gv1
> Brick3: storage-node-8:/data/brick3/gv1
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> I'm using Debian 10 with Gluster 5.5.
> After running the test using gfapi, the same error showed itself
> during randread:
>
> failed fio extend file 1vm-1G-rand-read-3rep to 18446744073709551615
>
> The job file:
>
> [global]
> ioengine=gfapi
> volume=gv1
> brick=storage-node-8
> direct=1
> create_on_open=1
> ramp_time=1m
> iodepth=1
> numjobs=1
> openfiles=1
> bs=4k
>
> [1vm-1G-rand-read-3rep]
> filesize=1G
> time_based
> runtime=3min
> rw=randread
> write_bw_log=4k-read-3replica-ro.results
> write_iops_log=4k-read-3replica-ro.results
> write_lat_log=4k-read-3replica-ro.results

Reproduced, I'll take a look. To make sure we're seeing the same issue, could you also check the brick logs (usually /var/log/glusterfs/bricks/pool-[whatever].log) for something similar to the following:

[2021-12-20 14:27:18.193027 +0000] E [MSGID: 113038] [posix-inode-fd-ops.c:5318:posix_ftruncate] 0-test0-posix: ftruncate failed on fd=0x13e5838 (-1 [Invalid argument]
[2021-12-20 14:27:18.193163 +0000] E [MSGID: 115063] [server-rpc-fops_v2.c:1214:server4_ftruncate_cbk] 0-test0-server: TRUNCATE info [{frame=50}, {FTRUNCATE_fd_no=0}, {uuid_utoa=1287dc8e-2896-4580-a14b-ef84ea9aad84}, {client=CTX_ID:3acbbb24-88af-48cd-9da0-62922a7855a3-GRAPH_ID:0-PID:671237-HOST:fedora-PC_NAME:test0-client-0-RECON_NO:-0}, {error-xlator=test0-posix}, {errno=22}, {error=Invalid argument}]

Dmitry
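[Editor's note: the log check above can be scripted. A minimal sketch follows; the brick-log path mentioned in the message is the usual default and may differ per system, so the filter is demonstrated against a sample line to keep it self-contained.]

```shell
# Write one sample brick-log line of the kind quoted above to a temp
# file so the filter can be shown without a live Gluster deployment.
log=$(mktemp)
printf '%s\n' \
  '[2021-12-20 14:27:18.193027 +0000] E [MSGID: 113038] [posix-inode-fd-ops.c:5318:posix_ftruncate] 0-test0-posix: ftruncate failed on fd=0x13e5838 (-1 [Invalid argument]' \
  > "$log"

# On a real deployment, point this at /var/log/glusterfs/bricks/*.log
# instead of "$log".
grep -c 'ftruncate failed' "$log"   # prints 1 for the sample line
rm -f "$log"
```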
* Re: Using FIO to benchmark Gluster
From: Dmitry Antipov @ 2021-12-21 9:34 UTC
To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/20/21 19:11, Seyed Mohammad Fakhraie wrote:

> Hello Dmitry,
>
> Thanks for reaching out. By "reproduced", you mean that you're getting
> the same error during randread workloads using gfapi? This is the
> error that I'm getting on all the setups that I've tested (3x Replica
> & 2x Replica with Arbiter) during randread + gfapi:
>
> failed fio extend file 4vm-1G-randrw.1.0 to 18446744073709551615

Definitely. What I'm seeing is:

# cat test0.fio
[global]
ioengine=gfapi
volume=test0
brick=localhost
direct=1
create_on_open=1
ramp_time=1m
iodepth=1
numjobs=1
openfiles=1
bs=4k

[1vm-1G-rand-read-3rep]
filesize=1G
time_based
runtime=3min
rw=randread
write_bw_log=4k-read-3replica-ro.results
write_iops_log=4k-read-3replica-ro.results
write_lat_log=4k-read-3replica-ro.results

# fio --debug=file test0.fio
fio: set debug option file
file 155116 dup files: 0
file 155116 add file 1vm-1G-rand-read-3rep.0.0
file 155116 resize file array to 2 files
file 155116 file 0x7fb75dbfa110 "1vm-1G-rand-read-3rep.0.0" added at 0
1vm-1G-rand-read-3rep: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=gfapi, iodepth=1
fio-3.28
Starting 1 process
file 155116 setup files
file 155116 get file size for 0x7fb75dbfa110/0/1vm-1G-rand-read-3rep.0.0
file 155116 get file size 1vm-1G-rand-read-3rep.0.0
file 155126 fio setup 0x1962ea0
file 155126 trying file 1vm-1G-rand-read-3rep.0.0 280
file 155126 fio file 1vm-1G-rand-read-3rep.0.0 open mode rw td rw read
file 155126 fio extend file 1vm-1G-rand-read-3rep.0.0 from 0 to 18446744073709551615
failed fio extend file 1vm-1G-rand-read-3rep.0.0 to 18446744073709551615
file 155126 fio 0x1962ea0 created 1vm-1G-rand-read-3rep.0.0
file 155126 error 1 on open of 1vm-1G-rand-read-3rep.0.0
file 155126 get_next_file_rr: (nil)
file 155126 get_next_file: NULL
file 155126 close files

This looks like a weird error in job-file parsing. I don't get the logic behind get_file_sizes(), but the job file definitely requests 1G and not 18446744073709551615, which is -1ULL.

Dmitry
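[Editor's note: the -1ULL interpretation is easy to verify outside of fio. The snippet below only illustrates the unsigned 64-bit reinterpretation; it is not fio's code.]

```python
import ctypes

# A signed -1 reinterpreted as a 64-bit unsigned integer becomes
# 2**64 - 1, which is exactly the "file size" in the error message.
sentinel = ctypes.c_uint64(-1).value
print(sentinel)               # 18446744073709551615
print(sentinel == 2**64 - 1)  # True
```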
* Re: Using FIO to benchmark Gluster
From: Dmitry Antipov @ 2021-12-21 11:35 UTC
To: Seyed Mohammad Fakhraie; +Cc: fio

On 12/21/21 13:21, Seyed Mohammad Fakhraie wrote:

> Thanks Dmitry. As you pointed out, 18446744073709551615 is an
> interesting number. I'm just thinking out loud here and I haven't
> looked at the code yet, could it be caused by an overflow?

Looking through the fio sources, my guess is that -1 internally means "not decided (yet)". Passing -1 via gfapi tells the GlusterFS brick process to create a file of (size_t)-1 bytes, which is 2^64 - 1 and is not supported by GlusterFS. Most likely this value is too large for the underlying filesystem as well (for example, the XFS native limit is 2^63 - 1 bytes).

I would suggest considering simpler scenarios. For example, the following seems to work as expected, and it's quite similar to your job file:

fio --ioengine=gfapi --volume=test0 --brick=localhost --direct=1 \
    --create_on_open=1 --iodepth=1 --numjobs=1 --bs=4k --size=1G \
    --time_based --runtime=180 --rw=randread --name=test0

Dmitry