Linux-RDMA Archive on lore.kernel.org
From: Gioh Kim <gi-oh.kim@ionos.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Haakon Bugge <haakon.bugge@oracle.com>,
	Jinpu Wang <jinpu.wang@ionos.com>,
	OFED mailing list <linux-rdma@vger.kernel.org>,
	Bart Van Assche <bvanassche@acm.org>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Haris Iqbal <haris.iqbal@ionos.com>,
	Gioh Kim <gi-oh.kim@cloud.ionos.com>
Subject: Re: [PATCHv2 for-next 1/3] RDMA/rtrs-clt: Print more info when an error happens
Date: Tue, 13 Apr 2021 15:11:33 +0200
Message-ID: <CAJX1YtZ9LLqugvQHa77PCxpyoLx-k31bh7eXfxuVWw0NHr6xAw@mail.gmail.com>
In-Reply-To: <YHU9hZfkNEDy94+s@unreal>


[-- Attachment #1: Type: text/plain, Size: 4990 bytes --]

On Tue, Apr 13, 2021 at 8:43 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Tue, Apr 13, 2021 at 05:31:24AM +0000, Haakon Bugge wrote:
> >
> >
> > > On 12 Apr 2021, at 19:34, Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 04:00:55PM +0200, Gioh Kim wrote:
> > >> On Mon, Apr 12, 2021 at 2:54 PM Jinpu Wang <jinpu.wang@ionos.com> wrote:
> > >>>
> > >>> On Mon, Apr 12, 2021 at 2:41 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >>>>
> > >>>> On Mon, Apr 12, 2021 at 02:22:51PM +0200, Jinpu Wang wrote:
> > >>>>> On Tue, Apr 6, 2021 at 2:41 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >>>>>>
> > >>>>>> On Tue, Apr 06, 2021 at 02:36:37PM +0200, Gioh Kim wrote:
> > >>>>>>> From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > >>>>>>>
> > >>>>>>> The client prints only the error value, which is not enough for debugging.
> > >>>>>>>
> > >>>>>>> 1. When the client receives an error from the server:
> > >>>>>>> the client prints not only the error value but also
> > >>>>>>> more information about the server connection.
> > >>>>>>>
> > >>>>>>> 2. When the client fails to send IO:
> > >>>>>>> the client gets an error from the RDMA layer. It also
> > >>>>>>> prints more information about the server connection.
> > >>>>>>>
> > >>>>>>> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
> > >>>>>>> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > >>>>>>> ---
> > >>>>>>> drivers/infiniband/ulp/rtrs/rtrs-clt.c | 33 ++++++++++++++++++++++----
> > >>>>>>> 1 file changed, 29 insertions(+), 4 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > >>>>>>> index 5062328ac577..a534b2b09e13 100644
> > >>>>>>> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > >>>>>>> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > >>>>>>> @@ -437,6 +437,11 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
> > >>>>>>>      req->in_use = false;
> > >>>>>>>      req->con = NULL;
> > >>>>>>>
> > >>>>>>> +     if (unlikely(errno)) {
> > >>>>>>
> > >>>>>> I'm sorry, but all your patches are full of these likely/unlikely cargo
> > >>>>>> cult. Can you please provide supportive performance data or delete all
> > >>>>>> likely/unlikely in all rtrs code?
> > >>>>>
> > >>>>> Hi Leon,
> > >>>>>
> > >>>>> All the likely/unlikely from the non-fast path was removed as you
> > >>>>> suggested in the past.
> > >>>>> This one is on IO path, my understanding is for the fast path, with
> > >>>>> likely/unlikely macro,
> > >>>>> the compiler will optimize the code for better branch prediction.
> > >>>>
> > >>>> In theory, yes. In practice, gcc 10 generated the same assembly code when I
> > >>>> placed likely() and later replaced it with unlikely().
> > >>
> > >> Even though gcc 10 generated the same assembly code,
> > >> there is no guarantee for gcc 11 or gcc 12.
> > >>
> > >> I am reviewing rtrs source file and have found some unnecessary likely/unlikely.
> > >> But I think likely/unlikely are necessary for extreme cases.
> > >> I will have a discussion with my colleagues and inform you of the result.
> > >
> > > Please come with performance data.
> >
> > I think the best way to gather performance data is not to remove the likely/unlikely, but to swap their definitions. Less coding and a more pronounced difference - if any.
>
> In theory, it will double the gain/loss, which makes it easier to see whether
> likely/unlikely change anything.
>
> Thanks
>
> >
> >
> > Thxs, Håkon
> >

Hi,

In summary, there is no performance gap before/after swapping
likely/unlikely macros.
So I will send a patch to remove all likely/unlikely macros.

I guess that is because:
- The performance of rnbd/rtrs depends on the network and block layer.
- The network and block layer are not fast enough to be affected by
likely/unlikely.
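
As a rough sanity check of that guess, Little's law (throughput =
outstanding requests / average latency) can be applied to the figures in
the attached current.txt: iodepth=128 per process, with an average total
latency of about 9816 usec for 64 processes and about 19977 usec for 128
processes. This is only a sketch: the helper name is made up for
illustration, and it assumes every process keeps its queue full for the
whole run.

```c
#include <assert.h>

/*
 * Hypothetical helper: IOPS predicted by Little's law.
 * Assumes each of 'jobs' processes keeps 'iodepth' requests in flight
 * and that 'avg_lat_usec' is the average total latency per request.
 */
static double littles_law_iops(int jobs, int iodepth, double avg_lat_usec)
{
	assert(avg_lat_usec > 0.0);
	/* outstanding requests divided by latency in seconds */
	return (double)jobs * (double)iodepth / (avg_lat_usec * 1e-6);
}
```

With (64, 128, 9816.39) this predicts roughly 834k IOPS, and with
(128, 128, 19976.98) roughly 820k IOPS, both within about 1% of what fio
reported. If that reading is right, throughput is set by millisecond-scale
request latency, next to which a branch hint is unlikely to show up.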

I ran a fio random-read test with 32 rnbd devices and 64 or 128 processes on a
64-core server.
fio produced practically the same results before and after the swap.
Thanks to Håkon for the test idea.

Test environment:
- Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
- 376G memory
- kernel version: 5.4.86
- gcc version: gcc (Debian 8.3.0-6) 8.3.0
- Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]

Test result:
- before swapping:
32-dev/64-proc: IOPS=829k, BW=3239MiB/s
32-dev/128-proc: IOPS=816k, BW=3187MiB/s
- after swapping:
32-dev/64-proc: IOPS=829k, BW=3238MiB/s
32-dev/128-proc: IOPS=817k, BW=3191MiB/s
(128-proc is worse than 64-proc but that is another issue)
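
To put numbers on "no performance gap", the relative change can be
computed from the figures above. The sketch below is illustrative only:
both helper names are made up, and the 4096-byte block size is taken from
the bs=4096B line in the fio output.

```c
#include <assert.h>

/* Hypothetical helper: relative change from 'before' to 'after', in percent. */
static double pct_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}

/* Hypothetical helper: bandwidth in MiB/s implied by IOPS at a fixed block size. */
static double iops_to_mib_per_s(double iops, double block_bytes)
{
	return iops * block_bytes / (1024.0 * 1024.0);
}
```

pct_change(3239, 3238) is about -0.03% and pct_change(3187, 3191) is about
+0.13%. For comparison, the three 64-process runs in current.txt report
3239, 3228 and 3238 MiB/s, a spread of about 0.34%, so the before/after
difference is smaller than the run-to-run variation.
iops_to_mib_per_s(829e3, 4096) gives about 3238 MiB/s, consistent with the
reported bandwidth.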

Attached files:
- 0001-swap-likely-and-unlikely.patch: a patch file swapping likely
and unlikely to show how I tested
- after_swap.txt: raw data after swapping
- current.txt: raw data before swapping
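
For readers without the attachment at hand, the swap can be sketched in
userspace as follows. The likely()/unlikely() definitions mirror the
kernel's __builtin_expect() wrappers and assume GCC or Clang.
MYLIKELY/MYUNLIKELY are the names used in the patch, while SWAP_HINTS and
count_errors() are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's branch hints (GCC/Clang builtin). */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* SWAP_HINTS is a made-up build switch; the patch swaps unconditionally. */
#ifdef SWAP_HINTS
#define MYLIKELY(x)	unlikely(x)
#define MYUNLIKELY(x)	likely(x)
#else
#define MYLIKELY(x)	likely(x)
#define MYUNLIKELY(x)	unlikely(x)
#endif

/* Hypothetical example: count non-zero (failed) completion statuses. */
static int count_errors(const int *status, int n)
{
	int i, errors = 0;

	for (i = 0; i < n; i++) {
		/* The hint may change code layout, never the result. */
		if (MYUNLIKELY(status[i] != 0))
			errors++;
	}
	return errors;
}
```

Building once with -DSWAP_HINTS and once without gives the same return
value. Only the generated code layout may differ, which can be compared
with gcc -S or objdump -d.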

For your information, I also ran the performance test on two 8-core desktop
machines that are connected directly by InfiniBand cables, without a switch.
I got the same result with them: no performance difference.

[-- Attachment #2: current.txt --]
[-- Type: text/plain, Size: 22979 bytes --]

141 root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# git reset --hard HEAD~2
HEAD is now at 99c7c2f RDMA/rtrs-clt: destroy sysfs after removing session from active list
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# make clean
make[1]: Entering directory '/usr/src/linux-5.4.86-pserver'
make[1]: Leaving directory '/usr/src/linux-5.4.86-pserver'
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# make
make[1]: Entering directory '/usr/src/linux-5.4.86-pserver'
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-clt.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-clt-sysfs.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-common.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv-dev.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt-stats.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv-stats.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.o
  AR      /tmp/ddd/gkim/ibnbd2/built-in.a
  Building modules, stage 2.
  MODPOST 5 modules
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.ko
make[1]: Leaving directory '/usr/src/linux-5.4.86-pserver'
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo rmmod rnbd-client
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo rmmod rtrs-client
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo rmmod rtrs-core
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo insmod rtrs/rtrs-core.ko
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo insmod rtrs/rtrs-client.ko
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# sudo insmod rnbd/rnbd-client.ko




root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev.sh
fio start   : Di 13. Apr 10:38:09 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 64 processes
Jobs: 64 (f=2048): [r(64)][0.1%][r=3233MiB/s][r=828k IOPS][eta 01d:21h:06m:50s]
fiotest: (groupid=0, jobs=64): err= 0: pid=23219: Tue Apr 13 10:41:11 2021
  read: IOPS=829k, BW=3239MiB/s (3396MB/s)(569GiB/180011msec)
    slat (usec): min=82, max=156993, avg=1278.01, stdev=1683.01
    clat (nsec): min=1096, max=30634k, avg=8538323.42, stdev=2411851.29
     lat (usec): min=317, max=162011, avg=9816.39, stdev=2702.04
    clat percentiles (usec):
     |  1.00th=[ 3949],  5.00th=[ 5211], 10.00th=[ 5800], 20.00th=[ 6587],
     | 30.00th=[ 7111], 40.00th=[ 7701], 50.00th=[ 8225], 60.00th=[ 8848],
     | 70.00th=[ 9503], 80.00th=[10421], 90.00th=[11731], 95.00th=[12911],
     | 99.00th=[15270], 99.50th=[16319], 99.90th=[18482], 99.95th=[19530],
     | 99.99th=[21890]
   bw (  KiB/s): min=36864, max=166912, per=1.56%, avg=51773.23, stdev=3159.85, samples=22980
   iops        : min= 9216, max=41728, avg=12943.26, stdev=789.95, samples=22980
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.04%, 500=0.04%, 750=0.02%, 1000=0.02%
  lat (msec)   : 2=0.06%, 4=0.90%, 10=74.65%, 20=24.23%, 50=0.03%
  cpu          : usr=0.97%, sys=4.95%, ctx=82650519, majf=0, minf=3555539
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=100.0%
     issued rwts: total=149252805,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3239MiB/s (3396MB/s), 3239MiB/s-3239MiB/s (3396MB/s-3396MB/s), io=569GiB (611GB), run=180011-180011msec

Disk stats (read/write):
  rnbd0: ios=4663183/0, merge=0/0, ticks=10123475/0, in_queue=314090, util=99.62%
  rnbd1: ios=4663180/0, merge=0/0, ticks=10445268/0, in_queue=334450, util=99.65%
  rnbd2: ios=4663181/0, merge=0/0, ticks=10631138/0, in_queue=346090, util=99.65%
  rnbd3: ios=4663179/0, merge=0/0, ticks=10791046/0, in_queue=350590, util=99.67%
  rnbd4: ios=4663175/0, merge=0/0, ticks=10888045/0, in_queue=361960, util=99.68%
  rnbd5: ios=4663183/0, merge=0/0, ticks=10994697/0, in_queue=369540, util=99.70%
  rnbd6: ios=4663182/0, merge=0/0, ticks=11010726/0, in_queue=365730, util=99.71%
  rnbd7: ios=4663182/0, merge=0/0, ticks=11041009/0, in_queue=373870, util=99.71%
  rnbd8: ios=4663185/0, merge=0/0, ticks=11050961/0, in_queue=375140, util=99.74%
  rnbd9: ios=4663185/0, merge=0/0, ticks=11063691/0, in_queue=373870, util=99.75%
  rnbd10: ios=4663184/0, merge=0/0, ticks=11099119/0, in_queue=375340, util=99.76%
  rnbd11: ios=4663187/0, merge=0/0, ticks=11112331/0, in_queue=382300, util=99.78%
  rnbd12: ios=4663187/0, merge=1/0, ticks=11078851/0, in_queue=376050, util=99.79%
  rnbd13: ios=4663188/0, merge=0/0, ticks=11087422/0, in_queue=376040, util=99.80%
  rnbd14: ios=4663190/0, merge=0/0, ticks=11070282/0, in_queue=378500, util=99.80%
  rnbd15: ios=4663190/0, merge=0/0, ticks=9417418/0, in_queue=271060, util=99.81%
  rnbd16: ios=4663191/0, merge=0/0, ticks=10588441/0, in_queue=348800, util=99.84%
  rnbd17: ios=4663193/0, merge=0/0, ticks=10740263/0, in_queue=364650, util=99.84%
  rnbd18: ios=4663195/0, merge=0/0, ticks=10813752/0, in_queue=371990, util=99.86%
  rnbd19: ios=4663193/0, merge=0/0, ticks=10878352/0, in_queue=375050, util=99.87%
  rnbd20: ios=4663193/0, merge=0/0, ticks=10845686/0, in_queue=371010, util=99.88%
  rnbd21: ios=4663195/0, merge=0/0, ticks=10854889/0, in_queue=373940, util=99.90%
  rnbd22: ios=4663197/0, merge=0/0, ticks=10936251/0, in_queue=378890, util=99.90%
  rnbd23: ios=4663195/0, merge=0/0, ticks=11000989/0, in_queue=380360, util=99.92%
  rnbd24: ios=4663200/0, merge=0/0, ticks=11056302/0, in_queue=389300, util=99.93%
  rnbd25: ios=4663199/0, merge=0/0, ticks=11099625/0, in_queue=396820, util=99.93%
  rnbd26: ios=4663197/0, merge=0/0, ticks=11091101/0, in_queue=391310, util=99.95%
  rnbd27: ios=4663201/0, merge=0/0, ticks=11108242/0, in_queue=396440, util=99.96%
  rnbd28: ios=4663203/0, merge=0/0, ticks=11222083/0, in_queue=405730, util=100.00%
  rnbd29: ios=4663205/0, merge=0/0, ticks=11251353/0, in_queue=412810, util=99.99%
  rnbd30: ios=4663201/0, merge=0/0, ticks=11238249/0, in_queue=413260, util=100.00%
  rnbd31: ios=4663218/0, merge=0/0, ticks=11267469/0, in_queue=414620, util=100.00%




root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev.sh
fio start   : Di 13. Apr 10:58:33 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 64 processes
Jobs: 64 (f=2048): [r(64)][100.0%][r=3227MiB/s][r=826k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=64): err= 0: pid=25682: Tue Apr 13 11:01:35 2021
  read: IOPS=826k, BW=3228MiB/s (3385MB/s)(568GiB/180012msec)
    slat (usec): min=170, max=131883, avg=1286.52, stdev=1678.50
    clat (nsec): min=1357, max=31141k, avg=8561491.28, stdev=2417740.18
     lat (usec): min=502, max=133262, avg=9848.07, stdev=2710.62
    clat percentiles (usec):
     |  1.00th=[ 3949],  5.00th=[ 5211], 10.00th=[ 5800], 20.00th=[ 6587],
     | 30.00th=[ 7177], 40.00th=[ 7701], 50.00th=[ 8225], 60.00th=[ 8848],
     | 70.00th=[ 9503], 80.00th=[10421], 90.00th=[11863], 95.00th=[13042],
     | 99.00th=[15401], 99.50th=[16319], 99.90th=[18482], 99.95th=[19268],
     | 99.99th=[21890]
   bw (  KiB/s): min=32768, max=258048, per=1.56%, avg=51622.65, stdev=3500.81, samples=22982
   iops        : min= 8192, max=64512, avg=12905.63, stdev=875.20, samples=22982
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.02%, 250=0.03%, 500=0.02%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.07%, 4=0.89%, 10=74.32%, 20=24.59%, 50=0.03%
  cpu          : usr=0.96%, sys=5.02%, ctx=81774475, majf=0, minf=3197872
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwts: total=148772864,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3228MiB/s (3385MB/s), 3228MiB/s-3228MiB/s (3385MB/s-3385MB/s), io=568GiB (609GB), run=180012-180012msec

Disk stats (read/write):
  rnbd0: ios=4647869/0, merge=0/0, ticks=10157474/0, in_queue=328770, util=99.64%
  rnbd1: ios=4647859/0, merge=0/0, ticks=10417978/0, in_queue=340850, util=99.67%
  rnbd2: ios=4647863/0, merge=0/0, ticks=10623846/0, in_queue=357690, util=99.68%
  rnbd3: ios=4647875/0, merge=0/0, ticks=10762992/0, in_queue=364280, util=99.70%
  rnbd4: ios=4647869/0, merge=0/0, ticks=10870408/0, in_queue=371350, util=99.70%
  rnbd5: ios=4647866/0, merge=0/0, ticks=10955394/0, in_queue=377050, util=99.72%
  rnbd6: ios=4647875/0, merge=0/0, ticks=11010235/0, in_queue=383640, util=99.73%
  rnbd7: ios=4647873/0, merge=0/0, ticks=11032744/0, in_queue=385440, util=99.73%
  rnbd8: ios=4647871/0, merge=0/0, ticks=11043787/0, in_queue=384150, util=99.76%
  rnbd9: ios=4647877/0, merge=0/0, ticks=9415047/0, in_queue=281570, util=99.77%
  rnbd10: ios=4647881/0, merge=0/0, ticks=10427629/0, in_queue=348200, util=99.78%
  rnbd11: ios=4647876/0, merge=0/0, ticks=10615844/0, in_queue=362920, util=99.81%
  rnbd12: ios=4647872/0, merge=1/0, ticks=10638181/0, in_queue=366520, util=99.81%
  rnbd13: ios=4647882/0, merge=0/0, ticks=10678828/0, in_queue=368450, util=99.82%
  rnbd14: ios=4647874/0, merge=0/0, ticks=10678088/0, in_queue=370470, util=99.83%
  rnbd15: ios=4647887/0, merge=0/0, ticks=10702071/0, in_queue=369640, util=99.84%
  rnbd16: ios=4647875/0, merge=0/0, ticks=10710059/0, in_queue=374900, util=99.86%
  rnbd17: ios=4647883/0, merge=0/0, ticks=10761335/0, in_queue=378860, util=99.87%
  rnbd18: ios=4647884/0, merge=0/0, ticks=10784441/0, in_queue=379010, util=99.89%
  rnbd19: ios=4647882/0, merge=0/0, ticks=10828395/0, in_queue=380680, util=99.92%
  rnbd20: ios=4647897/0, merge=0/0, ticks=10856351/0, in_queue=384050, util=99.91%
  rnbd21: ios=4647898/0, merge=0/0, ticks=10889226/0, in_queue=385120, util=99.94%
  rnbd22: ios=4647890/0, merge=0/0, ticks=10915544/0, in_queue=389180, util=99.93%
  rnbd23: ios=4647888/0, merge=0/0, ticks=10922797/0, in_queue=392600, util=99.94%
  rnbd24: ios=4647891/0, merge=0/0, ticks=10964743/0, in_queue=395200, util=99.95%
  rnbd25: ios=4647894/0, merge=0/0, ticks=11017459/0, in_queue=408070, util=99.96%
  rnbd26: ios=4647896/0, merge=0/0, ticks=11084377/0, in_queue=405700, util=99.98%
  rnbd27: ios=4647893/0, merge=0/0, ticks=11108133/0, in_queue=407300, util=99.99%
  rnbd28: ios=4647905/0, merge=0/0, ticks=11153595/0, in_queue=416610, util=100.00%
  rnbd29: ios=4647899/0, merge=0/0, ticks=11213811/0, in_queue=420820, util=100.00%
  rnbd30: ios=4647903/0, merge=0/0, ticks=11228964/0, in_queue=421960, util=100.00%
  rnbd31: ios=4647904/0, merge=0/0, ticks=11277612/0, in_queue=424840, util=100.00%



root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev.sh
fio start   : Di 13. Apr 11:04:17 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 64 processes
Jobs: 64 (f=2048): [r(64)][100.0%][r=3239MiB/s][r=829k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=64): err= 0: pid=26523: Tue Apr 13 11:07:19 2021
  read: IOPS=829k, BW=3238MiB/s (3395MB/s)(569GiB/180019msec)
    slat (usec): min=168, max=172647, avg=1290.89, stdev=1702.42
    clat (nsec): min=1359, max=34740k, avg=8525517.65, stdev=2410440.28
     lat (usec): min=454, max=172848, avg=9816.46, stdev=2718.20
    clat percentiles (usec):
     |  1.00th=[ 3916],  5.00th=[ 5211], 10.00th=[ 5800], 20.00th=[ 6521],
     | 30.00th=[ 7111], 40.00th=[ 7635], 50.00th=[ 8225], 60.00th=[ 8848],
     | 70.00th=[ 9503], 80.00th=[10421], 90.00th=[11731], 95.00th=[12911],
     | 99.00th=[15270], 99.50th=[16188], 99.90th=[18482], 99.95th=[19530],
     | 99.99th=[22152]
   bw (  KiB/s): min=31744, max=282624, per=1.56%, avg=51770.75, stdev=3922.53, samples=22985
   iops        : min= 7936, max=70656, avg=12942.66, stdev=980.63, samples=22985
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.03%
  lat (usec)   : 100=0.02%, 250=0.04%, 500=0.02%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.06%, 4=0.91%, 10=74.79%, 20=24.07%, 50=0.04%
  cpu          : usr=1.00%, sys=4.98%, ctx=82332948, majf=0, minf=3752201
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwts: total=149203144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3238MiB/s (3395MB/s), 3238MiB/s-3238MiB/s (3395MB/s-3395MB/s), io=569GiB (611GB), run=180019-180019msec

Disk stats (read/write):
  rnbd0: ios=4661066/0, merge=0/0, ticks=10264711/0, in_queue=322110, util=99.62%
  rnbd1: ios=4661063/0, merge=0/0, ticks=10550895/0, in_queue=343350, util=99.64%
  rnbd2: ios=4661075/0, merge=0/0, ticks=10724899/0, in_queue=347820, util=99.65%
  rnbd3: ios=4661082/0, merge=0/0, ticks=10733532/0, in_queue=349620, util=99.67%
  rnbd4: ios=4661077/0, merge=0/0, ticks=10863981/0, in_queue=357250, util=99.67%
  rnbd5: ios=4661072/0, merge=0/0, ticks=10989829/0, in_queue=368510, util=99.68%
  rnbd6: ios=4661085/0, merge=0/0, ticks=11013824/0, in_queue=365960, util=99.68%
  rnbd7: ios=4661082/0, merge=0/0, ticks=11069249/0, in_queue=372950, util=99.70%
  rnbd8: ios=4661074/0, merge=0/0, ticks=9409880/0, in_queue=262130, util=99.72%
  rnbd9: ios=4661075/0, merge=0/0, ticks=10481517/0, in_queue=338960, util=99.73%
  rnbd10: ios=4661071/0, merge=0/0, ticks=10676791/0, in_queue=352930, util=99.74%
  rnbd11: ios=4661080/0, merge=0/0, ticks=10703680/0, in_queue=352040, util=99.76%
  rnbd12: ios=4661073/0, merge=0/0, ticks=10694124/0, in_queue=354740, util=99.76%
  rnbd13: ios=4661074/0, merge=0/0, ticks=10705697/0, in_queue=353340, util=99.78%
  rnbd14: ios=4661082/0, merge=0/0, ticks=10748646/0, in_queue=361020, util=99.78%
  rnbd15: ios=4661089/0, merge=0/0, ticks=10754424/0, in_queue=362580, util=99.79%
  rnbd16: ios=4661085/0, merge=0/0, ticks=10781956/0, in_queue=362570, util=99.82%
  rnbd17: ios=4661083/0, merge=0/0, ticks=10842119/0, in_queue=369510, util=99.83%
  rnbd18: ios=4661084/0, merge=0/0, ticks=10835750/0, in_queue=370490, util=99.85%
  rnbd19: ios=4661093/0, merge=0/0, ticks=10903655/0, in_queue=373100, util=99.86%
  rnbd20: ios=4661092/0, merge=0/0, ticks=10917943/0, in_queue=376360, util=99.87%
  rnbd21: ios=4661094/0, merge=0/0, ticks=10951428/0, in_queue=380590, util=99.89%
  rnbd22: ios=4661088/0, merge=0/0, ticks=10969831/0, in_queue=379130, util=99.89%
  rnbd23: ios=4661087/0, merge=0/0, ticks=11036709/0, in_queue=387860, util=99.91%
  rnbd24: ios=4661095/0, merge=0/0, ticks=11045243/0, in_queue=389770, util=99.91%
  rnbd25: ios=4661094/0, merge=0/0, ticks=11096795/0, in_queue=391280, util=99.93%
  rnbd26: ios=4661089/0, merge=0/0, ticks=11168428/0, in_queue=401420, util=99.95%
  rnbd27: ios=4661088/0, merge=0/0, ticks=11199394/0, in_queue=406780, util=99.96%
  rnbd28: ios=4661100/0, merge=0/0, ticks=11211816/0, in_queue=401820, util=99.98%
  rnbd29: ios=4661107/0, merge=0/0, ticks=11260826/0, in_queue=410480, util=99.99%
  rnbd30: ios=4661111/0, merge=0/0, ticks=11301041/0, in_queue=418590, util=99.99%
  rnbd31: ios=4661105/0, merge=0/0, ticks=11264838/0, in_queue=414460, util=100.00%

root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev_128proc.sh
fio start   : Di 13. Apr 11:36:53 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 128 processes
Jobs: 62 (f=0): [f(1),_(2),f(1),_(1),f(1),_(3),f(1),_(2),f(3),_(2),f(1),_(3),f(1),_(3),f(2),_(1),f(1),_(8),f(4),_(1),f(1),_(2),f(1),_(1),f(1),_(1),f(1),_(2),f(2),_(1),f(1),_(2),f(3),_(1),f(1),_(1),f(1),_(1),
f(2),_(1),f(2),_(1),f(4),_(1),f(1),_(5),f(1),_(1),f(1),_(2),f(2),_(6),f(6),_(1),f(3),_(1),f(1),_(4),f(5),_(1),f(3),_(3),f(1),_(1),f(2)][100.0%][r=2504MiB/s][r=641k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=128): err= 0: pid=32673: Tue Apr 13 11:39:56 2021
  read: IOPS=816k, BW=3187MiB/s (3341MB/s)(560GiB/180030msec)
    slat (usec): min=194, max=481083, avg=7768.18, stdev=5693.43
    clat (nsec): min=1163, max=40685k, avg=12208741.78, stdev=3544380.51
     lat (usec): min=509, max=483253, avg=19976.98, stdev=5877.39
    clat percentiles (usec):
     |  1.00th=[ 4555],  5.00th=[ 6849], 10.00th=[ 7963], 20.00th=[ 9372],
     | 30.00th=[10290], 40.00th=[11207], 50.00th=[11994], 60.00th=[12911],
     | 70.00th=[13829], 80.00th=[15008], 90.00th=[16712], 95.00th=[18220],
     | 99.00th=[21627], 99.50th=[23200], 99.90th=[26346], 99.95th=[27657],
     | 99.99th=[30540]
   bw (  KiB/s): min= 2048, max=173732, per=0.78%, avg=25459.52, stdev=2917.03, samples=45970
   iops        : min=  512, max=43433, avg=6364.85, stdev=729.26, samples=45970
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.03%
  lat (usec)   : 100=0.04%, 250=0.05%, 500=0.03%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.06%, 4=0.44%, 10=25.99%, 20=71.05%, 50=2.28%
  cpu          : usr=0.58%, sys=2.18%, ctx=75776088, majf=0, minf=6628006
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwts: total=146863336,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3187MiB/s (3341MB/s), 3187MiB/s-3187MiB/s (3341MB/s-3341MB/s), io=560GiB (602GB), run=180030-180030msec

Disk stats (read/write):
  rnbd0: ios=4589419/0, merge=0/0, ticks=14454693/0, in_queue=2184000, util=99.67%
  rnbd1: ios=4589419/0, merge=0/0, ticks=14939602/0, in_queue=2323370, util=99.68%
  rnbd2: ios=4589422/0, merge=0/0, ticks=15071745/0, in_queue=2363720, util=99.69%
  rnbd3: ios=4589422/0, merge=0/0, ticks=15196019/0, in_queue=2395360, util=99.70%
  rnbd4: ios=4589422/0, merge=0/0, ticks=15229235/0, in_queue=2401140, util=99.71%
  rnbd5: ios=4589423/0, merge=0/0, ticks=15265557/0, in_queue=2395570, util=99.72%
  rnbd6: ios=4589425/0, merge=0/0, ticks=15305021/0, in_queue=2405310, util=99.72%
  rnbd7: ios=4589424/0, merge=0/0, ticks=15361167/0, in_queue=2417510, util=99.73%
  rnbd8: ios=4589425/0, merge=0/0, ticks=15374308/0, in_queue=2419060, util=99.76%
  rnbd9: ios=4589427/0, merge=0/0, ticks=11516180/0, in_queue=1438280, util=99.77%
  rnbd10: ios=4589427/0, merge=0/0, ticks=14170395/0, in_queue=2147290, util=99.77%
  rnbd11: ios=4589427/0, merge=0/0, ticks=14720156/0, in_queue=2296450, util=99.79%
  rnbd12: ios=4589426/0, merge=0/0, ticks=14798828/0, in_queue=2319630, util=99.79%
  rnbd13: ios=4589426/0, merge=0/0, ticks=14779692/0, in_queue=2314110, util=99.81%
  rnbd14: ios=4589426/0, merge=0/0, ticks=14807971/0, in_queue=2320940, util=99.81%
  rnbd15: ios=4589427/0, merge=0/0, ticks=14815817/0, in_queue=2320800, util=99.82%
  rnbd16: ios=4589428/0, merge=0/0, ticks=14874511/0, in_queue=2341850, util=99.85%
  rnbd17: ios=4589428/0, merge=0/0, ticks=14884319/0, in_queue=2340410, util=99.86%
  rnbd18: ios=4589429/0, merge=0/0, ticks=14889740/0, in_queue=2350840, util=99.88%
  rnbd19: ios=4589428/0, merge=0/0, ticks=14851478/0, in_queue=2333840, util=99.89%
  rnbd20: ios=4589430/0, merge=0/0, ticks=14917096/0, in_queue=2350030, util=99.89%
  rnbd21: ios=4589430/0, merge=0/0, ticks=14896627/0, in_queue=2342230, util=99.91%
  rnbd22: ios=4589431/0, merge=0/0, ticks=14865768/0, in_queue=2323510, util=99.91%
  rnbd23: ios=4589431/0, merge=0/0, ticks=14943766/0, in_queue=2351210, util=99.92%
  rnbd24: ios=4589431/0, merge=0/0, ticks=14952482/0, in_queue=2368310, util=99.92%
  rnbd25: ios=4589431/0, merge=0/0, ticks=14966379/0, in_queue=2367030, util=99.93%
  rnbd26: ios=4589433/0, merge=0/0, ticks=14975019/0, in_queue=2368030, util=99.95%
  rnbd27: ios=4589434/0, merge=0/0, ticks=14990885/0, in_queue=2369840, util=99.96%
  rnbd28: ios=4589433/0, merge=0/0, ticks=14936498/0, in_queue=2366290, util=99.98%
  rnbd29: ios=4589433/0, merge=0/0, ticks=14986887/0, in_queue=2380530, util=99.99%
  rnbd30: ios=4589434/0, merge=1/0, ticks=15018143/0, in_queue=2395330, util=99.99%
  rnbd31: ios=4589435/0, merge=0/0, ticks=14995177/0, in_queue=2381090, util=100.00%


[-- Attachment #3: 0001-swap-likely-and-unlikely.patch --]
[-- Type: text/x-patch, Size: 20266 bytes --]

From 2636311e5e2894bd7c7800939a3b9b68e7a93bcc Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@ionos.com>
Date: Tue, 13 Apr 2021 14:00:27 +0200
Subject: [PATCH] swap likely and unlikely

---
 rtrs/rtrs-clt.c | 134 +++++++++++++++++++++++++-----------------------
 1 file changed, 70 insertions(+), 64 deletions(-)

diff --git a/rtrs/rtrs-clt.c b/rtrs/rtrs-clt.c
index 1b4b3e6..6235827 100644
--- a/rtrs/rtrs-clt.c
+++ b/rtrs/rtrs-clt.c
@@ -17,6 +17,12 @@
 #include "rtrs-clt.h"
 #include "rtrs-log.h"
 
+
+
+#define MYLIKELY(x) unlikely(x)
+#define MYUNLIKELY(x) likely(x)
+
+
 #define RTRS_CONNECT_TIMEOUT_MS 30000
 /*
  * Wait a bit before trying to reconnect after a failure
@@ -80,9 +86,9 @@ __rtrs_get_permit(struct rtrs_clt *clt, enum rtrs_clt_con_type con_type)
 	 */
 	do {
 		bit = find_first_zero_bit(clt->permits_map, max_depth);
-		if (unlikely(bit >= max_depth))
+		if (MYUNLIKELY(bit >= max_depth))
 			return NULL;
-	} while (unlikely(test_and_set_bit_lock(bit, clt->permits_map)));
+	} while (MYUNLIKELY(test_and_set_bit_lock(bit, clt->permits_map)));
 
 	permit = get_permit(clt, bit);
 	WARN_ON(permit->mem_id != bit);
@@ -120,14 +126,14 @@ struct rtrs_permit *rtrs_clt_get_permit(struct rtrs_clt *clt,
 	DEFINE_WAIT(wait);
 
 	permit = __rtrs_get_permit(clt, con_type);
-	if (likely(permit) || !can_wait)
+	if (MYLIKELY(permit) || !can_wait)
 		return permit;
 
 	do {
 		prepare_to_wait(&clt->permits_wait, &wait,
 				TASK_UNINTERRUPTIBLE);
 		permit = __rtrs_get_permit(clt, con_type);
-		if (likely(permit))
+		if (MYLIKELY(permit))
 			break;
 
 		io_schedule();
@@ -180,7 +186,7 @@ struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
 {
 	int id = 0;
 
-	if (likely(permit->con_type == RTRS_IO_CON))
+	if (MYLIKELY(permit->con_type == RTRS_IO_CON))
 		id = (permit->cpu_id % (sess->s.irq_con_num - 1)) + 1;
 
 	return to_clt_con(sess->s.con[id]);
@@ -299,7 +305,7 @@ static void rtrs_clt_fast_reg_done(struct ib_cq *cq, struct ib_wc *wc)
 {
 	struct rtrs_clt_con *con = cq->cq_context;
 
-	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+	if (MYUNLIKELY(wc->status != IB_WC_SUCCESS)) {
 		rtrs_err(con->c.sess, "Failed IB_WR_REG_MR: %s\n",
 			  ib_wc_status_msg(wc->status));
 		rtrs_rdma_error_recovery(con);
@@ -319,13 +325,13 @@ static void rtrs_clt_inv_rkey_done(struct ib_cq *cq, struct ib_wc *wc)
 		container_of(wc->wr_cqe, typeof(*req), inv_cqe);
 	struct rtrs_clt_con *con = cq->cq_context;
 
-	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+	if (MYUNLIKELY(wc->status != IB_WC_SUCCESS)) {
 		rtrs_err(con->c.sess, "Failed IB_WR_LOCAL_INV: %s\n",
 			  ib_wc_status_msg(wc->status));
 		rtrs_rdma_error_recovery(con);
 	}
 	req->need_inv = false;
-	if (likely(req->need_inv_comp))
+	if (MYLIKELY(req->need_inv_comp))
 		complete(&req->inv_comp);
 	else
 		/* Complete request from INV callback */
@@ -360,7 +366,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 	sess = to_clt_sess(con->c.sess);
 
 	if (req->sg_cnt) {
-		if (unlikely(req->dir == DMA_FROM_DEVICE && req->need_inv)) {
+		if (MYUNLIKELY(req->dir == DMA_FROM_DEVICE && req->need_inv)) {
 			/*
 			 * We are here to invalidate read requests
 			 * ourselves.  In normal scenario server should
@@ -375,7 +381,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 			 *        should do that ourselves.
 			 */
 
-			if (likely(can_wait)) {
+			if (MYLIKELY(can_wait)) {
 				req->need_inv_comp = true;
 			} else {
 				/* This should be IO path, so always notify */
@@ -386,10 +392,10 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 
 			refcount_inc(&req->ref);
 			err = rtrs_inv_rkey(req);
-			if (unlikely(err)) {
+			if (MYUNLIKELY(err)) {
 				rtrs_err(con->c.sess, "Send INV WR key=%#x: %d\n",
 					  req->mr->rkey, err);
-			} else if (likely(can_wait)) {
+			} else if (MYLIKELY(can_wait)) {
 				wait_for_completion(&req->inv_comp);
 			} else {
 				/*
@@ -414,7 +420,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 	req->in_use = false;
 	req->con = NULL;
 
-	if (unlikely(errno)) {
+	if (MYUNLIKELY(errno)) {
 		rtrs_err_rl(con->c.sess, "IO request failed: error=%d path=%s [%s:%u] notify=%d\n",
 			    errno, kobject_name(&sess->kobj), sess->hca_name, sess->hca_port, notify);
 	}
@@ -432,7 +438,7 @@ static int rtrs_post_send_rdma(struct rtrs_clt_con *con,
 	enum ib_send_flags flags;
 	struct ib_sge sge;
 
-	if (unlikely(!req->sg_size)) {
+	if (MYUNLIKELY(!req->sg_size)) {
 		rtrs_wrn(con->c.sess,
 			 "Doing RDMA Write failed, no data supplied\n");
 		return -EINVAL;
@@ -482,7 +488,7 @@ static void rtrs_clt_recv_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 	iu = container_of(wc->wr_cqe, struct rtrs_iu,
 			  cqe);
 	err = rtrs_iu_post_recv(&con->c, iu);
-	if (unlikely(err)) {
+	if (MYUNLIKELY(err)) {
 		rtrs_err(con->c.sess, "post iu failed %d\n", err);
 		rtrs_rdma_error_recovery(con);
 	}
@@ -502,7 +508,7 @@ static void rtrs_clt_rkey_rsp_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 
 	iu = container_of(wc->wr_cqe, struct rtrs_iu, cqe);
 
-	if (unlikely(wc->byte_len < sizeof(*msg))) {
+	if (MYUNLIKELY(wc->byte_len < sizeof(*msg))) {
 		rtrs_err(con->c.sess, "rkey response is malformed: size %d\n",
 			  wc->byte_len);
 		goto out;
@@ -510,7 +516,7 @@ static void rtrs_clt_rkey_rsp_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 	ib_dma_sync_single_for_cpu(sess->s.dev->ib_dev, iu->dma_addr,
 				   iu->size, DMA_FROM_DEVICE);
 	msg = iu->buf;
-	if (unlikely(le16_to_cpu(msg->type) != RTRS_MSG_RKEY_RSP)) {
+	if (MYUNLIKELY(le16_to_cpu(msg->type) != RTRS_MSG_RKEY_RSP)) {
 		rtrs_err(sess->clt, "rkey response is malformed: type %d\n",
 			  le16_to_cpu(msg->type));
 		goto out;
@@ -520,7 +526,7 @@ static void rtrs_clt_rkey_rsp_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 		goto out;
 
 	rtrs_from_imm(be32_to_cpu(wc->ex.imm_data), &imm_type, &imm_payload);
-	if (likely(imm_type == RTRS_IO_RSP_IMM ||
+	if (MYLIKELY(imm_type == RTRS_IO_RSP_IMM ||
 		   imm_type == RTRS_IO_RSP_W_INV_IMM)) {
 		u32 msg_id;
 
@@ -574,7 +580,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 	bool w_inval = false;
 	int err;
 
-	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+	if (MYUNLIKELY(wc->status != IB_WC_SUCCESS)) {
 		if (wc->status != IB_WC_WR_FLUSH_ERR) {
 			rtrs_err(sess->clt, "RDMA failed: %s\n",
 				  ib_wc_status_msg(wc->status));
@@ -594,7 +600,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 			return;
 		rtrs_from_imm(be32_to_cpu(wc->ex.imm_data),
 			       &imm_type, &imm_payload);
-		if (likely(imm_type == RTRS_IO_RSP_IMM ||
+		if (MYLIKELY(imm_type == RTRS_IO_RSP_IMM ||
 			   imm_type == RTRS_IO_RSP_W_INV_IMM)) {
 			u32 msg_id;
 
@@ -626,7 +632,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 			err = rtrs_post_recv_empty_x2(&con->c, &io_comp_cqe);
 		else
 			err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
-		if (unlikely(err)) {
+		if (MYUNLIKELY(err)) {
 			rtrs_err(con->c.sess, "rtrs_post_recv_empty(): %d\n",
 				  err);
 			rtrs_rdma_error_recovery(con);
@@ -671,7 +677,7 @@ static int post_recv_io(struct rtrs_clt_con *con, size_t q_size)
 		} else {
 			err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
 		}
-		if (unlikely(err))
+		if (MYUNLIKELY(err))
 			return err;
 	}
 
@@ -696,7 +702,7 @@ static int post_recv_sess(struct rtrs_clt_sess *sess)
 		q_size *= 2;
 
 		err = post_recv_io(to_clt_con(sess->s.con[cid]), q_size);
-		if (unlikely(err)) {
+		if (MYUNLIKELY(err)) {
 			rtrs_err(sess->clt, "post_recv_io(), err: %d\n", err);
 			return err;
 		}
@@ -757,7 +763,7 @@ static struct rtrs_clt_sess *get_next_path_rr(struct path_it *it)
 
 	ppcpu_path = this_cpu_ptr(clt->pcpu_path);
 	path = rcu_dereference(*ppcpu_path);
-	if (unlikely(!path))
+	if (MYUNLIKELY(!path))
 		path = list_first_or_null_rcu(&clt->paths_list,
 					      typeof(*path), s.entry);
 	else
@@ -788,10 +794,10 @@ static struct rtrs_clt_sess *get_next_path_min_inflight(struct path_it *it)
 	int inflight;
 
 	list_for_each_entry_rcu(sess, &clt->paths_list, s.entry) {
-		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
+		if (MYUNLIKELY(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
 			continue;
 
-		if (unlikely(!list_empty(raw_cpu_ptr(sess->mp_skip_entry))))
+		if (MYUNLIKELY(!list_empty(raw_cpu_ptr(sess->mp_skip_entry))))
 			continue;
 
 		inflight = atomic_read(&sess->stats->inflight);
@@ -839,10 +845,10 @@ static struct rtrs_clt_sess *get_next_path_min_latency(struct path_it *it)
 	ktime_t latency;
 
 	list_for_each_entry_rcu(sess, &clt->paths_list, s.entry) {
-		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
+		if (MYUNLIKELY(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
 			continue;
 
-		if (unlikely(!list_empty(raw_cpu_ptr(sess->mp_skip_entry))))
+		if (MYUNLIKELY(!list_empty(raw_cpu_ptr(sess->mp_skip_entry))))
 			continue;
 
 		latency = sess->s.hb_cur_latency;
@@ -1033,7 +1039,7 @@ static int rtrs_map_sg_fr(struct rtrs_clt_io_req *req, size_t count)
 	nr = ib_map_mr_sg(req->mr, req->sglist, count, NULL, SZ_4K);
 	if (nr < 0)
 		return nr;
-	if (unlikely(nr < req->sg_cnt))
+	if (MYUNLIKELY(nr < req->sg_cnt))
 		return -EINVAL;
 	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));
 
@@ -1057,7 +1063,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 
 	const size_t tsize = sizeof(*msg) + req->data_len + req->usr_len;
 
-	if (unlikely(tsize > sess->chunk_size)) {
+	if (MYUNLIKELY(tsize > sess->chunk_size)) {
 		rtrs_wrn(s, "Write request failed, size too big %zu > %d\n",
 			  tsize, sess->chunk_size);
 		return -EMSGSIZE;
@@ -1065,7 +1071,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 	if (req->sg_cnt) {
 		count = ib_dma_map_sg(sess->s.dev->ib_dev, req->sglist,
 				      req->sg_cnt, req->dir);
-		if (unlikely(!count)) {
+		if (MYUNLIKELY(!count)) {
 			rtrs_wrn(s, "Write request failed, map failed\n");
 			return -EINVAL;
 		}
@@ -1120,7 +1126,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 				      req->usr_len + sizeof(*msg) +
 				      sizeof(struct rtrs_sg_desc),
 				      imm, wr, &inv_wr);
-	if (unlikely(ret)) {
+	if (MYUNLIKELY(ret)) {
 		rtrs_err_rl(s, "Write request failed: error=%d path=%s [%s:%u]\n",
 			    ret, kobject_name(&sess->kobj), sess->hca_name, sess->hca_port);
 		if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT)
@@ -1150,7 +1156,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 
 	const size_t tsize = sizeof(*msg) + req->data_len + req->usr_len;
 
-	if (unlikely(tsize > sess->chunk_size)) {
+	if (MYUNLIKELY(tsize > sess->chunk_size)) {
 		rtrs_wrn(s,
 			  "Read request failed, message size is %zu, bigger than CHUNK_SIZE %d\n",
 			  tsize, sess->chunk_size);
@@ -1160,7 +1166,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 	if (req->sg_cnt) {
 		count = ib_dma_map_sg(dev->ib_dev, req->sglist, req->sg_cnt,
 				      req->dir);
-		if (unlikely(!count)) {
+		if (MYUNLIKELY(!count)) {
 			rtrs_wrn(s,
 				  "Read request failed, dma map failed\n");
 			return -EINVAL;
@@ -1234,7 +1240,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 
 	ret = rtrs_post_send_rdma(req->con, req, &sess->rbufs[buf_id],
 				   req->data_len, imm, wr);
-	if (unlikely(ret)) {
+	if (MYUNLIKELY(ret)) {
 		rtrs_err_rl(s, "Read request failed: error=%d path=%s [%s:%u]\n",
 			    ret, kobject_name(&sess->kobj), sess->hca_name, sess->hca_port);
 		if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT)
@@ -1265,7 +1271,7 @@ static int rtrs_clt_failover_req(struct rtrs_clt *clt,
 	for (path_it_init(&it, clt);
 	     (alive_sess = it.next_path(&it)) && it.i < it.clt->paths_num;
 	     it.i++) {
-		if (unlikely(READ_ONCE(alive_sess->state) !=
+		if (MYUNLIKELY(READ_ONCE(alive_sess->state) !=
 			     RTRS_CLT_CONNECTED))
 			continue;
 		req = rtrs_clt_get_copy_req(alive_sess, fail_req);
@@ -1273,7 +1279,7 @@ static int rtrs_clt_failover_req(struct rtrs_clt *clt,
 			err = rtrs_clt_write_req(req);
 		else
 			err = rtrs_clt_read_req(req);
-		if (unlikely(err)) {
+		if (MYUNLIKELY(err)) {
 			req->in_use = false;
 			continue;
 		}
@@ -1308,7 +1314,7 @@ static void fail_all_outstanding_reqs(struct rtrs_clt_sess *sess)
 		complete_rdma_req(req, -ECONNABORTED, false, true);
 
 		err = rtrs_clt_failover_req(clt, req);
-		if (unlikely(err))
+		if (MYUNLIKELY(err))
 			/* Failover failed, notify anyway */
 			req->conf(req->priv, err);
 	}
@@ -1352,7 +1358,7 @@ static int alloc_sess_reqs(struct rtrs_clt_sess *sess)
 			goto out;
 		sg_cnt = NOREG_CNT + 1;
 		req->sge = kcalloc(sg_cnt, sizeof(*req->sge), GFP_KERNEL);
-		if (unlikely(!req->sge))
+		if (MYUNLIKELY(!req->sge))
 			goto out;
 
 		req->mr = ib_alloc_mr(sess->s.dev->ib_pd, IB_MR_TYPE_MEM_REG,
@@ -1946,7 +1952,7 @@ static int rtrs_clt_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		break;
 	case RDMA_CM_EVENT_ESTABLISHED:
 		cm_err = rtrs_rdma_conn_established(con, ev);
-		if (likely(!cm_err)) {
+		if (MYLIKELY(!cm_err)) {
 			/*
 			 * Report success and wake up. Here we abuse state_wq,
 			 * i.e. wake up without state change, but we set cm_err.
@@ -2365,7 +2371,7 @@ static void rtrs_clt_info_req_done(struct ib_cq *cq, struct ib_wc *wc)
 	iu = container_of(wc->wr_cqe, struct rtrs_iu, cqe);
 	rtrs_iu_free(iu, sess->s.dev->ib_dev, 1);
 
-	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+	if (MYUNLIKELY(wc->status != IB_WC_SUCCESS)) {
 		rtrs_err(sess->clt, "Sess info request send failed: %s\n",
 			  ib_wc_status_msg(wc->status));
 		rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING_ERR, NULL);
@@ -2382,7 +2388,7 @@ static int process_info_rsp(struct rtrs_clt_sess *sess,
 	int i, sgi;
 
 	sg_cnt = le16_to_cpu(msg->sg_cnt);
-	if (unlikely(!sg_cnt || (sess->queue_depth % sg_cnt))) {
+	if (MYUNLIKELY(!sg_cnt || (sess->queue_depth % sg_cnt))) {
 		rtrs_err(sess->clt, "Incorrect sg_cnt %d, is not multiple\n",
 			  sg_cnt);
 		return -EINVAL;
@@ -2392,7 +2398,7 @@ static int process_info_rsp(struct rtrs_clt_sess *sess,
 	 * Check if IB immediate data size is enough to hold the mem_id and
 	 * the offset inside the memory chunk.
 	 */
-	if (unlikely((ilog2(sg_cnt - 1) + 1) +
+	if (MYUNLIKELY((ilog2(sg_cnt - 1) + 1) +
 		     (ilog2(sess->chunk_size - 1) + 1) >
 		     MAX_IMM_PAYL_BITS)) {
 		rtrs_err(sess->clt,
@@ -2412,7 +2418,7 @@ static int process_info_rsp(struct rtrs_clt_sess *sess,
 
 		total_len += len;
 
-		if (unlikely(!len || (len % sess->chunk_size))) {
+		if (MYUNLIKELY(!len || (len % sess->chunk_size))) {
 			rtrs_err(sess->clt, "Incorrect [%d].len %d\n", sgi,
 				  len);
 			return -EINVAL;
@@ -2426,11 +2432,11 @@ static int process_info_rsp(struct rtrs_clt_sess *sess,
 		}
 	}
 	/* Sanity check */
-	if (unlikely(sgi != sg_cnt || i != sess->queue_depth)) {
+	if (MYUNLIKELY(sgi != sg_cnt || i != sess->queue_depth)) {
 		rtrs_err(sess->clt, "Incorrect sg vector, not fully mapped\n");
 		return -EINVAL;
 	}
-	if (unlikely(total_len != sess->chunk_size * sess->queue_depth)) {
+	if (MYUNLIKELY(total_len != sess->chunk_size * sess->queue_depth)) {
 		rtrs_err(sess->clt, "Incorrect total_len %d\n", total_len);
 		return -EINVAL;
 	}
@@ -2452,14 +2458,14 @@ static void rtrs_clt_info_rsp_done(struct ib_cq *cq, struct ib_wc *wc)
 
 	WARN_ON(con->c.cid);
 	iu = container_of(wc->wr_cqe, struct rtrs_iu, cqe);
-	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+	if (MYUNLIKELY(wc->status != IB_WC_SUCCESS)) {
 		rtrs_err(sess->clt, "Sess info response recv failed: %s\n",
 			  ib_wc_status_msg(wc->status));
 		goto out;
 	}
 	WARN_ON(wc->opcode != IB_WC_RECV);
 
-	if (unlikely(wc->byte_len < sizeof(*msg))) {
+	if (MYUNLIKELY(wc->byte_len < sizeof(*msg))) {
 		rtrs_err(sess->clt, "Sess info response is malformed: size %d\n",
 			  wc->byte_len);
 		goto out;
@@ -2467,24 +2473,24 @@ static void rtrs_clt_info_rsp_done(struct ib_cq *cq, struct ib_wc *wc)
 	ib_dma_sync_single_for_cpu(sess->s.dev->ib_dev, iu->dma_addr,
 				   iu->size, DMA_FROM_DEVICE);
 	msg = iu->buf;
-	if (unlikely(le16_to_cpu(msg->type) != RTRS_MSG_INFO_RSP)) {
+	if (MYUNLIKELY(le16_to_cpu(msg->type) != RTRS_MSG_INFO_RSP)) {
 		rtrs_err(sess->clt, "Sess info response is malformed: type %d\n",
 			  le16_to_cpu(msg->type));
 		goto out;
 	}
 	rx_sz  = sizeof(*msg);
 	rx_sz += sizeof(msg->desc[0]) * le16_to_cpu(msg->sg_cnt);
-	if (unlikely(wc->byte_len < rx_sz)) {
+	if (MYUNLIKELY(wc->byte_len < rx_sz)) {
 		rtrs_err(sess->clt, "Sess info response is malformed: size %d\n",
 			  wc->byte_len);
 		goto out;
 	}
 	err = process_info_rsp(sess, msg);
-	if (unlikely(err))
+	if (MYUNLIKELY(err))
 		goto out;
 
 	err = post_recv_sess(sess);
-	if (unlikely(err))
+	if (MYUNLIKELY(err))
 		goto out;
 
 	state = RTRS_CLT_CONNECTED;
@@ -2511,13 +2517,13 @@ static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
 			       rtrs_clt_info_req_done);
 	rx_iu = rtrs_iu_alloc(1, rx_sz, GFP_KERNEL, sess->s.dev->ib_dev,
 			       DMA_FROM_DEVICE, rtrs_clt_info_rsp_done);
-	if (unlikely(!tx_iu || !rx_iu)) {
+	if (MYUNLIKELY(!tx_iu || !rx_iu)) {
 		err = -ENOMEM;
 		goto out;
 	}
 	/* Prepare for getting info response */
 	err = rtrs_iu_post_recv(&usr_con->c, rx_iu);
-	if (unlikely(err)) {
+	if (MYUNLIKELY(err)) {
 		rtrs_err(sess->clt, "rtrs_iu_post_recv(), err: %d\n", err);
 		goto out;
 	}
@@ -2532,7 +2538,7 @@ static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
 
 	/* Send info request */
 	err = rtrs_iu_post_send(&usr_con->c, tx_iu, sizeof(*msg), NULL);
-	if (unlikely(err)) {
+	if (MYUNLIKELY(err)) {
 		rtrs_err(sess->clt, "rtrs_iu_post_send(), err: %d\n", err);
 		goto out;
 	}
@@ -2543,7 +2549,7 @@ static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
 					 sess->state != RTRS_CLT_CONNECTING,
 					 msecs_to_jiffies(
 						 RTRS_CONNECT_TIMEOUT_MS));
-	if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED)) {
+	if (MYUNLIKELY(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED)) {
 		if (READ_ONCE(sess->state) == RTRS_CLT_CONNECTING_ERR)
 			err = -ECONNRESET;
 		else
@@ -2555,7 +2561,7 @@ out:
 		rtrs_iu_free(tx_iu, sess->s.dev->ib_dev, 1);
 	if (rx_iu)
 		rtrs_iu_free(rx_iu, sess->s.dev->ib_dev, 1);
-	if (unlikely(err))
+	if (MYUNLIKELY(err))
 		/* If we've never taken async path because of malloc problems */
 		rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING_ERR, NULL);
 
@@ -2899,7 +2905,7 @@ int rtrs_clt_remove_path_from_sysfs(struct rtrs_clt_sess *sess,
 							&old_state);
 	} while (!changed && old_state != RTRS_CLT_DEAD);
 
-	if (likely(changed)) {
+	if (MYLIKELY(changed)) {
 		rtrs_clt_remove_path_from_arr(sess);
 		rtrs_clt_destroy_sess_files(sess, sysfs_self);
 		kobject_put(&sess->kobj);
@@ -2971,14 +2977,14 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
 	rcu_read_lock();
 	for (path_it_init(&it, clt);
 	     (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
-		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
+		if (MYUNLIKELY(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
 			continue;
 
 		err = rtrs_clt_should_fail_request(&sess->fault_inject);
-		if (unlikely(err))
+		if (MYUNLIKELY(err))
 			continue;
 
-		if (unlikely(usr_len + hdr_len > sess->max_hdr_size)) {
+		if (MYUNLIKELY(usr_len + hdr_len > sess->max_hdr_size)) {
 			rtrs_wrn_rl(sess->clt,
 				     "%s request failed, user message size is %zu and header length %zu, but max size is %u\n",
 				     dir == READ ? "Read" : "Write",
@@ -2993,7 +2999,7 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
 			err = rtrs_clt_read_req(req);
 		else
 			err = rtrs_clt_write_req(req);
-		if (unlikely(err)) {
+		if (MYUNLIKELY(err)) {
 			req->in_use = false;
 			continue;
 		}
@@ -3017,12 +3023,12 @@ int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
 	rcu_read_lock();
 	for (path_it_init(&it, clt);
 	     (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
-		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
+		if (MYUNLIKELY(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
 			continue;
 
 		con = sess->s.con[index + 1];
 		cnt = ib_process_cq_direct(con->cq, -1);
-		if (likely(cnt))
+		if (MYLIKELY(cnt))
 			break;
 	}
 	path_it_deinit(&it);
-- 
2.25.1
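
For readers who only see the hunks above: the definitions of MYLIKELY() and MYUNLIKELY() live in an earlier part of the patch and are not reproduced here. The sketch below is an assumption about their general shape, not the actual definitions. It uses userspace stand-ins for the kernel's likely()/unlikely() built on the GCC/Clang builtin __builtin_expect, and a hypothetical SWAP_HINTS switch to invert every hint. The hints influence only code layout and branch prediction, so the function returns the same values either way.

```c
/* Userspace stand-ins for the kernel's likely()/unlikely() hints. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical switch (not from the patch): build with -DSWAP_HINTS
 * to invert every branch hint in one place.
 */
#ifdef SWAP_HINTS
#define MYLIKELY(x)   unlikely(x)
#define MYUNLIKELY(x) likely(x)
#else
#define MYLIKELY(x)   likely(x)
#define MYUNLIKELY(x) unlikely(x)
#endif

/*
 * Illustrative helper: returns 0 when status is 0, -1 otherwise.
 * The hint changes how the compiler lays out the branch, not the result.
 */
int check_status(int status)
{
	if (MYUNLIKELY(status != 0))
		return -1;
	return 0;
}
```

Building the same file once with and once without -DSWAP_HINTS gives two objects that behave identically and differ only in which branch the compiler treats as the hot path, which is the kind of comparison the fio runs below are measuring.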


[-- Attachment #4: after_swap.txt --]
[-- Type: text/plain, Size: 15842 bytes --]

root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# git show HEAD | head
commit 2636311e5e2894bd7c7800939a3b9b68e7a93bcc
Author: Gioh Kim <gi-oh.kim@ionos.com>
Date:   Tue Apr 13 14:00:27 2021 +0200

    swap likely and unlikely

diff --git a/rtrs/rtrs-clt.c b/rtrs/rtrs-clt.c
index 1b4b3e6..6235827 100644
--- a/rtrs/rtrs-clt.c
+++ b/rtrs/rtrs-clt.c


141 root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# make clean && make
make[1]: Entering directory '/usr/src/linux-5.4.86-pserver'
  CLEAN   /tmp/ddd/gkim/ibnbd2/Module.symvers
make[1]: Leaving directory '/usr/src/linux-5.4.86-pserver'
make[1]: Entering directory '/usr/src/linux-5.4.86-pserver'
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-clt.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-clt-sysfs.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-common.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv-dev.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-srv-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt-stats.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-clt-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv-stats.o
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-srv-sysfs.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.o
  AR      /tmp/ddd/gkim/ibnbd2/built-in.a 
  Building modules, stage 2.
  MODPOST 5 modules
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-client.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rnbd/rnbd-server.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-client.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-core.ko
  CC [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.mod.o
  LD [M]  /tmp/ddd/gkim/ibnbd2/rtrs/rtrs-server.ko
make[1]: Leaving directory '/usr/src/linux-5.4.86-pserver'
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# rmmod rnbd-client
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# rmmod rtrs-client
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# rmmod rtrs-core
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# insmod rtrs/rtrs-core.ko
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# insmod rtrs/rtrs-client.ko
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# insmod rnbd/rnbd-client.ko


fio start   : Di 13. Apr 12:10:30 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 64 processes
Jobs: 64 (f=1814): [r(4),f(2),r(5),f(1),r(11),f(1),r(2),f(1),r(4),f(2),r(5),f(1),r(4),f(1),r(12),f(1),r(5),f(1),r(1)][100.0%][r=3244MiB/s][r=830k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=64): err= 0: pid=37528: Tue Apr 13 12:13:32 2021
  read: IOPS=829k, BW=3238MiB/s (3395MB/s)(569GiB/180025msec)
    slat (usec): min=165, max=195365, avg=1271.82, stdev=1671.22
    clat (nsec): min=1080, max=32693k, avg=8544062.13, stdev=2411682.41
     lat (usec): min=407, max=206880, avg=9815.94, stdev=2694.31
    clat percentiles (usec):
     |  1.00th=[ 3949],  5.00th=[ 5211], 10.00th=[ 5800], 20.00th=[ 6587],
     | 30.00th=[ 7177], 40.00th=[ 7701], 50.00th=[ 8225], 60.00th=[ 8848],
     | 70.00th=[ 9503], 80.00th=[10421], 90.00th=[11731], 95.00th=[12911],
     | 99.00th=[15270], 99.50th=[16319], 99.90th=[18482], 99.95th=[19530],
     | 99.99th=[22152]
   bw (  KiB/s): min=29696, max=254980, per=1.56%, avg=51775.19, stdev=3418.36, samples=22980
   iops        : min= 7424, max=63745, avg=12943.76, stdev=854.59, samples=22980
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.04%
  lat (usec)   : 100=0.04%, 250=0.02%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.05%, 4=0.90%, 10=74.58%, 20=24.31%, 50=0.04%
  cpu          : usr=1.00%, sys=4.97%, ctx=82399686, majf=0, minf=3717209
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwts: total=149229295,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3238MiB/s (3395MB/s), 3238MiB/s-3238MiB/s (3395MB/s-3395MB/s), io=569GiB (611GB), run=180025-180025msec

Disk stats (read/write):
  rnbd0: ios=4661956/0, merge=0/0, ticks=10212115/0, in_queue=323800, util=99.67%
  rnbd1: ios=4661956/0, merge=0/0, ticks=10489817/0, in_queue=340940, util=99.70%
  rnbd2: ios=4661959/0, merge=0/0, ticks=10510042/0, in_queue=342180, util=99.70%
  rnbd3: ios=4661958/0, merge=0/0, ticks=10702745/0, in_queue=353350, util=99.73%
  rnbd4: ios=4661964/0, merge=0/0, ticks=10793914/0, in_queue=359190, util=99.74%
  rnbd5: ios=4661968/0, merge=0/0, ticks=10913714/0, in_queue=369150, util=99.74%
  rnbd6: ios=4661960/0, merge=0/0, ticks=10958094/0, in_queue=370730, util=99.74%
  rnbd7: ios=4661968/0, merge=0/0, ticks=10976320/0, in_queue=370090, util=99.76%
  rnbd8: ios=4661964/0, merge=0/0, ticks=11014804/0, in_queue=375780, util=99.79%
  rnbd9: ios=4661964/0, merge=0/0, ticks=11031969/0, in_queue=376760, util=99.80%
  rnbd10: ios=4661966/0, merge=0/0, ticks=11047729/0, in_queue=375450, util=99.81%
  rnbd11: ios=4661975/0, merge=0/0, ticks=11053595/0, in_queue=378140, util=99.83%
  rnbd12: ios=4661972/0, merge=1/0, ticks=11087759/0, in_queue=376570, util=99.83%
  rnbd13: ios=4661975/0, merge=0/0, ticks=11066221/0, in_queue=381940, util=99.85%
  rnbd14: ios=4661967/0, merge=0/0, ticks=11092973/0, in_queue=381730, util=99.85%
  rnbd15: ios=4661981/0, merge=0/0, ticks=11056803/0, in_queue=382830, util=99.86%
  rnbd16: ios=4661985/0, merge=0/0, ticks=9447901/0, in_queue=280700, util=99.89%
  rnbd17: ios=4661978/0, merge=0/0, ticks=10506961/0, in_queue=348500, util=99.90%
  rnbd18: ios=4661983/0, merge=0/0, ticks=10702411/0, in_queue=364060, util=99.92%
  rnbd19: ios=4661977/0, merge=0/0, ticks=10777160/0, in_queue=374250, util=99.92%
  rnbd20: ios=4661982/0, merge=0/0, ticks=10780637/0, in_queue=371820, util=99.93%
  rnbd21: ios=4661980/0, merge=0/0, ticks=10841533/0, in_queue=376750, util=99.95%
  rnbd22: ios=4661984/0, merge=0/0, ticks=10869817/0, in_queue=378430, util=99.95%
  rnbd23: ios=4661985/0, merge=0/0, ticks=10966341/0, in_queue=387410, util=99.96%
  rnbd24: ios=4661987/0, merge=0/0, ticks=10957613/0, in_queue=390960, util=99.96%
  rnbd25: ios=4661988/0, merge=0/0, ticks=11015585/0, in_queue=390920, util=99.97%
  rnbd26: ios=4661980/0, merge=0/0, ticks=11074411/0, in_queue=398090, util=100.00%
  rnbd27: ios=4661985/0, merge=0/0, ticks=11122911/0, in_queue=404760, util=100.00%
  rnbd28: ios=4661993/0, merge=0/0, ticks=11095077/0, in_queue=402480, util=100.00%
  rnbd29: ios=4661991/0, merge=0/0, ticks=11170485/0, in_queue=408370, util=100.00%
  rnbd30: ios=4661992/0, merge=0/0, ticks=11213819/0, in_queue=409730, util=100.00%
  rnbd31: ios=4661989/0, merge=0/0, ticks=11263063/0, in_queue=420640, util=100.00%


root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev_128proc.sh
fio start   : Di 13. Apr 12:42:42 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 128 processes
Jobs: 123 (f=2946): [r(3),f(1),r(8),f(1),r(4),f(2),r(1),f(2),r(1),f(1),r(1),f(1),r(5),f(1),r(1),f(1),r(3),_(2),r(2),f(2),r(4),f(2),r(5),f(2),r(6),f(3),r(1),f(2),r(1),_(1),r(7),f(1),r(2),f(1),r(5),f(1),r(5),f(5),r(1),f(1),r(1),_(2),f(1),r(2),f(1),r(22)][16.8%][r=3190MiB/s][r=817k IOPS][eta 15m:00s]
fiotest: (groupid=0, jobs=128): err= 0: pid=39254: Tue Apr 13 12:45:45 2021
  read: IOPS=817k, BW=3191MiB/s (3346MB/s)(561GiB/180029msec)
    slat (usec): min=69, max=412616, avg=7725.43, stdev=5595.18
    clat (nsec): min=1054, max=39850k, avg=12233217.00, stdev=3542424.84
     lat (usec): min=179, max=421190, avg=19958.71, stdev=5791.92
    clat percentiles (usec):
     |  1.00th=[ 4555],  5.00th=[ 6849], 10.00th=[ 7963], 20.00th=[ 9372],
     | 30.00th=[10290], 40.00th=[11207], 50.00th=[11994], 60.00th=[12911],
     | 70.00th=[13829], 80.00th=[15008], 90.00th=[16712], 95.00th=[18220],
     | 99.00th=[21627], 99.50th=[22938], 99.90th=[26084], 99.95th=[27395],
     | 99.99th=[30540]
   bw (  KiB/s): min= 3072, max=146432, per=0.78%, avg=25495.37, stdev=2758.09, samples=45992
   iops        : min=  768, max=36608, avg=6373.81, stdev=689.53, samples=45992
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.02%
  lat (usec)   : 100=0.02%, 250=0.03%, 500=0.04%, 750=0.03%, 1000=0.03%
  lat (msec)   : 2=0.09%, 4=0.43%, 10=25.69%, 20=71.33%, 50=2.29%
  cpu          : usr=0.57%, sys=2.20%, ctx=75748318, majf=0, minf=6305149
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.1%, >=64=100.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.1%, >=64=100.0%
     issued rwts: total=147045041,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3191MiB/s (3346MB/s), 3191MiB/s-3191MiB/s (3346MB/s-3346MB/s), io=561GiB (602GB), run=180029-180029msec


Disk stats (read/write):
  rnbd0: ios=4589511/0, merge=0/0, ticks=13423743/0, in_queue=1931790, util=99.58%
  rnbd1: ios=4589508/0, merge=0/0, ticks=14476127/0, in_queue=2228970, util=99.60%
  rnbd2: ios=4589517/0, merge=0/0, ticks=14624196/0, in_queue=2271880, util=99.61%
  rnbd3: ios=4589504/0, merge=0/0, ticks=14686013/0, in_queue=2286760, util=99.63%
  rnbd4: ios=4595158/0, merge=0/0, ticks=14772465/0, in_queue=2316590, util=99.62%
  rnbd5: ios=4595158/0, merge=0/0, ticks=14805361/0, in_queue=2314670, util=99.65%
  rnbd6: ios=4595158/0, merge=0/0, ticks=14817116/0, in_queue=2324300, util=99.65%
  rnbd7: ios=4595158/0, merge=0/0, ticks=14833164/0, in_queue=2320360, util=99.66%
  rnbd8: ios=4595158/0, merge=0/0, ticks=14900960/0, in_queue=2340200, util=99.68%
  rnbd9: ios=4595158/0, merge=0/0, ticks=14917077/0, in_queue=2345260, util=99.70%
  rnbd10: ios=4595158/0, merge=0/0, ticks=14931826/0, in_queue=2344540, util=99.71%
  rnbd11: ios=4595158/0, merge=0/0, ticks=14963132/0, in_queue=2345350, util=99.72%
  rnbd12: ios=4595158/0, merge=0/0, ticks=14978944/0, in_queue=2371930, util=99.73%
  rnbd13: ios=4595158/0, merge=0/0, ticks=14953823/0, in_queue=2349200, util=99.75%
  rnbd14: ios=4595157/0, merge=0/0, ticks=14991909/0, in_queue=2361030, util=99.75%
  rnbd15: ios=4595157/0, merge=0/0, ticks=15039741/0, in_queue=2379400, util=99.76%
  rnbd16: ios=4595157/0, merge=0/0, ticks=15057599/0, in_queue=2387550, util=99.79%
  rnbd17: ios=4595157/0, merge=0/0, ticks=15052981/0, in_queue=2378570, util=99.80%
  rnbd18: ios=4595157/0, merge=0/0, ticks=15364367/0, in_queue=2455030, util=99.83%
  rnbd19: ios=4595157/0, merge=0/0, ticks=15369998/0, in_queue=2462130, util=99.84%
  rnbd20: ios=4595157/0, merge=0/0, ticks=14953262/0, in_queue=2354080, util=99.84%
  rnbd21: ios=4595157/0, merge=0/0, ticks=15116061/0, in_queue=2404290, util=99.86%
  rnbd22: ios=4595157/0, merge=0/0, ticks=15190489/0, in_queue=2419870, util=99.86%
  rnbd23: ios=4595157/0, merge=0/0, ticks=15212165/0, in_queue=2414980, util=99.88%
  rnbd24: ios=4595157/0, merge=0/0, ticks=15225716/0, in_queue=2429100, util=99.88%
  rnbd25: ios=4595157/0, merge=0/0, ticks=15239578/0, in_queue=2428240, util=99.89%
  rnbd26: ios=4595157/0, merge=0/0, ticks=15251628/0, in_queue=2427670, util=99.92%
  rnbd27: ios=4595157/0, merge=0/0, ticks=13955168/0, in_queue=2127300, util=99.93%
  rnbd28: ios=4595157/0, merge=0/0, ticks=14694941/0, in_queue=2329440, util=99.95%
  rnbd29: ios=4595157/0, merge=0/0, ticks=14804318/0, in_queue=2355090, util=99.96%
  rnbd30: ios=4595157/0, merge=0/0, ticks=15183672/0, in_queue=2421440, util=99.96%
  rnbd31: ios=4595157/0, merge=0/0, ticks=11575822/0, in_queue=1492750, util=99.98%
  
  
root@ps401a-914.nst:/tmp/ddd/gkim/ibnbd2# bash go_32dev_128proc.sh
fio start   : Di 13. Apr 12:51:40 UTC 2021
kernel info : Linux ps401a-914 5.4.86-pserver #5.4.86-3~deb10 SMP Fri Mar 5 12:29:36 UTC 2021 x86_64 GNU/Linux
fio version : fio-3.12
gcc: gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Start fio test
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.12
Starting 128 processes
Jobs: 70 (f=470): [f(4),_(1),f(2),_(1),f(1),_(1),f(1),_(1),f(1),_(5),f(1),_(2),f(1),_(1),f(2),_(3),f(2),_(1),f(3),r(1),_(4),f(1),_(1),f(2),r(1),_(1),f(2),_(1),f(1),_(1),f(1),_(5),f(3),_(2),f(4),_(5),f(1),_(1),f(1),_(2),f(3),_(2),f(1),_(1),f(1),_(1),f(2),_(1),r(1),_(1),f(5),_(1),f(1),_(3),f(5),_(1),f(1),_(2),f(1),_(1),f(1),_(2),f(9),_(1),f(2),_(1),f(1),_(1)][1.7%][r=3210MiB/s][r=822k IOPS][eta 02h:54m:00s]
fiotest: (groupid=0, jobs=128): err= 0: pid=40166: Tue Apr 13 12:54:43 2021
  read: IOPS=817k, BW=3193MiB/s (3348MB/s)(561GiB/180023msec)
    slat (usec): min=7, max=292298, avg=7787.34, stdev=5758.03
    clat (nsec): min=1586, max=50911k, avg=12167070.96, stdev=3539907.52
     lat (usec): min=206, max=300176, avg=19954.48, stdev=5924.10
    clat percentiles (usec):
     |  1.00th=[ 4490],  5.00th=[ 6783], 10.00th=[ 7898], 20.00th=[ 9241],
     | 30.00th=[10290], 40.00th=[11076], 50.00th=[11994], 60.00th=[12780],
     | 70.00th=[13829], 80.00th=[15008], 90.00th=[16712], 95.00th=[18220],
     | 99.00th=[21627], 99.50th=[22938], 99.90th=[26084], 99.95th=[27395],
     | 99.99th=[30540]
   bw (  KiB/s): min=10240, max=156672, per=0.78%, avg=25505.77, stdev=3163.58, samples=45970
   iops        : min= 2560, max=39168, avg=6376.42, stdev=790.91, samples=45970
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.02%
  lat (usec)   : 100=0.02%, 250=0.02%, 500=0.02%, 750=0.02%, 1000=0.02%
  lat (msec)   : 2=0.11%, 4=0.49%, 10=26.37%, 20=70.68%, 50=2.22%
  lat (msec)   : 100=0.01%
  cpu          : usr=0.55%, sys=2.16%, ctx=75466111, majf=0, minf=5747705
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=0.1%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     complete  : 0=0.0%, 4=0.1%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=100.0%
     issued rwts: total=147144833,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=3193MiB/s (3348MB/s), 3193MiB/s-3193MiB/s (3348MB/s-3348MB/s), io=561GiB (603GB), run=180023-180023msec

Disk stats (read/write):
  rnbd0: ios=4598269/0, merge=0/0, ticks=14961482/0, in_queue=2294850, util=99.60%


