* Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-06 22:36 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-06 22:36 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

I'm trying to understand why our Connect-IB card is not performing as
well as our ConnectX-3 card. There are 3 ports between the two cards
and 12 paths to the iSER target, which is a RAM disk.

8: ib0.9770@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 65520 qdisc pfifo_fast state UP group default qlen 256
    link/infiniband 80:00:02:0a:fe:80:00:00:00:00:00:00:0c:c4:7a:ff:ff:4f:e5:d1 brd 00:ff:ff:ff:ff:12:40:1b:97:70:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.218.128.17/16 brd 10.218.255.255 scope global ib0.9770
    inet 10.218.202.17/16 brd 10.218.255.255 scope global secondary ib0.9770:0
    inet 10.218.203.17/16 brd 10.218.255.255 scope global secondary ib0.9770:1
    inet 10.218.204.17/16 brd 10.218.255.255 scope global secondary ib0.9770:2
    inet6 fe80::ec4:7aff:ff4f:e5d1/64 scope link
9: ib1.9770@ib1: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 65520 qdisc pfifo_fast state UP group default qlen 256
    link/infiniband 80:00:00:2d:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:00:df:90 brd 00:ff:ff:ff:ff:12:40:1b:97:70:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.219.128.17/16 brd 10.219.255.255 scope global ib1.9770
    inet 10.219.202.17/16 brd 10.219.255.255 scope global secondary ib1.9770:0
    inet 10.219.203.17/16 brd 10.219.255.255 scope global secondary ib1.9770:1
    inet 10.219.204.17/16 brd 10.219.255.255 scope global secondary ib1.9770:2
    inet6 fe80::e61d:2d03:0:df90/64 scope link
10: ib2.9770@ib2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 65520 qdisc pfifo_fast state UP group default qlen 256
    link/infiniband 80:00:00:2f:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:00:df:98 brd 00:ff:ff:ff:ff:12:40:1b:97:70:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.220.128.17/16 brd 10.220.255.255 scope global ib2.9770
    inet 10.220.202.17/16 brd 10.220.255.255 scope global secondary ib2.9770:0
    inet 10.220.203.17/16 brd 10.220.255.255 scope global secondary ib2.9770:1
    inet 10.220.204.17/16 brd 10.220.255.255 scope global secondary ib2.9770:2
    inet6 fe80::e61d:2d03:0:df98/64 scope link

The ConnectX-3 card is ib0 and the Connect-IB is ib{1,2}.
# ibv_devinfo
hca_id: mlx5_0
        transport:              InfiniBand (0)
        fw_ver:                 10.16.1006
        node_guid:              e41d:2d03:0000:df90
        sys_image_guid:         e41d:2d03:0000:df90
        vendor_id:              0x02c9
        vendor_part_id:         4113
        hw_ver:                 0x0
        board_id:               MT_1210110019
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       29
                        port_lmc:       0x00
                        link_layer:     InfiniBand
                port:   2
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       28
                        port_lmc:       0x00
                        link_layer:     InfiniBand

hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.35.5100
        node_guid:              0cc4:7aff:ff4f:e5d0
        sys_image_guid:         0cc4:7aff:ff4f:e5d3
        vendor_id:              0x02c9
        vendor_part_id:         4099
        hw_ver:                 0x0
        board_id:               SM_2221000001000
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       34
                        port_lmc:       0x00
                        link_layer:     InfiniBand

When I run fio against each path individually, I get:

disk;target IP;bandwidth;IOPs;Execution time
sdn;10.218.128.17;5053682;1263420;16599
sde;10.218.202.17;5032158;1258039;16670
sdh;10.218.203.17;4993516;1248379;16799
sdk;10.218.204.17;5081848;1270462;16507
sdc;10.219.128.17;3750942;937735;22364
sdf;10.219.202.17;3746921;936730;22388
sdi;10.219.203.17;3873929;968482;21654
sdl;10.219.204.17;3841465;960366;21837
sdd;10.220.128.17;3760358;940089;22308
sdg;10.220.202.17;3866252;966563;21697
sdj;10.220.203.17;3757495;939373;22325
sdm;10.220.204.17;4064051;1016012;20641

However, running ib_send_bw, I get:

# ib_send_bw -d mlx4_0 -i 1 10.218.128.17 -F --report_gbits
---------------------------------------------------------------------------------------
                    Send BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x3f QPN 0x02b5 PSN 0x87274e
 remote address: LID 0x22 QPN 0x0213 PSN 0xaf9232
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
Conflicting CPU frequency values detected: 3219.835000 != 3063.531000
Test integrity may be harmed !
Warning: measured timestamp frequency 2599.95 differs from nominal 3219.84 MHz
 65536      1000           50.57              50.57                 0.096461
---------------------------------------------------------------------------------------

# ib_send_bw -d mlx5_0 -i 1 10.219.128.17 -F --report_gbits
---------------------------------------------------------------------------------------
                    Send BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x12 QPN 0x003e PSN 0x75f1a0
 remote address: LID 0x1d QPN 0x003e PSN 0x7f7f71
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
Conflicting CPU frequency values detected: 3399.906000 != 2747.773000
Test integrity may be harmed !
Warning: measured timestamp frequency 2599.98 differs from nominal 3399.91 MHz
 65536      1000           52.12              52.12                 0.099414
---------------------------------------------------------------------------------------

# ib_send_bw -d mlx5_0 -i 2 10.220.128.17 -F --report_gbits
---------------------------------------------------------------------------------------
                    Send BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x0f QPN 0x0041 PSN 0xb7203d
 remote address: LID 0x1c QPN 0x0041 PSN 0xf8b80a
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
Conflicting CPU frequency values detected: 3327.796000 != 1771.046000
Test integrity may be harmed !
Warning: measured timestamp frequency 2599.97 differs from nominal 3327.8 MHz
 65536      1000           52.14              52.14                 0.099441
---------------------------------------------------------------------------------------

Here I see that the ConnectX-3 card with iSER matches the performance
of ib_send_bw. The Connect-IB, however, beats the mlx4 in ib_send_bw
but performs much worse with iSER.

This is running the 4.4.4 kernel. Are there any ideas of what I can do
to get the iSER performance out of the Connect-IB cards?

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
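(For anyone reproducing the numbers above: ib_send_bw is a client/server
tool, so each client run assumes a matching server instance already
listening on the target host, started with the same device and port but
no host argument, e.g.:

ib_send_bw -d mlx4_0 -i 1 -F --report_gbits

The "Conflicting CPU frequency values" warnings mean the cores were
changing frequency during the run; pinning the cpufreq governor on both
hosts, e.g. "cpupower frequency-set -g performance", should quiet them
and steady the numbers.)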
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-07 12:02 Max Gurtovoy

From: Max Gurtovoy @ 2016-06-07 12:02 UTC (permalink / raw)
To: Robert LeBlanc, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 6/7/2016 1:36 AM, Robert LeBlanc wrote:
> I'm trying to understand why our Connect-IB card is not performing as
> well as our ConnectX-3 card. There are 3 ports between the two cards
> and 12 paths to the iSER target, which is a RAM disk.

<snip>

> When I run fio against each path individually, I get:

What is the scenario (bs, numjobs, iodepth) for each run?
Which target do you use? Backing store?

> disk;target IP;bandwidth;IOPs;Execution time
> sdn;10.218.128.17;5053682;1263420;16599
<snip>

> However, running ib_send_bw, I get:
>
> # ib_send_bw -d mlx4_0 -i 1 10.218.128.17 -F --report_gbits
<snip>
> Here I see that the ConnectX-3 card with iSER matches the performance
> of ib_send_bw. The Connect-IB, however, beats the mlx4 in ib_send_bw
> but performs much worse with iSER.
>
> This is running the 4.4.4 kernel. Are there any ideas of what I can do
> to get the iSER performance out of the Connect-IB cards?

Did you see this regression in a different kernel?

> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-07 16:48 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-07 16:48 UTC (permalink / raw)
To: Max Gurtovoy
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

The target is LIO (same kernel) with a 200 GB RAM disk, and I'm running
fio as follows:

fio --rw=read --bs=4K --size=2G --numjobs=40 --name=worker.matt
--group_reporting --minimal | cut -d';' -f7,8,9

All of the paths use the noop scheduler with nomerges set to either 1
or 2 (the choice doesn't make a big difference); a sketch of the setup
follows the quoted text below.

I started looking into this when the 4.6 kernel wasn't performing as
well as we had gotten the 4.4 kernel to perform. I went back to the 4.4
kernel and could not replicate the 4+ million IOPs, so I started
breaking the problem down into smaller pieces and found this anomaly.
Since there haven't been any suggestions up to this point, I'll check
other kernel versions to see if it is specific to certain kernels. If
you need more information, please let me know.

Thanks,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Tue, Jun 7, 2016 at 6:02 AM, Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 6/7/2016 1:36 AM, Robert LeBlanc wrote:
>> I'm trying to understand why our Connect-IB card is not performing as
>> well as our ConnectX-3 card. There are 3 ports between the two cards
>> and 12 paths to the iSER target, which is a RAM disk.
>
> <snip>
>
>> When I run fio against each path individually, I get:
>
> What is the scenario (bs, numjobs, iodepth) for each run?
> Which target do you use? Backing store?
<snip>
>> Here I see that the ConnectX-3 card with iSER matches the performance
>> of ib_send_bw. The Connect-IB, however, beats the mlx4 in ib_send_bw
>> but performs much worse with iSER.
>>
>> This is running the 4.4.4 kernel. Are there any ideas of what I can do
>> to get the iSER performance out of the Connect-IB cards?
>
> Did you see this regression in a different kernel?
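The setup sketch promised above; the backstore name, device names, and
the loop are illustrative rather than the exact commands we ran:

# target: 200 GB RAM-disk backstore in LIO (then the usual iqn, LUN and
# portal setup, with "enable_iser boolean=true" on the portal)
targetcli /backstores/ramdisk create name=rd0 size=200G

# initiator: scheduler and nomerges on every iSER path
for d in sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm sdn; do
    echo noop > /sys/block/$d/queue/scheduler
    echo 2 > /sys/block/$d/queue/nomerges
done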
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-07 22:37 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-07 22:37 UTC (permalink / raw)
To: Max Gurtovoy
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On the 4.1.15 kernel:
sdc;10.218.128.17;3971878;992969;21120
sdd;10.218.202.17;3967745;991936;21142
sdg;10.218.203.17;3938128;984532;21301
sdk;10.218.204.17;3952602;988150;21223
sdn;10.219.128.17;4615719;1153929;18174
sdf;10.219.202.17;4622331;1155582;18148
sdi;10.219.203.17;4602297;1150574;18227
sdl;10.219.204.17;4565477;1141369;18374
sde;10.220.128.17;4594986;1148746;18256
sdh;10.220.202.17;4590209;1147552;18275
sdj;10.220.203.17;4599017;1149754;18240
sdm;10.220.204.17;4610898;1152724;18193

On the 4.6.0 kernel:
sdc;10.218.128.17;3239219;809804;25897
sdf;10.218.202.17;3321300;830325;25257
sdm;10.218.203.17;3339015;834753;25123
sdk;10.218.204.17;3637573;909393;23061
sde;10.219.128.17;3325777;831444;25223
sdl;10.219.202.17;3305464;826366;25378
sdg;10.219.203.17;3304032;826008;25389
sdn;10.219.204.17;3330001;832500;25191
sdd;10.220.128.17;4624370;1156092;18140
sdi;10.220.202.17;4619277;1154819;18160
sdj;10.220.203.17;4610138;1152534;18196
sdh;10.220.204.17;4586445;1146611;18290

A lot seems to have changed between these kernels. I have both kernels
on the box already, and I can bisect between them if you think it would
help. It is really odd that port 2 on the Connect-IB card did better
than port 1 on the 4.6.0 kernel.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Tue, Jun 7, 2016 at 10:48 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
> The target is LIO (same kernel) with a 200 GB RAM disk, and I'm running
> fio as follows:
>
> fio --rw=read --bs=4K --size=2G --numjobs=40 --name=worker.matt
> --group_reporting --minimal | cut -d';' -f7,8,9
>
> All of the paths use the noop scheduler with nomerges set to either 1
> or 2 (the choice doesn't make a big difference).
<snip>
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-08 13:52 Max Gurtovoy

From: Max Gurtovoy @ 2016-06-08 13:52 UTC (permalink / raw)
To: Robert LeBlanc
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 6/8/2016 1:37 AM, Robert LeBlanc wrote:
> On the 4.1.15 kernel:
> sdc;10.218.128.17;3971878;992969;21120
<snip>
> On the 4.6.0 kernel:
> sdc;10.218.128.17;3239219;809804;25897
<snip>
>
> A lot seems to have changed between these kernels. I have both kernels
> on the box already, and I can bisect between them if you think it would
> help. It is really odd that port 2 on the Connect-IB card did better
> than port 1 on the 4.6.0 kernel.

So in these kernels you get better performance with the C-IB than the
CX3? We need to find the bottleneck.

Can you increase the iodepth and/or the block size to see if we can
reach wire speed? Another thing to try is loading ib_iser with
always_register=N.

What is the CPU utilization on both the initiator and the target?
Did you spread the IRQ affinity?

> On Tue, Jun 7, 2016 at 10:48 AM, Robert LeBlanc wrote:
<snip>
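(A minimal way to test the always_register suggestion above, assuming
the initiator uses the in-tree ib_iser module, which exposes it as a
module parameter in kernels of this vintage; sessions must be logged out
before the module can be reloaded:

modprobe -r ib_iser
modprobe ib_iser always_register=N
cat /sys/module/ib_iser/parameters/always_register
)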
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-08 15:33 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-08 15:33 UTC (permalink / raw)
To: Max Gurtovoy
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

With 4.1.15, the C-IB card gets about 1.15 MIOPs, while the CX3 gets
about 0.99 MIOPs. But starting with the 4.4.4 kernel, the C-IB card
drops to 0.96 MIOPs and the CX3 card jumps to 1.25 MIOPs. In the 4.6.0
kernel both cards drop: the C-IB to 0.82 MIOPs and the CX3 to 1.15
MIOPs. I confirmed this morning that the card order was swapped on the
4.6.0 kernel, so it was not two ports of the C-IB performing
differently, but two different cards.

Given the limitations of the PCIe x8 slot for the CX3, I think 1.25
MIOPs is about the best we can do there. In summary, the performance of
the C-IB card drops after 4.1.15 and gets progressively worse with each
newer kernel. The CX3 card peaks at the 4.4.4 kernel and degrades a bit
on 4.6.0.

Increasing the IO depth by adding jobs does not improve performance; it
actually decreases it. Based on an average of 4 runs at each job count
from 1 to 80 (the sweep is sketched below), the Goldilocks zone is
31-57 jobs, where the difference in performance is less than 1%.

Similarly, increasing the block size does not move the numbers toward
line rate. Here is the output of the 4.6.0 kernel with a 4M block size:

sdc;10.218.128.17;3354638;819;25006
sdf;10.218.202.17;3376920;824;24841
sdm;10.218.203.17;3367431;822;24911
sdk;10.218.204.17;3378960;824;24826
sde;10.219.128.17;3366350;821;24919
sdl;10.219.202.17;3379641;825;24821
sdg;10.219.203.17;3391254;827;24736
sdn;10.219.204.17;3401706;830;24660
sdd;10.220.128.17;4597505;1122;18246
sdi;10.220.202.17;4594231;1121;18259
sdj;10.220.203.17;4667598;1139;17972
sdh;10.220.204.17;4628197;1129;18125

The CPU on the target shows a kworker thread at 96%, but no single
processor over 15%. The initiator has low fio CPU utilization (<10%)
for each job and no single CPU over 22% utilized.

I have tried manually spreading the IRQ affinity over the processors of
the respective NUMA nodes (sketched after the quoted text below), and
there was no noticeable change in performance.

Loading ib_iser with always_register=N on the initiator shows maybe a
slight increase in performance:

sdc;10.218.128.17;3396885;849221;24695
sdf;10.218.202.17;3429240;857310;24462
sdi;10.218.203.17;3454234;863558;24285
sdm;10.218.204.17;3391666;847916;24733
sde;10.219.128.17;3403914;850978;24644
sdh;10.219.202.17;3491034;872758;24029
sdk;10.219.203.17;3390569;847642;24741
sdl;10.219.204.17;3498898;874724;23975
sdd;10.220.128.17;4664743;1166185;17983
sdg;10.220.202.17;4624880;1156220;18138
sdj;10.220.203.17;4616227;1154056;18172
sdn;10.220.204.17;4619786;1154946;18158

I'd like to see the C-IB card at 1.25+ MIOPs (I know the target can
deliver that, and we were limited on the CX3 by the PCIe bus, which
isn't an issue with the x16 C-IB card for a single port). Although the
loss of performance in the CX3 card is concerning, I'm mostly focused
on the C-IB card at the moment. I will probably start bisecting 4.1.15
to 4.4.4 to see if I can identify where the performance of the C-IB
card degrades.
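For reference, the job sweep mentioned above was along these lines (a
sketch reusing the earlier fio command; the wrapper loop is
illustrative, not the exact script):

for j in $(seq 1 80); do
    for run in 1 2 3 4; do
        fio --rw=read --bs=4K --size=2G --numjobs=$j \
            --name=worker.matt --group_reporting --minimal \
            | cut -d';' -f7,8,9
    done
done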
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Wed, Jun 8, 2016 at 7:52 AM, Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> So in these kernels you get better performance with the C-IB than the
> CX3? We need to find the bottleneck.
>
> Can you increase the iodepth and/or the block size to see if we can
> reach wire speed? Another thing to try is loading ib_iser with
> always_register=N.
>
> What is the CPU utilization on both the initiator and the target?
> Did you spread the IRQ affinity?
<snip>
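The IRQ spreading mentioned above was done along these lines (a sketch;
the interrupt-name pattern and the CPU list are examples, not the exact
values used on this box):

# pin each mlx5 completion vector to a CPU in the card's NUMA node,
# round-robin over the node-local CPUs
cpus=(0 1 2 3 4 5 6 7 8 9 10 11 12 13)   # hypothetical node-local CPUs
i=0
for irq in $(awk '/mlx5/ {sub(":","",$1); print $1}' /proc/interrupts); do
    echo ${cpus[$((i % ${#cpus[@]}))]} > /proc/irq/$irq/smp_affinity_list
    i=$((i + 1))
done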
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-10 21:36 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-10 21:36 UTC (permalink / raw)
To: Max Gurtovoy
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

I bisected the kernel, and it looks like the performance of the
Connect-IB card goes down and the performance of the ConnectX-3 card
goes up with this commit (though I'm not sure why it would have that
effect):

ab46db0a3325a064bb24e826b12995d157565efb is the first bad commit
commit ab46db0a3325a064bb24e826b12995d157565efb
Author: Jiri Olsa <jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date:   Thu Dec 3 10:06:43 2015 +0100

    perf stat: Use perf_evlist__enable in handle_initial_delay

    No need to mimic the behaviour of perf_evlist__enable, we can use it
    directly.

    Signed-off-by: Jiri Olsa <jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
    Tested-by: Arnaldo Carvalho de Melo <acme-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    Cc: Adrian Hunter <adrian.hunter-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
    Cc: David Ahern <dsahern-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
    Cc: Namhyung Kim <namhyung-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
    Cc: Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
    Link: http://lkml.kernel.org/r/1449133606-14429-5-git-send-email-jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

:040000 040000 67e69893bf6d47b372e08d7089d37a7b9f602fa7 b63d9b366f078eabf86f4da3d1cc53ae7434a949 M      tools

4.4.0_rc2_3e27c920
sdc;10.218.128.17;5291495;1322873;15853
sde;10.218.202.17;4966024;1241506;16892
sdh;10.218.203.17;4980471;1245117;16843
sdk;10.218.204.17;4966612;1241653;16890
sdd;10.219.128.17;5060084;1265021;16578
sdf;10.219.202.17;5065278;1266319;16561
sdi;10.219.203.17;5047600;1261900;16619
sdl;10.219.204.17;5036992;1259248;16654
sdn;10.220.128.17;3775081;943770;22221
sdg;10.220.202.17;3758336;939584;22320
sdj;10.220.203.17;3792832;948208;22117
sdm;10.220.204.17;3771516;942879;22242

4.4.0_rc2_ab46db0a
sdc;10.218.128.17;3792146;948036;22121
sdf;10.218.202.17;3738405;934601;22439
sdj;10.218.203.17;3764239;941059;22285
sdl;10.218.204.17;3785302;946325;22161
sdd;10.219.128.17;3762382;940595;22296
sdg;10.219.202.17;3765760;941440;22276
sdi;10.219.203.17;3873751;968437;21655
sdm;10.219.204.17;3769483;942370;22254
sde;10.220.128.17;5022517;1255629;16702
sdh;10.220.202.17;5018911;1254727;16714
sdk;10.220.203.17;5037295;1259323;16653
sdn;10.220.204.17;5033064;1258266;16667

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Wed, Jun 8, 2016 at 9:33 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
<snip>
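The bisect itself was the standard procedure (the endpoint tags here
are examples, not the exact refs used):

git bisect start
git bisect bad v4.4-rc2    # kernel where the two cards swap behaviour
git bisect good v4.1       # last kernel with the old behaviour
# build and boot each suggested commit, rerun the per-path fio sweep,
# then mark the result:
git bisect good            # or: git bisect bad

Worth noting that the diffstat above shows the identified commit only
touches tools/, which should not affect the I/O path at all, so the
result may simply mean the performance difference is noisy at commit
granularity.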
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
@ 2016-06-20 15:23 Robert LeBlanc

From: Robert LeBlanc @ 2016-06-20 15:23 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA
Cc: Max Gurtovoy

Adding linux-scsi.

This last week I tried to figure out where a 10-15% decrease in
performance showed up between 4.5 and 4.6 using iSER with ConnectX-3
and Connect-IB cards (10.{218,219}.*.17 are Connect-IB and 10.220.*.17
are ConnectX-3). To review: straight RDMA transfers between the cards
achieve line rate; it is only iSER that cannot reach those same rates
for some cards on some kernels.

4.5 vanilla default config
sdc;10.218.128.17;3800048;950012;22075
sdi;10.218.202.17;3757158;939289;22327
sdg;10.218.203.17;3774062;943515;22227
sdn;10.218.204.17;3816299;954074;21981
sdd;10.219.128.17;3821863;955465;21949
sdf;10.219.202.17;3784106;946026;22168
sdj;10.219.203.17;3827094;956773;21919
sdm;10.219.204.17;3788208;947052;22144
sde;10.220.128.17;5054596;1263649;16596
sdh;10.220.202.17;5013811;1253452;16731
sdl;10.220.203.17;5052160;1263040;16604
sdk;10.220.204.17;4990248;1247562;16810

4.6 vanilla default config
sde;10.218.128.17;3431063;857765;24449
sdf;10.218.202.17;3360685;840171;24961
sdi;10.218.203.17;3355174;838793;25002
sdm;10.218.204.17;3360955;840238;24959
sdd;10.219.128.17;3337288;834322;25136
sdh;10.219.202.17;3327492;831873;25210
sdj;10.219.203.17;3380867;845216;24812
sdk;10.219.204.17;3418340;854585;24540
sdc;10.220.128.17;4668377;1167094;17969
sdg;10.220.202.17;4716675;1179168;17785
sdl;10.220.203.17;4675663;1168915;17941
sdn;10.220.204.17;4631519;1157879;18112

I narrowed the performance degradation to the series 7861728..5e47f19,
but while trying to bisect it the results were so erratic from commit
to commit that I could not figure out exactly which one introduced the
issue. If someone can give me some pointers on what to do, I can keep
trying to dig through this.
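To see what is in that range (the endpoints are the short hashes quoted
above):

git log --oneline 7861728..5e47f19

The per-commit results below are labelled with the short hash of the
commit each test kernel was built at.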
4.5.0_rc5_7861728d_00001
sdc;10.218.128.17;3747591;936897;22384
sdf;10.218.202.17;3750607;937651;22366
sdh;10.218.203.17;3750439;937609;22367
sdn;10.218.204.17;3771008;942752;22245
sde;10.219.128.17;3867678;966919;21689
sdg;10.219.202.17;3781889;945472;22181
sdk;10.219.203.17;3791804;947951;22123
sdl;10.219.204.17;3795406;948851;22102
sdd;10.220.128.17;5039110;1259777;16647
sdi;10.220.202.17;4992921;1248230;16801
sdj;10.220.203.17;5015610;1253902;16725
sdm;10.220.204.17;5087087;1271771;16490

4.5.0_rc5_f81bf458_00018
sdb;10.218.128.17;5023720;1255930;16698
sde;10.218.202.17;5016809;1254202;16721
sdj;10.218.203.17;5021915;1255478;16704
sdk;10.218.204.17;5021314;1255328;16706
sdc;10.219.128.17;4984318;1246079;16830
sdf;10.219.202.17;4986096;1246524;16824
sdh;10.219.203.17;5043958;1260989;16631
sdm;10.219.204.17;5032460;1258115;16669
sdd;10.220.128.17;3736740;934185;22449
sdg;10.220.202.17;3728767;932191;22497
sdi;10.220.203.17;3752117;938029;22357
sdl;10.220.204.17;3763901;940975;22287

4.5.0_rc5_07b63196_00027
sdb;10.218.128.17;3606142;901535;23262
sdg;10.218.202.17;3570988;892747;23491
sdf;10.218.203.17;3576011;894002;23458
sdk;10.218.204.17;3558113;889528;23576
sdc;10.219.128.17;3577384;894346;23449
sde;10.219.202.17;3575401;893850;23462
sdj;10.219.203.17;3567798;891949;23512
sdl;10.219.204.17;3584262;896065;23404
sdd;10.220.128.17;4430680;1107670;18933
sdh;10.220.202.17;4488286;1122071;18690
sdi;10.220.203.17;4487326;1121831;18694
sdm;10.220.204.17;4441236;1110309;18888

4.5.0_rc5_5e47f198_00036
sdb;10.218.128.17;3519597;879899;23834
sdi;10.218.202.17;3512229;878057;23884
sdh;10.218.203.17;3518563;879640;23841
sdk;10.218.204.17;3582119;895529;23418
sdd;10.219.128.17;3550883;887720;23624
sdj;10.219.202.17;3558415;889603;23574
sde;10.219.203.17;3552086;888021;23616
sdl;10.219.204.17;3579521;894880;23435
sdc;10.220.128.17;4532912;1133228;18506
sdf;10.220.202.17;4558035;1139508;18404
sdg;10.220.203.17;4601035;1150258;18232
sdm;10.220.204.17;4548150;1137037;18444

While bisecting the kernel, I also stumbled across one commit that
worked really well for both adapters, behaviour I haven't seen in any
release kernel:

4.5.0_rc3_1aaa57f5_00399
sdc;10.218.128.17;4627942;1156985;18126
sdf;10.218.202.17;4590963;1147740;18272
sdk;10.218.203.17;4564980;1141245;18376
sdn;10.218.204.17;4571946;1142986;18348
sdd;10.219.128.17;4591717;1147929;18269
sdi;10.219.202.17;4505644;1126411;18618
sdg;10.219.203.17;4562001;1140500;18388
sdl;10.219.204.17;4583187;1145796;18303
sde;10.220.128.17;5511568;1377892;15220
sdh;10.220.202.17;5515555;1378888;15209
sdj;10.220.203.17;5609983;1402495;14953
sdm;10.220.204.17;5509035;1377258;15227

Here the ConnectX-3 card is performing perfectly while the Connect-IB
card still has some room for improvement. I'd like to get to the bottom
of why I'm not seeing the same performance out of the newer kernels,
but I just don't understand the code. I've tried to narrow down where
the major changes happened in the kernel in the hope that it helps
someone on the list. If there is anything I can do to help out, please
let me know.
Thank you,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Fri, Jun 10, 2016 at 3:36 PM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
> I bisected the kernel and it looks like the performance of the
> Connect-IB card goes down and the performance of the ConnectX-3 card
> goes up with this commit (though I'm not sure why it would have that
> effect):
>
> ab46db0a3325a064bb24e826b12995d157565efb is the first bad commit
> commit ab46db0a3325a064bb24e826b12995d157565efb
> Author: Jiri Olsa <jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date:   Thu Dec 3 10:06:43 2015 +0100
>
>     perf stat: Use perf_evlist__enable in handle_initial_delay
>
>     No need to mimic the behaviour of perf_evlist__enable, we can use it
>     directly.
>
>     Signed-off-by: Jiri Olsa <jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>     Tested-by: Arnaldo Carvalho de Melo <acme-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     Cc: Adrian Hunter <adrian.hunter-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>     Cc: David Ahern <dsahern-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>     Cc: Namhyung Kim <namhyung-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>     Cc: Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
>     Link: http://lkml.kernel.org/r/1449133606-14429-5-git-send-email-jolsa-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
>     Signed-off-by: Arnaldo Carvalho de Melo <acme-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
> :040000 040000 67e69893bf6d47b372e08d7089d37a7b9f602fa7
> b63d9b366f078eabf86f4da3d1cc53ae7434a949 M  tools
>
> 4.4.0_rc2_3e27c920
> sdc;10.218.128.17;5291495;1322873;15853
> sde;10.218.202.17;4966024;1241506;16892
> sdh;10.218.203.17;4980471;1245117;16843
> sdk;10.218.204.17;4966612;1241653;16890
> sdd;10.219.128.17;5060084;1265021;16578
> sdf;10.219.202.17;5065278;1266319;16561
> sdi;10.219.203.17;5047600;1261900;16619
> sdl;10.219.204.17;5036992;1259248;16654
> sdn;10.220.128.17;3775081;943770;22221
> sdg;10.220.202.17;3758336;939584;22320
> sdj;10.220.203.17;3792832;948208;22117
> sdm;10.220.204.17;3771516;942879;22242
>
> 4.4.0_rc2_ab46db0a
> sdc;10.218.128.17;3792146;948036;22121
> sdf;10.218.202.17;3738405;934601;22439
> sdj;10.218.203.17;3764239;941059;22285
> sdl;10.218.204.17;3785302;946325;22161
> sdd;10.219.128.17;3762382;940595;22296
> sdg;10.219.202.17;3765760;941440;22276
> sdi;10.219.203.17;3873751;968437;21655
> sdm;10.219.204.17;3769483;942370;22254
> sde;10.220.128.17;5022517;1255629;16702
> sdh;10.220.202.17;5018911;1254727;16714
> sdk;10.220.203.17;5037295;1259323;16653
> sdn;10.220.204.17;5033064;1258266;16667
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Wed, Jun 8, 2016 at 9:33 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
>> With 4.1.15, the C-IB card gets about 1.15 MIOPs, while the CX3 gets
>> about 0.99 MIOPs. But starting with the 4.4.4 kernel, the C-IB card
>> drops to 0.96 MIOPs and the CX3 card jumps to 1.25 MIOPs. In the 4.6.0
>> kernel, both cards drop, the C-IB to 0.82 MIOPs and the CX3 to 1.15
>> MIOPs. I confirmed this morning that the card order was swapped on the
>> 4.6.0 kernel, and that it was not different ports of the C-IB
>> performing differently, but different cards.
>>
>> Given the limitations of the PCIe 8x slot for the CX3, I think 1.25
>> MIOPs is about the best we can do there. In summary, the performance
>> of the C-IB card drops after 4.1.15 and gets progressively worse as
>> the kernel versions increase.
>> The CX3 card peaks at the 4.4.4 kernel and degrades a bit on the
>> 4.6.0 kernel.
>>
>> Increasing the IO depth by adding jobs does not improve performance;
>> it actually decreases it. Based on an average of 4 runs at each job
>> count from 1 to 80, the Goldilocks zone is 31-57 jobs, where the
>> difference in performance is less than 1%.
>>
>> Similarly, increasing the block size does not really move the
>> figures toward line speed.
>>
>> Here is the output of the 4.6.0 kernel with a 4M block size:
>> sdc;10.218.128.17;3354638;819;25006
>> sdf;10.218.202.17;3376920;824;24841
>> sdm;10.218.203.17;3367431;822;24911
>> sdk;10.218.204.17;3378960;824;24826
>> sde;10.219.128.17;3366350;821;24919
>> sdl;10.219.202.17;3379641;825;24821
>> sdg;10.219.203.17;3391254;827;24736
>> sdn;10.219.204.17;3401706;830;24660
>> sdd;10.220.128.17;4597505;1122;18246
>> sdi;10.220.202.17;4594231;1121;18259
>> sdj;10.220.203.17;4667598;1139;17972
>> sdh;10.220.204.17;4628197;1129;18125
>>
>> On the target, the busiest thread is a kworker at 96% CPU, but no
>> single processor is over 15%. The initiator has low fio CPU
>> utilization (<10%) for each job and no single CPU over 22% utilized.
>>
>> I have tried manually spreading the IRQ affinity over the processors
>> of the respective NUMA nodes, and there was no noticeable change in
>> performance when doing so.
>>
>> Loading ib_iser with always_register=N on the initiator shows maybe
>> a slight increase in performance:
>>
>> sdc;10.218.128.17;3396885;849221;24695
>> sdf;10.218.202.17;3429240;857310;24462
>> sdi;10.218.203.17;3454234;863558;24285
>> sdm;10.218.204.17;3391666;847916;24733
>> sde;10.219.128.17;3403914;850978;24644
>> sdh;10.219.202.17;3491034;872758;24029
>> sdk;10.219.203.17;3390569;847642;24741
>> sdl;10.219.204.17;3498898;874724;23975
>> sdd;10.220.128.17;4664743;1166185;17983
>> sdg;10.220.202.17;4624880;1156220;18138
>> sdj;10.220.203.17;4616227;1154056;18172
>> sdn;10.220.204.17;4619786;1154946;18158
>>
>> I'd like to see the C-IB card at 1.25+ MIOPs (I know the target can
>> sustain that rate, and we were limited on the CX3 by the PCIe bus,
>> which isn't an issue for a single port on the 16x C-IB card).
>> Although the loss of performance in the CX3 card is concerning, I'm
>> mostly focused on the C-IB card at the moment. I will probably start
>> bisecting 4.1.15 to 4.4.4 to see if I can identify when the
>> performance of the C-IB card degrades.
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Wed, Jun 8, 2016 at 7:52 AM, Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>>
>>> On 6/8/2016 1:37 AM, Robert LeBlanc wrote:
>>>>
>>>> On the 4.1.15 kernel:
>>>> sdc;10.218.128.17;3971878;992969;21120
>>>> sdd;10.218.202.17;3967745;991936;21142
>>>> sdg;10.218.203.17;3938128;984532;21301
>>>> sdk;10.218.204.17;3952602;988150;21223
>>>> sdn;10.219.128.17;4615719;1153929;18174
>>>> sdf;10.219.202.17;4622331;1155582;18148
>>>> sdi;10.219.203.17;4602297;1150574;18227
>>>> sdl;10.219.204.17;4565477;1141369;18374
>>>> sde;10.220.128.17;4594986;1148746;18256
>>>> sdh;10.220.202.17;4590209;1147552;18275
>>>> sdj;10.220.203.17;4599017;1149754;18240
>>>> sdm;10.220.204.17;4610898;1152724;18193
>>>>
>>>> On the 4.6.0 kernel:
>>>> sdc;10.218.128.17;3239219;809804;25897
>>>> sdf;10.218.202.17;3321300;830325;25257
>>>> sdm;10.218.203.17;3339015;834753;25123
>>>> sdk;10.218.204.17;3637573;909393;23061
>>>> sde;10.219.128.17;3325777;831444;25223
>>>> sdl;10.219.202.17;3305464;826366;25378
>>>> sdg;10.219.203.17;3304032;826008;25389
>>>> sdn;10.219.204.17;3330001;832500;25191
>>>> sdd;10.220.128.17;4624370;1156092;18140
>>>> sdi;10.220.202.17;4619277;1154819;18160
>>>> sdj;10.220.203.17;4610138;1152534;18196
>>>> sdh;10.220.204.17;4586445;1146611;18290
>>>>
>>>> It seems that there are a lot of changes between the kernels. I had
>>>> these kernels already on the box and I can bisect them if you think
>>>> it would help. It is really odd that port 2 on the Connect-IB card
>>>> did better than port 1 on the 4.6.0 kernel.
>>>> ----------------
>>>> Robert LeBlanc
>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>>
>>> So in these kernels you get better performance with the C-IB than
>>> the CX3? We need to find the bottleneck.
>>> Can you increase the iodepth and/or block size to see if we can
>>> reach the wire speed?
>>> Another try is to load ib_iser with always_register=N.
>>>
>>> What is the CPU utilization in both initiator/target?
>>> Did you spread the IRQ affinity?
>>>
>>>>
>>>> On Tue, Jun 7, 2016 at 10:48 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
>>>>>
>>>>> The target is LIO (same kernel) with a 200 GB RAM disk, and I'm
>>>>> running fio as follows:
>>>>>
>>>>> fio --rw=read --bs=4K --size=2G --numjobs=40 --name=worker.matt
>>>>> --group_reporting --minimal | cut -d';' -f7,8,9
>>>>>
>>>>> All of the paths are set the same, with the noop scheduler and
>>>>> nomerges set to either 1 or 2 (doesn't make a big difference).
>>>>>
>>>>> I started looking into this when the 4.6 kernel wasn't performing
>>>>> as well as we were able to get the 4.4 kernel to work. I went back
>>>>> to the 4.4 kernel and I could not replicate the 4+ million IOPs.
>>>>> So I started breaking the problem down into smaller pieces and
>>>>> found this anomaly. Since there haven't been any suggestions up to
>>>>> this point, I'll check other kernel versions to see if it is
>>>>> specific to certain kernels. If you need more information, please
>>>>> let me know.
>>>>>
>>>>> Thanks,
>>>>> ----------------
>>>>> Robert LeBlanc
>>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>>>>
>>>>>
>>>>> On Tue, Jun 7, 2016 at 6:02 AM, Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>>>>>
>>>>>> On 6/7/2016 1:36 AM, Robert LeBlanc wrote:
>>>>>>>
>>>>>>> I'm trying to understand why our Connect-IB card is not performing as
>>>>>>> well as our ConnectX-3 card. There are 3 ports between the two cards
>>>>>>> and 12 paths to the iSER target which is a RAM disk.
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>> When I run fio against each path individually, I get:
>>>>>>
>>>>>> What is the scenario (bs, numjobs, iodepth) for each run?
>>>>>> Which target do you use? Backing store?
>>>>>>
>>>>>>> disk;target IP;bandwidth,IOPs,Execution time
>>>>>>> <snip>
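P.S. "Manually spreading the IRQ affinity" in the quoted results above
was done along these lines (a rough sketch; the mlx5 match in
/proc/interrupts and the 16 cores per NUMA node are assumptions about
this particular box):

  irqs=$(grep mlx5 /proc/interrupts | awk -F: '{print $1}')
  cpu=0
  for irq in $irqs; do
      # round-robin the HCA's completion IRQs over NUMA-local cores
      echo $cpu > /proc/irq/$irq/smp_affinity_list
      cpu=$(( (cpu + 1) % 16 ))
  done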
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFqoV-5HK0c+LdEbuxd81Vm=g=WE3cQgp47dH-yfYjZjGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-20 21:27   ` Max Gurtovoy
       [not found]     ` <3646a0c9-3f2d-66b8-c4da-c91ca1d01cee-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-21 13:08   ` Sagi Grimberg
  1 sibling, 1 reply; 20+ messages in thread
From: Max Gurtovoy @ 2016-06-20 21:27 UTC (permalink / raw)
  To: Robert LeBlanc, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

Did you see this kind of regression in SRP? Or with some other target
(e.g. TGT)?
Trying to understand if it's a ULP issue or an LLD one...

On 6/20/2016 6:23 PM, Robert LeBlanc wrote:
> Adding linux-scsi
>
> This last week I tried to figure out where a 10-15% decrease in
> performance showed up between 4.5 and 4.6 using iSER with ConnectX-3
> and Connect-IB cards (10.{218,219}.*.17 are Connect-IB and 10.220.*.17
> are ConnectX-3).
> <snip>
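For the TGT option, a RAM-disk iSER target can be thrown together
roughly like this (a sketch of stock tgtadm usage, untested here; the
IQN and the /dev/ram0 backing device are placeholders):

  tgtd
  tgtadm --lld iser --mode target --op new --tid 1 \
         --targetname iqn.2016-06.test:ram0
  tgtadm --lld iser --mode logicalunit --op new --tid 1 --lun 1 \
         --backing-store /dev/ram0
  tgtadm --lld iser --mode target --op bind --tid 1 --initiator-address ALL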
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <3646a0c9-3f2d-66b8-c4da-c91ca1d01cee-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-20 21:52   ` Robert LeBlanc
  0 siblings, 0 replies; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-20 21:52 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

I can test with SRP and report back what I find (I haven't used SRP in
years, so I'll need to brush up on it).
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Mon, Jun 20, 2016 at 3:27 PM, Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Did you see this kind of regression in SRP? Or with some other target
> (e.g. TGT)?
> Trying to understand if it's a ULP issue or an LLD one...
> <snip>
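For the archive, the SRP bring-up I need to brush up on amounts to
roughly this on the initiator (a sketch; the umad device, the HCA/port
in the sysfs path, and the target-string fields are placeholders, not
our fabric's actual values):

  modprobe ib_srp
  # print ready-made target strings for everything SRP-capable on the fabric
  ibsrpdm -c -d /dev/infiniband/umad0
  # log in through one local port by writing a discovered string back
  echo "id_ext=...,ioc_guid=...,dgid=...,pkey=ffff,service_id=..." \
      > /sys/class/infiniband_srp/srp-mlx5_0-1/add_target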
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFqoV-5HK0c+LdEbuxd81Vm=g=WE3cQgp47dH-yfYjZjGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-20 21:27   ` Max Gurtovoy
@ 2016-06-21 13:08   ` Sagi Grimberg
       [not found]     ` <57693C6A.3020805-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2016-06-21 13:08 UTC (permalink / raw)
  To: Robert LeBlanc, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA
  Cc: Max Gurtovoy

Hey Robert,

> I narrowed the performance degradation to the series 7861728..5e47f19,
> but while trying to bisect it, the results were so erratic from one
> commit to the next that I could not figure out exactly which commit
> introduced the issue. If someone could give me some pointers on what
> to do, I can keep trying to dig through this.

This bisection brings up suspects:

e3416ab2d156 iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc
d1ca2ed7dcf8 iser-target: Kill struct isert_rdma_wr
9679cc51eb13 iser-target: Convert to new CQ API
5adabdd122e4 iser-target: Split and properly type the login buffer
ed1083b251f0 iser-target: Remove ISER_RECV_DATA_SEG_LEN
26c7b673db57 iser-target: Remove impossible condition from isert_wait_conn
69c48846f1c7 iser-target: Remove redundant wait in release_conn
6d1fba0c2cc7 iser-target: Rework connection termination
f81bf458208e iser-target: Separate flows for np listeners and connections cma events
aea92980601f iser-target: Add new state ISER_CONN_BOUND to isert_conn
b89a7c25462b iser-target: Fix identification of login rx descriptor type

However, I don't really see performance implications in these patches,
not to mention something that would specifically affect Connect-IB...

Given that your bisection brings up target-side patches, I have a
couple of questions:

1. Is the CPU usage on the target side at 100%, or is the initiator
side the bottleneck?

2. Would it be possible to use another target implementation? TGT maybe?

3. Can you try testing right before 9679cc51eb13? This is a patch that
involves the data plane.

4. Can you try the latest upstream kernel? The iser target code now
uses a generic data-transfer library and I'm interested in knowing
what the status is there.

Cheers,
Sagi.
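(For question 3, something like this should land immediately before
the CQ API conversion, assuming a plain mainline checkout:

  git checkout 9679cc51eb13^   # parent of "iser-target: Convert to new CQ API"

then rebuild and boot the target kernel and re-run the same fio sweep.)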
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <57693C6A.3020805-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2016-06-21 14:50   ` Robert LeBlanc
       [not found]     ` <CAANLjFpUyAYB+ZzMwFKBpa4yLmALPzcRGJX1kExVrLARZmZRkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-21 14:50 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

Sagi,

I'm working on setting up SRP (I think I have it all working) to test
some of the commits. I can try TGT afterwards, along with the commit
you mention. I haven't been watching the CPU lately, but in earlier
rounds of testing there wasn't any one thread at 100%. Several threads
have high utilization, but none hit 100%, and there is plenty of CPU
capacity available (32 cores). I can capture some of that data if it
is helpful. I did test 4.7-rc3 on Friday, but it didn't change much;
is that "new" enough?

4.7.0_rc3_5edb5649
sdc;10.218.128.17;3260244;815061;25730
sdg;10.218.202.17;3405988;851497;24629
sdh;10.218.203.17;3307419;826854;25363
sdm;10.218.204.17;3430502;857625;24453
sdi;10.219.128.17;3544282;886070;23668
sdj;10.219.202.17;3412083;853020;24585
sdk;10.219.203.17;3422385;855596;24511
sdl;10.219.204.17;3444164;861041;24356
sdb;10.220.128.17;4803646;1200911;17463
sdd;10.220.202.17;4832982;1208245;17357
sde;10.220.203.17;4809430;1202357;17442
sdf;10.220.204.17;4808878;1202219;17444

Thanks for the suggestions, I'll work to get some of the requested
data back to you quickly.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Tue, Jun 21, 2016 at 7:08 AM, Sagi Grimberg <sagigrim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hey Robert,
> <snip>
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFpUyAYB+ZzMwFKBpa4yLmALPzcRGJX1kExVrLARZmZRkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-21 20:26   ` Robert LeBlanc
       [not found]     ` <CAANLjFpeL0AkuGW-q5Bmm-dff0UqFOM_sAOaG7=vyqmwnOoTcQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-22 16:21   ` Sagi Grimberg
  0 siblings, 2 replies; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-21 20:26 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

Sagi & Max,

Here are the results for SRP using the same ramdisk backstore I was
using with iSER (as close to identical as it can be across reboots and
restoring the targetcli config). I also tested the commit before
9679cc51eb13 (5adabdd122e471fe978d49471624bab08b5373a7), which is
included here. I'm not seeing a correlation between iSER and SRP that
would lead me to believe the changes affect both implementations. Does
this provide enough information, or do you think TGT will be needed?

4.4 (afd2ff9) vanilla default config
sdb;10.218.128.17;5150176;1287544;16288
sdd;10.218.202.17;5092337;1273084;16473
sdh;10.218.203.17;5129078;1282269;16355
sdk;10.218.204.17;5129078;1282269;16355
sdg;10.219.128.17;5155874;1288968;16270
sdf;10.219.202.17;5131588;1282897;16347
sdi;10.219.203.17;5165399;1291349;16240
sdl;10.219.204.17;5157459;1289364;16265
sdc;10.220.128.17;3684223;921055;22769
sde;10.220.202.17;3692169;923042;22720
sdj;10.220.203.17;3699170;924792;22677
sdm;10.220.204.17;3697865;924466;22685
mlx5_0;sde;2968368;742092;28260
mlx4_0;sdd;3325645;831411;25224
mlx5_0;sdc;3023466;755866;27745

4.4.0_rc2_3e27c920
sdc;10.218.128.17;5291495;1322873;15853
sde;10.218.202.17;4966024;1241506;16892
sdh;10.218.203.17;4980471;1245117;16843
sdk;10.218.204.17;4966612;1241653;16890
sdd;10.219.128.17;5060084;1265021;16578
sdf;10.219.202.17;5065278;1266319;16561
sdi;10.219.203.17;5047600;1261900;16619
sdl;10.219.204.17;5036992;1259248;16654
sdn;10.220.128.17;3775081;943770;22221
sdg;10.220.202.17;3758336;939584;22320
sdj;10.220.203.17;3792832;948208;22117
sdm;10.220.204.17;3771516;942879;22242
mlx4_0;sde;4648715;1162178;18045   ~73% CPU ib_srpt_compl
mlx5_0;sdd;3476566;869141;24129    ~80% CPU ib_srpt_compl
mlx5_0;sdc;3492343;873085;24020

4.4.0_rc2_ab46db0a
sdc;10.218.128.17;3792146;948036;22121
sdf;10.218.202.17;3738405;934601;22439
sdj;10.218.203.17;3764239;941059;22285
sdl;10.218.204.17;3785302;946325;22161
sdd;10.219.128.17;3762382;940595;22296
sdg;10.219.202.17;3765760;941440;22276
sdi;10.219.203.17;3873751;968437;21655
sdm;10.219.204.17;3769483;942370;22254
sde;10.220.128.17;5022517;1255629;16702
sdh;10.220.202.17;5018911;1254727;16714
sdk;10.220.203.17;5037295;1259323;16653
sdn;10.220.204.17;5033064;1258266;16667
mlx4_0;sde;4635358;1158839;18097
mlx5_0;sdd;3459077;864769;24251
mlx5_0;sdc;3465650;866412;24205

4.5.0_rc3_1aaa57f5_00399
sdc;10.218.128.17;4627942;1156985;18126
sdf;10.218.202.17;4590963;1147740;18272
sdk;10.218.203.17;4564980;1141245;18376
sdn;10.218.204.17;4571946;1142986;18348
sdd;10.219.128.17;4591717;1147929;18269
sdi;10.219.202.17;4505644;1126411;18618
sdg;10.219.203.17;4562001;1140500;18388
sdl;10.219.204.17;4583187;1145796;18303
sde;10.220.128.17;5511568;1377892;15220
sdh;10.220.202.17;5515555;1378888;15209
sdj;10.220.203.17;5609983;1402495;14953
sdm;10.220.204.17;5509035;1377258;15227
mlx5_0;sde;3593013;898253;23347   100% CPU kworker/u69:2
mlx5_0;sdd;3588555;897138;23376   100% CPU kworker/u69:2
mlx4_0;sdc;3525662;881415;23793   100% CPU kworker/u68:0

4.5.0_rc5_7861728d_00001
sdc;10.218.128.17;3747591;936897;22384
sdf;10.218.202.17;3750607;937651;22366
sdh;10.218.203.17;3750439;937609;22367
sdn;10.218.204.17;3771008;942752;22245
sde;10.219.128.17;3867678;966919;21689
sdg;10.219.202.17;3781889;945472;22181
sdk;10.219.203.17;3791804;947951;22123
sdl;10.219.204.17;3795406;948851;22102
sdd;10.220.128.17;5039110;1259777;16647
sdi;10.220.202.17;4992921;1248230;16801
sdj;10.220.203.17;5015610;1253902;16725
sdm;10.220.204.17;5087087;1271771;16490
mlx5_0;sde;2930722;732680;28623   ~98% CPU kworker/u69:0
mlx5_0;sdd;2910891;727722;28818   ~98% CPU kworker/u69:0
mlx4_0;sdc;3263668;815917;25703   ~98% CPU kworker/u68:0

4.5.0_rc5_f81bf458_00018
sdb;10.218.128.17;5023720;1255930;16698
sde;10.218.202.17;5016809;1254202;16721
sdj;10.218.203.17;5021915;1255478;16704
sdk;10.218.204.17;5021314;1255328;16706
sdc;10.219.128.17;4984318;1246079;16830
sdf;10.219.202.17;4986096;1246524;16824
sdh;10.219.203.17;5043958;1260989;16631
sdm;10.219.204.17;5032460;1258115;16669
sdd;10.220.128.17;3736740;934185;22449
sdg;10.220.202.17;3728767;932191;22497
sdi;10.220.203.17;3752117;938029;22357
sdl;10.220.204.17;3763901;940975;22287
srpt keeps crashing; couldn't test

4.5.0_rc5_5adabdd1_00023
sdc;10.218.128.17;3726448;931612;22511   ~97% CPU kworker/u69:4
sdf;10.218.202.17;3750271;937567;22368
sdi;10.218.203.17;3749266;937316;22374
sdj;10.218.204.17;3798844;949711;22082
sde;10.219.128.17;3759852;939963;22311   ~97% CPU kworker/u69:4
sdg;10.219.202.17;3772534;943133;22236
sdl;10.219.203.17;3769483;942370;22254
sdn;10.219.204.17;3790604;947651;22130
sdd;10.220.128.17;5171130;1292782;16222   ~96% CPU kworker/u68:3
sdh;10.220.202.17;5105354;1276338;16431
sdk;10.220.203.17;4995300;1248825;16793
sdm;10.220.204.17;4959564;1239891;16914
srpt crashes

4.5.0_rc5_07b63196_00027
sdb;10.218.128.17;3606142;901535;23262
sdg;10.218.202.17;3570988;892747;23491
sdf;10.218.203.17;3576011;894002;23458
sdk;10.218.204.17;3558113;889528;23576
sdc;10.219.128.17;3577384;894346;23449
sde;10.219.202.17;3575401;893850;23462
sdj;10.219.203.17;3567798;891949;23512
sdl;10.219.204.17;3584262;896065;23404
sdd;10.220.128.17;4430680;1107670;18933
sdh;10.220.202.17;4488286;1122071;18690
sdi;10.220.203.17;4487326;1121831;18694
sdm;10.220.204.17;4441236;1110309;18888
srpt crashes

4.5.0_rc5_5e47f198_00036
sdb;10.218.128.17;3519597;879899;23834
sdi;10.218.202.17;3512229;878057;23884
sdh;10.218.203.17;3518563;879640;23841
sdk;10.218.204.17;3582119;895529;23418
sdd;10.219.128.17;3550883;887720;23624
sdj;10.219.202.17;3558415;889603;23574
sde;10.219.203.17;3552086;888021;23616
sdl;10.219.204.17;3579521;894880;23435
sdc;10.220.128.17;4532912;1133228;18506
sdf;10.220.202.17;4558035;1139508;18404
sdg;10.220.203.17;4601035;1150258;18232
sdm;10.220.204.17;4548150;1137037;18444
srpt crashes

4.6.2 vanilla default config
sde;10.218.128.17;3431063;857765;24449
sdf;10.218.202.17;3360685;840171;24961
sdi;10.218.203.17;3355174;838793;25002
sdm;10.218.204.17;3360955;840238;24959
sdd;10.219.128.17;3337288;834322;25136
sdh;10.219.202.17;3327492;831873;25210
sdj;10.219.203.17;3380867;845216;24812
sdk;10.219.204.17;3418340;854585;24540
sdc;10.220.128.17;4668377;1167094;17969
sdg;10.220.202.17;4716675;1179168;17785
sdl;10.220.203.17;4675663;1168915;17941
sdn;10.220.204.17;4631519;1157879;18112
mlx5_0;sde;3390021;847505;24745   ~98% CPU kworker/u69:3
mlx5_0;sdd;3207512;801878;26153   ~98% CPU kworker/u69:3
mlx4_0;sdc;2998072;749518;27980   ~98% CPU kworker/u68:0

4.7.0_rc3_5edb5649
sdc;10.218.128.17;3260244;815061;25730
sdg;10.218.202.17;3405988;851497;24629
sdh;10.218.203.17;3307419;826854;25363
sdm;10.218.204.17;3430502;857625;24453
sdi;10.219.128.17;3544282;886070;23668
sdj;10.219.202.17;3412083;853020;24585
sdk;10.219.203.17;3422385;855596;24511
sdl;10.219.204.17;3444164;861041;24356
sdb;10.220.128.17;4803646;1200911;17463
sdd;10.220.202.17;4832982;1208245;17357
sde;10.220.203.17;4809430;1202357;17442
sdf;10.220.204.17;4808878;1202219;17444
mlx5_0;sdd;2986864;746716;28085
mlx5_0;sdc;2963648;740912;28305
mlx4_0;sdb;3317228;829307;25288

Thanks,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Tue, Jun 21, 2016 at 8:50 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
> Sagi,
>
> I'm working on setting up SRP (I think I have it all working) to test
> some of the commits. I can try TGT afterwards, along with the commit
> you mention.
> <snip>
Are the CPU usage in the target side at 100%, or the initiator side >> is the bottleneck? >> >> 2. Would it be possible to use another target implementation? TGT maybe? >> >> 3. Can you try testing right before 9679cc51eb13? This is a patch that >> involves data-plane. >> >> 4. Can you try the latest upstream kernel? The iser target code uses >> a generic data-transfer library and I'm interested in knowing what is >> the status there. >> >> Cheers, >> Sagi. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
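For anyone retracing this bisection, here is a minimal sketch of how the "commit before 9679cc51eb13" build referenced above can be reproduced. The checkout target follows from the commit list in this message; the config and install steps are generic assumptions, not taken from the thread.

    # Sketch: build the tree at the parent of the CQ-API conversion, so the
    # kernel under test excludes 9679cc51eb13 and everything after it.
    # Per the message above, this parent is
    # 5adabdd122e471fe978d49471624bab08b5373a7.
    git checkout "9679cc51eb13^"
    make olddefconfig
    make -j"$(nproc)" && sudo make modules_install install
    # Reboot into the new kernel, then re-run the fio tests against each path.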
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFpeL0AkuGW-q5Bmm-dff0UqFOM_sAOaG7=vyqmwnOoTcQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-22  8:18   ` Bart Van Assche
       [not found]   ` <86d4404a-fa6a-72de-8e83-827072c308b5-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2016-06-22  9:52   ` Sagi Grimberg
  1 sibling, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-06-22 8:18 UTC (permalink / raw)
To: Robert LeBlanc, Sagi Grimberg
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

On 06/21/2016 10:26 PM, Robert LeBlanc wrote:
> srpt keeps crashing; couldn't test

If this is reproducible with the latest rc kernel or with any of the
stable kernels, please report this in a separate e-mail, together with
the crash call stack and information about how to reproduce this.

Thanks,

Bart.
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <86d4404a-fa6a-72de-8e83-827072c308b5-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-06-22 12:23   ` Laurence Oberman
  2016-06-22 15:45   ` Robert LeBlanc
  1 sibling, 0 replies; 20+ messages in thread
From: Laurence Oberman @ 2016-06-22 12:23 UTC (permalink / raw)
To: Bart Van Assche
Cc: Robert LeBlanc, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> To: "Robert LeBlanc" <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org>, "Sagi Grimberg" <sagigrim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, "Max Gurtovoy" <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Sent: Wednesday, June 22, 2016 4:18:31 AM
> Subject: Re: Connect-IB not performing as well as ConnectX-3 with iSER
>
> On 06/21/2016 10:26 PM, Robert LeBlanc wrote:
> > srpt keeps crashing; couldn't test
>
> If this is reproducible with the latest rc kernel or with any of the
> stable kernels, please report this in a separate e-mail, together with
> the crash call stack and information about how to reproduce this.
>
> Thanks,
>
> Bart.

Robert,

I am exercising ib_srpt, configured via targetcli/LIO, very heavily in
4.7.0-rc1. I have no crashes or issues. I also had 4.5 running ib_srpt
with no crashes, although I had some other timeouts, etc., depending on
the load.

What sort of crashes are you talking about? Does the system crash, or
does ib_srpt dump a stack?

Thanks
Laurence
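A minimal sketch of how the crash information Bart and Laurence ask for can be collected after a reproduction. These are generic kernel-debugging commands, assumed here rather than taken from the thread:

    # Pull any ib_srpt oops/backtrace out of the kernel log after reproducing.
    dmesg | grep -B 2 -A 30 -i 'ib_srpt\|BUG\|Oops'
    # If the journal is persistent, the log from the boot that crashed:
    journalctl -k -b -1 | tail -n 100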
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <86d4404a-fa6a-72de-8e83-827072c308b5-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2016-06-22 12:23   ` Laurence Oberman
@ 2016-06-22 15:45   ` Robert LeBlanc
  1 sibling, 0 replies; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-22 15:45 UTC (permalink / raw)
To: Bart Van Assche, Laurence Oberman
Cc: Sagi Grimberg, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy, linux-rdma-u79uwXL29TY76Z2rM5mHXA

There is no need to be concerned about srpt crashing in the latest
kernel. srpt only crashed when I was testing kernels in the change set
(7861728..5e47f19) in which I identified the 10-15% iSER performance
drop between the 4.5 and 4.6 kernels. My tests from 4.6 to 4.7-rc3
didn't have a problem with srpt crashing.

The format of the output is as follows:

Kernel_tag_commit

iSER tests, with results in this format:
<dev>;<target IP>;<bandwidth KB/s>;<IOPs>;<execution time ms>
(the last three fields are fields 7, 8, and 9 from fio), e.g.:
sdc;10.218.128.17;3260244;815061;25730

SRP LIO tests:
<IB driver>;<dev>;<bandwidth KB/s>;<IOPs>;<execution time ms>
e.g.:
mlx5_0;sdd;2986864;746716;28085

This is repeated for each kernel tested. On some tests I also
documented the observed CPU utilization of some of the target
processes. In some cases I was lazy, and if the information was the
same for both mlx5 targets, I didn't duplicate it.

For iSER, there are four aliases on each adapter to provide four paths
for each IB port (this is a remnant of some previous multipathing
tests, and now only serves to provide additional data points showing
how repeatable the tests are). 10.218.*.17 and 10.219.*.17 are
generally on the mlx5 ports, while 10.220.*.17 is on the mlx4 port
(some tests had the adapters swapped, but none of these did, and it is
easy to identify them by the grouping).

This test was performed against each path individually. I created an
ext4 filesystem on the device (no partitions), then would mount the
file system on one path, run the test, umount the path, mount the next
path, run the test, etc., so that there is no multipathing confusing
the tests. I am also _NOT_ running the tests on all paths at the same
time using fio. The fio command I'm using is:

fio --rw=read --bs=4K --size=2G --numjobs=40 --name=worker.matt --group_reporting --minimal | cut -d';' -f7,8,9

I hope that clears up the confusion; if not, please ask for more
clarification.

On Jun 22, 2016 2:18 AM, "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org> wrote:
>
> On 06/21/2016 10:26 PM, Robert LeBlanc wrote:
>>
>> srpt keeps crashing; couldn't test
>
> If this is reproducible with the latest rc kernel or with any of the
> stable kernels, please report this in a separate e-mail, together with
> the crash call stack and information about how to reproduce this.
>
> Thanks,
>
> Bart.
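Based on the procedure described above, here is a minimal sketch of the per-path loop. The fio line is the one quoted in the message; the device names and the /mnt/test mount point are illustrative assumptions:

    #!/bin/bash
    # Sketch of the per-path test procedure: one path mounted at a time,
    # no multipathing. Device names and mount point are assumptions.
    for dev in sdc sdd sde; do
        mount "/dev/$dev" /mnt/test
        (cd /mnt/test && fio --rw=read --bs=4K --size=2G --numjobs=40 \
             --name=worker.matt --group_reporting --minimal | cut -d';' -f7,8,9)
        umount /mnt/test
    done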
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFpeL0AkuGW-q5Bmm-dff0UqFOM_sAOaG7=vyqmwnOoTcQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-22  8:18   ` Bart Van Assche
@ 2016-06-22  9:52   ` Sagi Grimberg
  1 sibling, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2016-06-22 9:52 UTC (permalink / raw)
To: Robert LeBlanc
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

> Sagi & Max,
>
> Here are the results of SRP using the same ramdisk backstore that I was
> using for iSER (as close to identical as possible between reboots and
> restoring the targetcli config). I also tested the commit before
> 9679cc51eb13 (5adabdd122e471fe978d49471624bab08b5373a7), which is
> included here. I'm not seeing a correlation between iSER and SRP that
> would lead me to believe that the changes are happening in both
> implementations.
>
> Does this provide enough information for you, or do you think TGT will
> be needed?

I'm a little lost on which test belongs to what; can you specify that
more clearly?

> [full per-kernel results quoted in the previous message snipped]
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
  2016-06-21 20:26   ` Robert LeBlanc
       [not found]   ` <CAANLjFpeL0AkuGW-q5Bmm-dff0UqFOM_sAOaG7=vyqmwnOoTcQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-22 16:21   ` Sagi Grimberg
       [not found]   ` <576ABB1B.4020509-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2016-06-22 16:21 UTC (permalink / raw)
To: Robert LeBlanc, Sagi Grimberg; +Cc: linux-rdma, linux-scsi, Max Gurtovoy

Let me see if I get this right:

> 4.5.0_rc3_1aaa57f5_00399
> sdc;10.218.128.17;4627942;1156985;18126
> sdf;10.218.202.17;4590963;1147740;18272
> sdk;10.218.203.17;4564980;1141245;18376
> sdn;10.218.204.17;4571946;1142986;18348
> sdd;10.219.128.17;4591717;1147929;18269
> sdi;10.219.202.17;4505644;1126411;18618
> sdg;10.219.203.17;4562001;1140500;18388
> sdl;10.219.204.17;4583187;1145796;18303
> sde;10.220.128.17;5511568;1377892;15220
> sdh;10.220.202.17;5515555;1378888;15209
> sdj;10.220.203.17;5609983;1402495;14953
> sdm;10.220.204.17;5509035;1377258;15227

In 1aaa57f5 you get on CIB ~115K IOPs per sd device and on CX3 you get
around 140K IOPs per sd device.

> mlx5_0;sde;3593013;898253;23347 100% CPU kworker/u69:2
> mlx5_0;sdd;3588555;897138;23376 100% CPU kworker/u69:2
> mlx4_0;sdc;3525662;881415;23793 100% CPU kworker/u68:0

Is this on the host or the target?

> 4.5.0_rc5_7861728d_00001
> sdc;10.218.128.17;3747591;936897;22384
> sdf;10.218.202.17;3750607;937651;22366
> sdh;10.218.203.17;3750439;937609;22367
> sdn;10.218.204.17;3771008;942752;22245
> sde;10.219.128.17;3867678;966919;21689
> sdg;10.219.202.17;3781889;945472;22181
> sdk;10.219.203.17;3791804;947951;22123
> sdl;10.219.204.17;3795406;948851;22102
> sdd;10.220.128.17;5039110;1259777;16647
> sdi;10.220.202.17;4992921;1248230;16801
> sdj;10.220.203.17;5015610;1253902;16725
> sdm;10.220.204.17;5087087;1271771;16490

In 7861728d you get on CIB ~95K IOPs per sd device and on CX3 you get
around 125K IOPs per sd device.

I don't see any difference in the code around iser/isert; in fact, I
don't see any commit in drivers/infiniband.

> mlx5_0;sde;2930722;732680;28623 ~98% CPU kworker/u69:0
> mlx5_0;sdd;2910891;727722;28818 ~98% CPU kworker/u69:0
> mlx4_0;sdc;3263668;815917;25703 ~98% CPU kworker/u68:0

Again, host or target?

> 4.5.0_rc5_f81bf458_00018
> sdb;10.218.128.17;5023720;1255930;16698
> sde;10.218.202.17;5016809;1254202;16721
> sdj;10.218.203.17;5021915;1255478;16704
> sdk;10.218.204.17;5021314;1255328;16706
> sdc;10.219.128.17;4984318;1246079;16830
> sdf;10.219.202.17;4986096;1246524;16824
> sdh;10.219.203.17;5043958;1260989;16631
> sdm;10.219.204.17;5032460;1258115;16669
> sdd;10.220.128.17;3736740;934185;22449
> sdg;10.220.202.17;3728767;932191;22497
> sdi;10.220.203.17;3752117;938029;22357
> sdl;10.220.204.17;3763901;940975;22287

In f81bf458 you get on CIB ~125K IOPs per sd device and on CX3 you get
around 93K IOPs per sd device, which is the other way around? CIB is
better than CX3?

The commits in this gap are:
f81bf458208e iser-target: Separate flows for np listeners and connections cma events
aea92980601f iser-target: Add new state ISER_CONN_BOUND to isert_conn
b89a7c25462b iser-target: Fix identification of login rx descriptor type

None of those should affect the data-path.

> srpt keeps crashing; couldn't test
>
> 4.5.0_rc5_5adabdd1_00023
> sdc;10.218.128.17;3726448;931612;22511 ~97% CPU kworker/u69:4
> sdf;10.218.202.17;3750271;937567;22368
> sdi;10.218.203.17;3749266;937316;22374
> sdj;10.218.204.17;3798844;949711;22082
> sde;10.219.128.17;3759852;939963;22311 ~97% CPU kworker/u69:4
> sdg;10.219.202.17;3772534;943133;22236
> sdl;10.219.203.17;3769483;942370;22254
> sdn;10.219.204.17;3790604;947651;22130
> sdd;10.220.128.17;5171130;1292782;16222 ~96% CPU kworker/u68:3
> sdh;10.220.202.17;5105354;1276338;16431
> sdk;10.220.203.17;4995300;1248825;16793
> sdm;10.220.204.17;4959564;1239891;16914

In 5adabdd1 you get on CIB ~94K IOPs per sd device and on CX3 you get
around 130K IOPs per sd device, which means you flipped again (very
strange).

The commits in this gap are:
5adabdd122e4 iser-target: Split and properly type the login buffer
ed1083b251f0 iser-target: Remove ISER_RECV_DATA_SEG_LEN
26c7b673db57 iser-target: Remove impossible condition from isert_wait_conn
69c48846f1c7 iser-target: Remove redundant wait in release_conn
6d1fba0c2cc7 iser-target: Rework connection termination

Again, none are suspected to implicate the data-plane.

> srpt crashes
>
> 4.5.0_rc5_07b63196_00027
> sdb;10.218.128.17;3606142;901535;23262
> sdg;10.218.202.17;3570988;892747;23491
> sdf;10.218.203.17;3576011;894002;23458
> sdk;10.218.204.17;3558113;889528;23576
> sdc;10.219.128.17;3577384;894346;23449
> sde;10.219.202.17;3575401;893850;23462
> sdj;10.219.203.17;3567798;891949;23512
> sdl;10.219.204.17;3584262;896065;23404
> sdd;10.220.128.17;4430680;1107670;18933
> sdh;10.220.202.17;4488286;1122071;18690
> sdi;10.220.203.17;4487326;1121831;18694
> sdm;10.220.204.17;4441236;1110309;18888

In 07b63196 you get on CIB ~89K IOPs per sd device and on CX3 you get
around 112K IOPs per sd device.

The commits in this gap are:
e3416ab2d156 iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc
d1ca2ed7dcf8 iser-target: Kill struct isert_rdma_wr
9679cc51eb13 iser-target: Convert to new CQ API

These do affect the data-path, but nothing that can explain a specific
CIB issue. Moreover, the perf drop happened before that.

> srpt crashes
>
> 4.5.0_rc5_5e47f198_00036
> sdb;10.218.128.17;3519597;879899;23834
> sdi;10.218.202.17;3512229;878057;23884
> sdh;10.218.203.17;3518563;879640;23841
> sdk;10.218.204.17;3582119;895529;23418
> sdd;10.219.128.17;3550883;887720;23624
> sdj;10.219.202.17;3558415;889603;23574
> sde;10.219.203.17;3552086;888021;23616
> sdl;10.219.204.17;3579521;894880;23435
> sdc;10.220.128.17;4532912;1133228;18506
> sdf;10.220.202.17;4558035;1139508;18404
> sdg;10.220.203.17;4601035;1150258;18232
> sdm;10.220.204.17;4548150;1137037;18444

Same results, and no commit added, so that makes sense.

> srpt crashes
>
> 4.6.2 vanilla default config
> sde;10.218.128.17;3431063;857765;24449
> sdf;10.218.202.17;3360685;840171;24961
> sdi;10.218.203.17;3355174;838793;25002
> sdm;10.218.204.17;3360955;840238;24959
> sdd;10.219.128.17;3337288;834322;25136
> sdh;10.219.202.17;3327492;831873;25210
> sdj;10.219.203.17;3380867;845216;24812
> sdk;10.219.204.17;3418340;854585;24540
> sdc;10.220.128.17;4668377;1167094;17969
> sdg;10.220.202.17;4716675;1179168;17785
> sdl;10.220.203.17;4675663;1168915;17941
> sdn;10.220.204.17;4631519;1157879;18112
>
> mlx5_0;sde;3390021;847505;24745 ~98% CPU kworker/u69:3
> mlx5_0;sdd;3207512;801878;26153 ~98% CPU kworker/u69:3
> mlx4_0;sdc;2998072;749518;27980 ~98% CPU kworker/u68:0
>
> 4.7.0_rc3_5edb5649
> sdc;10.218.128.17;3260244;815061;25730
> sdg;10.218.202.17;3405988;851497;24629
> sdh;10.218.203.17;3307419;826854;25363
> sdm;10.218.204.17;3430502;857625;24453
> sdi;10.219.128.17;3544282;886070;23668
> sdj;10.219.202.17;3412083;853020;24585
> sdk;10.219.203.17;3422385;855596;24511
> sdl;10.219.204.17;3444164;861041;24356
> sdb;10.220.128.17;4803646;1200911;17463
> sdd;10.220.202.17;4832982;1208245;17357
> sde;10.220.203.17;4809430;1202357;17442
> sdf;10.220.204.17;4808878;1202219;17444

Here there is a new rdma_rw API, which doesn't make a difference in
performance (but no improvement either).

------------------

So, all in all, I still don't know what the root cause can be here.

You mentioned that you are running fio over a filesystem. Is it
possible to run your tests directly over the block devices? And can
you run fio with direct I/O?

Also, iser, srp, and other RDMA ULPs are usually sensitive to the IRQ
assignments of the HCA. An incorrect IRQ affinity assignment might
bring all sorts of noise to performance tests. The normal practice to
get the most out of the HCA is to spread the IRQ assignments linearly
across all CPUs (https://community.mellanox.com/docs/DOC-1483). Did
you perform any steps to spread IRQ interrupts? Is the irqbalance
daemon on?

It would be good to try and isolate the drop and make sure it is real
and not randomly generated due to some noise in the form of IRQ
assignments.
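For reference, a minimal sketch of the linear IRQ spreading Sagi describes. The "mlx" match pattern is an assumption (adjust it to the IRQ names shown in /proc/interrupts); the Mellanox document linked above ships dedicated set_irq_affinity scripts for this:

    #!/bin/bash
    # Sketch: pin each mlx4/mlx5 completion-vector IRQ to one CPU,
    # round-robin across all online CPUs.
    systemctl stop irqbalance 2>/dev/null  # keep irqbalance from undoing the pinning
    ncpus=$(nproc)
    cpu=0
    for irq in $(awk -F: '/mlx/ {gsub(/ /,"",$1); print $1}' /proc/interrupts); do
        echo "$cpu" > "/proc/irq/$irq/smp_affinity_list"
        cpu=$(( (cpu + 1) % ncpus ))
    done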
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <576ABB1B.4020509-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-06-22 17:46   ` Robert LeBlanc
       [not found]   ` <CAANLjFqp8qStMCtcEjsoprfpD1=qnYguKU5+8rL9pkYwHv4PKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-22 17:46 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

Sagi,

Yes, you are understanding the data correctly, and that is what I'm
seeing. I think you are also seeing the confusion that I've been
running into while trying to figure this out. As for your questions
about SRP: the performance data is from the initiator, and the CPU
info is from the target (all fio threads on the initiator had low CPU
utilization).

I spent a good day tweaking the IRQ assignments (spreading IRQs to all
cores, spreading to all cores on the NUMA node the card is attached
to, and spreading to all non-hyperthreaded cores on the NUMA node).
None of these provided any substantial gains or detriments (irqbalance
was not running). I don't know if there is IRQ steering going on, but
in some cases, with irqbalance not running, the IRQs would get pinned
back to the previous core(s) and I'd have to set them again. I did not
use the Mellanox scripts; I just did it by hand based on the
documents/scripts. I also offlined all cores on the second NUMA node,
which didn't help either. I got more performance gains with nomerges
(1 or 2 provided about the same gain, 2 slightly more) and the queue.
It seems that something in 1aaa57f5 was going right, as both cards
performed very well without needing any IRQ fudging.

I understand that there are many moving parts to try to figure this
out; it could be anywhere in the IB drivers, LIO, or even the SCSI
subsystems, the RAM disk implementation, or the file system. However,
since the performance is bouncing between cards, it seems unlikely to
be something common to both (except when both cards show a loss/gain),
but as you mentioned, there doesn't seem to be any rhyme or reason to
the shifts.

I haven't been using the raw block device in these tests; before, when
I did, once one thread had read the data, any other thread reading the
same block got it from the page cache, invalidating the test. I could
only saturate the path/port with highly threaded jobs, so I may have
to partition out the disk for block testing. When I ran the tests
using direct I/O, the performance was far lower and it was harder for
me to know when I was reaching the theoretical max of the
card/links/PCIe. I may just have my scripts run the three tests in
succession.

Thanks for looking at this. Please let me know what you think would be
most helpful so that I'm making the best use of your and my time.

Thanks,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Wed, Jun 22, 2016 at 10:21 AM, Sagi Grimberg <sagi-ImC7XgPzLAfvYQKSrp0J2Q@public.gmane.org> wrote:
> [full analysis quoted in the previous message snipped]
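A sketch of the buffered-versus-direct comparison being discussed. The device name is an assumption; with direct=0 and overlapping jobs, rereads are served from the initiator's page cache, which is exactly the effect Robert describes:

    # Buffered: 40 jobs over the same 2G region; repeated reads hit the page cache.
    fio --name=buffered --filename=/dev/sdc --rw=read --bs=4k --size=2G \
        --numjobs=40 --group_reporting --minimal | cut -d';' -f7,8,9
    # Direct: bypasses the page cache, so every read is served by the iSER target.
    fio --name=direct --filename=/dev/sdc --rw=read --bs=4k --size=2G \
        --numjobs=40 --direct=1 --group_reporting --minimal | cut -d';' -f7,8,9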
* Re: Connect-IB not performing as well as ConnectX-3 with iSER
       [not found] ` <CAANLjFqp8qStMCtcEjsoprfpD1=qnYguKU5+8rL9pkYwHv4PKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-24 18:34   ` Robert LeBlanc
  0 siblings, 0 replies; 20+ messages in thread
From: Robert LeBlanc @ 2016-06-24 18:34 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy

Sagi,

Here is an example of the different types of tests. This was only on
one kernel. The first two runs set a baseline. The lines starting with
"buffer" are fio with direct=0, the lines starting with "direct" are
fio with direct=1, and the lines starting with "block" are fio running
against a raw block device (technically 40 partitions on a single
drive) with direct=0. I also reduced the tests to one path per port
instead of four like before.

# /root/run_path_tests.sh check-paths
#### Test all iSER paths individually ####
4.5.0-rc5-5adabdd1-00023-g5adabdd
buffer;sdc;10.218.128.17;3815778;953944;21984
buffer;sdd;10.219.128.17;3743744;935936;22407
buffer;sde;10.220.128.17;4915392;1228848;17066
direct;sdc;10.218.128.17;876644;219161;95690
direct;sdd;10.219.128.17;881684;220421;95143
direct;sde;10.220.128.17;892215;223053;94020
block;sdc;10.218.128.17;3890459;972614;21562
block;sdd;10.219.128.17;4127642;1031910;20323
block;sde;10.220.128.17;4939705;1234926;16982

# /root/run_path_tests.sh check-paths
#### Test all iSER paths individually ####
4.5.0-rc5-5adabdd1-00023-g5adabdd
buffer;sdc;10.218.128.17;3983572;995893;21058
buffer;sdd;10.219.128.17;3774231;943557;22226
buffer;sde;10.220.128.17;4856204;1214051;17274
direct;sdc;10.218.128.17;875820;218955;95780
direct;sdd;10.219.128.17;884072;221018;94886
direct;sde;10.220.128.17;902486;225621;92950
block;sdc;10.218.128.17;3790433;947608;22131
block;sdd;10.219.128.17;3860025;965006;21732
block;sde;10.220.128.17;4946404;1236601;16959

For the following test, I set the IRQs on the initiator using
mlx_tune -p HIGH_THROUGHPUT with irqbalance disabled.

# /root/run_path_tests.sh check-paths
#### Test all iSER paths individually ####
4.5.0-rc5-5adabdd1-00023-g5adabdd
buffer;sdc;10.218.128.17;3742742;935685;22413
buffer;sdd;10.219.128.17;3786327;946581;22155
buffer;sde;10.220.128.17;5009619;1252404;16745
direct;sdc;10.218.128.17;871942;217985;96206
direct;sdd;10.219.128.17;883467;220866;94951
direct;sde;10.220.128.17;901138;225284;93089
block;sdc;10.218.128.17;3911319;977829;21447
block;sdd;10.219.128.17;3758168;939542;22321
block;sde;10.220.128.17;4968377;1242094;16884

For the following test, I also set the IRQs on the target using
mlx_tune -p HIGH_THROUGHPUT and disabled irqbalance.

# /root/run_path_tests.sh check-paths
#### Test all iSER paths individually ####
4.5.0-rc5-5adabdd1-00023-g5adabdd
buffer;sdc;10.218.128.17;3804357;951089;22050
buffer;sdd;10.219.128.17;3767113;941778;22268
buffer;sde;10.220.128.17;4966612;1241653;16890
direct;sdc;10.218.128.17;879742;219935;95353
direct;sdd;10.219.128.17;886641;221660;94611
direct;sde;10.220.128.17;886857;221714;94588
block;sdc;10.218.128.17;3760864;940216;22305
block;sdd;10.219.128.17;3763564;940891;22289
block;sde;10.220.128.17;4965436;1241359;16894

It seems that mlx_tune helps marginally, but it doesn't provide
anything groundbreaking.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Wed, Jun 22, 2016 at 11:46 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
> [previous reply and quoted analysis snipped]
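run_path_tests.sh itself is not posted in the thread; the following is a speculative sketch of what its check-paths mode appears to do, reconstructed from the three output prefixes above. All device names, IPs, paths, and the single block partition are assumptions (Robert's version used 40 partitions):

    #!/bin/bash
    # Sketch of a check-paths style script: for each path, run the same fio
    # workload three ways and emit "<mode>;<dev>;<ip>;<bw KB/s>;<iops>;<runtime ms>".
    echo "#### Test all iSER paths individually ####"
    uname -r
    for pair in sdc:10.218.128.17 sdd:10.219.128.17 sde:10.220.128.17; do
        dev=${pair%%:*}; ip=${pair##*:}
        mount "/dev/$dev" /mnt/test
        res=$(fio --name=worker --directory=/mnt/test --rw=read --bs=4K --size=2G \
              --numjobs=40 --group_reporting --minimal | cut -d';' -f7,8,9)
        echo "buffer;$dev;$ip;$res"
        res=$(fio --name=worker --directory=/mnt/test --rw=read --bs=4K --size=2G \
              --numjobs=40 --direct=1 --group_reporting --minimal | cut -d';' -f7,8,9)
        echo "direct;$dev;$ip;$res"
        umount /mnt/test
        res=$(fio --name=worker --filename="/dev/${dev}1" --rw=read --bs=4K --size=2G \
              --numjobs=40 --group_reporting --minimal | cut -d';' -f7,8,9)
        echo "block;$dev;$ip;$res"
    done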
end of thread

Thread overview: 20+ messages
2016-06-06 22:36 Connect-IB not performing as well as ConnectX-3 with iSER Robert LeBlanc
     [not found] ` <CAANLjFoL5zow4f4RXP5t8LM7wsWN1OQ-hD2mtPUBTLkJ7UZ5kA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-07 12:02 ` Max Gurtovoy
     [not found] ` <5756B7D2.5040009-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-07 16:48 ` Robert LeBlanc
     [not found] ` <CAANLjFq4CoOSbng=aPHiSsFB=1HMSwAhhLiCjt+88dzz24OT9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-07 22:37 ` Robert LeBlanc
     [not found] ` <CAANLjFoLJNQWtHHqjHmhc0iBq14NAV_GgkbyQabjzyeN56t+Ow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-08 13:52 ` Max Gurtovoy
     [not found] ` <57582336.10407-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-08 15:33 ` Robert LeBlanc
  2016-06-10 21:36 ` Robert LeBlanc
     [not found] ` <CAANLjFrv-0VArTEkgqbrhzFjn1fg_egpCJuQZnAurVrHjbL_qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-20 15:23 ` Robert LeBlanc
     [not found] ` <CAANLjFqoV-5HK0c+LdEbuxd81Vm=g=WE3cQgp47dH-yfYjZjGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-20 21:27 ` Max Gurtovoy
     [not found] ` <3646a0c9-3f2d-66b8-c4da-c91ca1d01cee-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-20 21:52 ` Robert LeBlanc
  2016-06-21 13:08 ` Sagi Grimberg
     [not found] ` <57693C6A.3020805-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2016-06-21 14:50 ` Robert LeBlanc
     [not found] ` <CAANLjFpUyAYB+ZzMwFKBpa4yLmALPzcRGJX1kExVrLARZmZRkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-21 20:26 ` Robert LeBlanc
     [not found] ` <CAANLjFpeL0AkuGW-q5Bmm-dff0UqFOM_sAOaG7=vyqmwnOoTcQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-22  8:18 ` Bart Van Assche
     [not found] ` <86d4404a-fa6a-72de-8e83-827072c308b5-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2016-06-22 12:23 ` Laurence Oberman
  2016-06-22 15:45 ` Robert LeBlanc
  2016-06-22  9:52 ` Sagi Grimberg
  2016-06-22 16:21 ` Sagi Grimberg
     [not found] ` <576ABB1B.4020509-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  2016-06-22 17:46 ` Robert LeBlanc
     [not found] ` <CAANLjFqp8qStMCtcEjsoprfpD1=qnYguKU5+8rL9pkYwHv4PKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-06-24 18:34 ` Robert LeBlanc