From: "Sun, Mingbao" <Tyler.Sun@dell.com>
To: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Cc: "Sun, Ao" <Ao.Sun@dell.com>, "Gan, Ping" <Ping.Gan@dell.com>,
"Zhang, Libin" <Libin.Zhang@dell.com>,
"Cai, Yanxiu" <Yanxiu.Cai@dell.com>
Subject: [Bug Report] Limitation on number of QPs for single process observed
Date: Fri, 24 Jul 2020 11:04:57 +0000
Message-ID: <BN8PR19MB260932E68B6399764F79CDA6F5770@BN8PR19MB2609.namprd19.prod.outlook.com>
Hi,
System information:
HOST lehi-dirt (server side)
lehi-dirt:~ # cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"
lehi-dirt:~ # uname -r
4.12.14-95.48-default
lehi-dirt:~ # ibv_devinfo
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 14.26.6000
node_guid: 506b:4b03:00b1:7cae
sys_image_guid: 506b:4b03:00b1:7cae
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: DEL2810000034
phys_port_cnt: 1
Device ports:
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
HOST murray-dirt (client side)
murray-dirt:~ # cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"
murray-dirt:~ # uname -r
4.12.14-95.48-default
murray-dirt:~ # ibv_devinfo
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 14.26.6000
node_guid: 506b:4b03:00b1:7ca6
sys_image_guid: 506b:4b03:00b1:7ca6
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: DEL2810000034
phys_port_cnt: 1
Device ports:
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
Steps to reproduce the bug:
Use a single user-space process as the RDMA client to create more than 339 QPs (via the rdma_connect API from librdmacm) to a given RDMA server.
The problem we found is that only 339 QPs can be created.
When creating the 340th QP, rdma_create_ep fails ("Cannot allocate memory") on the client side.
Below are some logs generated by our test tool, ib_perf.exe:
(1) 1 server and 1 client: the client cannot create the 340th QP; it fails at rdma_create_ep ("Cannot allocate memory").
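The client-side loop can be sketched roughly as follows. This is not the actual ib_perf.exe source; it is a minimal reconstruction of the rdma_getaddrinfo / rdma_create_ep / rdma_connect sequence librdmacm clients typically use, with the address, port, and QP capacities assumed from the logs below. It needs RDMA-capable hardware to run, so error handling is kept short:

```c
/* Hypothetical sketch of the client QP-creation loop (not ib_perf.exe itself). */
#include <stdio.h>
#include <string.h>
#include <rdma/rdma_cma.h>

int main(void)
{
    struct rdma_addrinfo hints, *res;
    struct ibv_qp_init_attr attr;
    struct rdma_cm_id *id;
    int i, ret;

    for (i = 0; i < 1024; i++) {
        memset(&hints, 0, sizeof(hints));
        hints.ai_port_space = RDMA_PS_TCP;
        ret = rdma_getaddrinfo("192.168.219.7", "10001", &hints, &res);
        if (ret) { perror("rdma_getaddrinfo"); return 1; }

        memset(&attr, 0, sizeof(attr));
        attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;   /* assumed capacities */
        attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
        attr.qp_type = IBV_QPT_RC;

        /* This is the call that fails with "Cannot allocate memory"
         * for the 340th QP in scenario (1) below. */
        ret = rdma_create_ep(&id, res, NULL, &attr);
        rdma_freeaddrinfo(res);
        if (ret) { perror("rdma_create_ep"); return 1; }

        ret = rdma_connect(id, NULL);
        if (ret) { perror("rdma_connect"); return 1; }
        printf("qp [%d] created\n", i);
    }
    return 0;
}
```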
lehi-dirt:/home/admin/NVMe_OF_test # ib_perf.exe --server-ip 192.168.219.7 --server-port 10001 -s --qp-num 1024
qp [0] local 192.168.219.7:10001 peer 192.168.219.8:42196 created.
qp [1] local 192.168.219.7:10001 peer 192.168.219.8:50411 created.
qp [2] local 192.168.219.7:10001 peer 192.168.219.8:44152 created.
......
qp [337] local 192.168.219.7:10001 peer 192.168.219.8:46325 created.
qp [338] local 192.168.219.7:10001 peer 192.168.219.8:60163 created.
murray-dirt:/home/admin # ib_perf.exe --server-ip 192.168.219.7 --server-port 10001 -c --qp-num 1024
qp [0] local 192.168.219.8:42196 peer 192.168.219.7:10001 created.
qp [1] local 192.168.219.8:50411 peer 192.168.219.7:10001 created.
qp [2] local 192.168.219.8:44152 peer 192.168.219.7:10001 created.
......
qp [337] local 192.168.219.8:46325 peer 192.168.219.7:10001 created.
qp [338] local 192.168.219.8:60163 peer 192.168.219.7:10001 created.
ERR_DBG:/mnt/linux-dev-framework-master/apps/ib_perf/perf_frmwk.c(599)-create_connections_client:
rdma_create_ep failed: Cannot allocate memory
(2) 1 server and 2 clients: the server cannot create the 340th QP; it fails at rdma_get_request ("Cannot allocate memory"),
even though rdma_create_ep succeeded on the client side for the 340th QP.
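For reference, a minimal sketch of the server-side accept loop (again an assumed reconstruction, not the ib_perf.exe source; port and capacities taken from the logs). rdma_get_request is the point where the server hits "Cannot allocate memory" in this scenario:

```c
/* Hypothetical sketch of the server accept loop (not ib_perf.exe itself). */
#include <stdio.h>
#include <string.h>
#include <rdma/rdma_cma.h>

int main(void)
{
    struct rdma_addrinfo hints, *res;
    struct ibv_qp_init_attr attr;
    struct rdma_cm_id *listen_id, *id;
    int i, ret;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = RAI_PASSIVE;          /* listen on the local address */
    hints.ai_port_space = RDMA_PS_TCP;
    ret = rdma_getaddrinfo(NULL, "10001", &hints, &res);
    if (ret) { perror("rdma_getaddrinfo"); return 1; }

    memset(&attr, 0, sizeof(attr));
    attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;   /* assumed capacities */
    attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
    attr.qp_type = IBV_QPT_RC;

    ret = rdma_create_ep(&listen_id, res, NULL, &attr);
    rdma_freeaddrinfo(res);
    if (ret) { perror("rdma_create_ep"); return 1; }

    if (rdma_listen(listen_id, 1024)) { perror("rdma_listen"); return 1; }

    for (i = 0; i < 1024; i++) {
        /* In scenario (2), this fails with "Cannot allocate memory"
         * when the 340th connection request arrives. */
        if (rdma_get_request(listen_id, &id)) { perror("rdma_get_request"); return 1; }
        if (rdma_accept(id, NULL)) { perror("rdma_accept"); return 1; }
        printf("qp [%d] created\n", i);
    }
    return 0;
}
```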
lehi-dirt:~ # ib_perf.exe --server-ip 192.168.219.7 --server-port 10001 -s --qp-num 1024
qp [0] local 192.168.219.7:10001 peer 192.168.219.8:37360 created.
qp [1] local 192.168.219.7:10001 peer 192.168.219.8:35951 created.
......
qp [337] local 192.168.219.7:10001 peer 192.168.219.8:50314 created.
qp [338] local 192.168.219.7:10001 peer 192.168.219.8:42648 created.
ERR_DBG:/mnt/linux-dev-framework-master/apps/ib_perf/perf_frmwk.c(515)-create_connections_server:
rdma_get_request: Cannot allocate memory
murray-dirt:/home/admin # ib_perf.exe --server-ip 192.168.219.7 --server-port 10001 -c --qp-num 200
qp [0] local 192.168.219.8:37360 peer 192.168.219.7:10001 created.
qp [1] local 192.168.219.8:35951 peer 192.168.219.7:10001 created.
......
qp [198] local 192.168.219.8:59660 peer 192.168.219.7:10001 created.
qp [199] local 192.168.219.8:48077 peer 192.168.219.7:10001 created.
200 connection(s) created in total
murray-dirt:/home/admin # ib_perf.exe --server-ip 192.168.219.7 --server-port 10001 -c --qp-num 200
qp [0] local 192.168.219.8:45772 peer 192.168.219.7:10001 created.
qp [1] local 192.168.219.8:58067 peer 192.168.219.7:10001 created.
......
qp [137] local 192.168.219.8:50314 peer 192.168.219.7:10001 created.
qp [138] local 192.168.219.8:42648 peer 192.168.219.7:10001 created.
ERR_DBG:/mnt/linux-dev-framework-master/apps/ib_perf/perf_frmwk.c(630)-create_connections_client:
rdma_connect: Connection refused
(3) The NVMe-oF target runs as the server and two ib_perf.exe instances run as clients (each creating 200 QPs): all connections succeed.
murray-dirt:/home/admin # ib_perf.exe --server-ip 169.254.85.7 --server-port 4420 -c --qp-num 200
qp [0] local 169.254.85.8:53907 peer 169.254.85.7:4420 created.
qp [1] local 169.254.85.8:57988 peer 169.254.85.7:4420 created.
......
qp [198] local 169.254.85.8:58852 peer 169.254.85.7:4420 created.
qp [199] local 169.254.85.8:33436 peer 169.254.85.7:4420 created.
200 connection(s) created in total
murray-dirt:/home/admin # ib_perf.exe --server-ip 169.254.85.7 --server-port 4420 -c --qp-num 200
qp [0] local 169.254.85.8:50105 peer 169.254.85.7:4420 created.
qp [1] local 169.254.85.8:44136 peer 169.254.85.7:4420 created.
......
qp [198] local 169.254.85.8:53581 peer 169.254.85.7:4420 created.
qp [199] local 169.254.85.8:50082 peer 169.254.85.7:4420 created.
200 connection(s) created in total
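As a diagnostic data point, the device's advertised QP limit can be read with ibv_query_device (it is also shown by ibv_devinfo -v as max_qp). The sketch below is only a suggestion for narrowing the problem down: if max_qp is far above 339, the limit observed here would come from somewhere other than the device capability itself.

```c
/* Query the first RDMA device's max_qp attribute (diagnostic sketch). */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list || num == 0) { fprintf(stderr, "no RDMA devices found\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(list[0]);
    struct ibv_device_attr attr;
    if (ctx && !ibv_query_device(ctx, &attr))
        printf("%s: max_qp = %d\n", ibv_get_device_name(list[0]), attr.max_qp);

    if (ctx) ibv_close_device(ctx);
    ibv_free_device_list(list);
    return 0;
}
```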
Thanks,
Tyler