From: Shoaib Rao <rao.shoaib@oracle.com>
To: Zhu Yanjun <zyjzyj2000@gmail.com>
Cc: RDMA mailing list <linux-rdma@vger.kernel.org>,
Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [PATCH v2 1/1] RDMA/rxe: Bump up default maximum values used via uverbs
Date: Sun, 18 Jul 2021 16:00:02 -0700 [thread overview]
Message-ID: <2b8a485d-d9bb-1579-613c-c32f580f3f2a@oracle.com> (raw)
In-Reply-To: <74c62daa-9f20-cb51-ff6a-fb7def78b444@oracle.com>
Thanks for reporting the issue. I was able to reproduce the issue and
have fixed it. Kindly try gain.
Regards,
Shoaib
On 7/18/21 12:46 PM, Shoaib Rao wrote:
> Your urls are garbled, so I can not make any sense out of them.
>
> [root@ca-dev141 linux]# git remote --v
> origin
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> (fetch)
> origin
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (push)
>
> [root@ca-dev141 linux]# git show 8c1b4316c3fa
> fatal: ambiguous argument '8c1b4316c3fa': unknown revision or path not
> in the working tree.
>
> [root@ca-dev141 linux]# git log --grep='RDMA/efa: Split hardware'
> [root@ca-dev141 linux]#
>
> [root@ca-dev141 linux]# git describe
> v5.14-rc1-17-gc06676b5953b
>
> You need to do a little bit more work on your end and tell me which
> value rxe is complaining about. Also provide a kernel version that you
> are using as I have provided above.
>
> Shoaib
>
> On 7/18/21 12:59 AM, Zhu Yanjun wrote:
>> On Sat, Jul 17, 2021 at 2:09 AM Shoaib Rao <rao.shoaib@oracle.com>
>> wrote:
>>>
>>> On 7/15/21 9:07 PM, Zhu Yanjun wrote:
>>>
>>> On Fri, Jul 16, 2021 at 12:44 AM Shoaib Rao <rao.shoaib@oracle.com>
>>> wrote:
>>>
>>> Following is a link
>>>
>>> https://urldefense.com/v3/__https://marc.info/?l=linux-rdma&m=162395437604846&w=2__;!!ACWV5N9M2RV99hQ!f9TPWtgUxambtSeQ_L3h-IH7CW3SifyiumB3kjc2v_w6Ec_WYVtjWyusEtgQ60iw$
>>>
>>>
>>> Or just search for my name in the archive.
>>>
>>> Do you see any issues with this value?
>>>
>>> After this commit is applied, I confronted the following problem
>>> "
>>> [ 639.943561] rdma_rxe: unloaded
>>> [ 679.717143] rdma_rxe: loaded
>>> [ 691.721055] rdma_rxe: not enough indices for max_elem
>>> "
>>> Not sure if this problem is introduced by this commit. Please help to
>>> check this problem.
>>> Thanks a lot.
>>>
>>> Zhu Yanjun
>>>
>>> I do not see the issue.
>>>
>>> [root@ca-dev66 rxe]# lsmod | grep rdma_rxe
>>> rdma_rxe 126976 0
>>> ip6_udp_tunnel 16384 1 rdma_rxe
>>> udp_tunnel 20480 1 rdma_rxe
>>> ib_uverbs 147456 3 rdma_rxe,rdma_ucm,mlx5_ib
>>> ib_core 364544 10
>>> rdma_cm,ib_ipoib,rdma_rxe,rds_rdma,iw_cm,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
>>>
>>> I am using gdb to dump the values
>>>
>>> (gdb) ptype RXE_MAX_INLINE_DATA
>>> type = enum rxe_device_param {RXE_MAX_MR_SIZE = -1,
>>> RXE_PAGE_SIZE_CAP = 4294963200, RXE_MAX_QP_WR = 1048576,
>>> RXE_DEVICE_CAP_FLAGS = 137466362998, RXE_MAX_SGE = 32,
>>> RXE_MAX_WQE_SIZE = 720, RXE_MAX_INLINE_DATA = 512,
>>> RXE_MAX_SGE_RD = 32,
>>> RXE_MAX_CQ = 1048576, RXE_MAX_LOG_CQE = 15, RXE_MAX_PD = 1048576,
>>> RXE_MAX_QP_RD_ATOM = 128, RXE_MAX_RES_RD_ATOM = 258048,
>>> RXE_MAX_QP_INIT_RD_ATOM = 128, RXE_MAX_MCAST_GRP = 8192,
>>> RXE_MAX_MCAST_QP_ATTACH = 56, RXE_MAX_TOT_MCAST_QP_ATTACH =
>>> 458752,
>>> RXE_MAX_AH = 1048576, RXE_MAX_SRQ_WR = 1048576, RXE_MIN_SRQ_WR
>>> = 1,
>>> RXE_MAX_SRQ_SGE = 27, RXE_MIN_SRQ_SGE = 1,
>>> RXE_MAX_FMR_PAGE_LIST_LEN = 512, RXE_MAX_PKEYS = 64,
>>> RXE_LOCAL_CA_ACK_DELAY = 15, RXE_MAX_UCONTEXT = 1048576,
>>> RXE_NUM_PORT = 1,
>>> RXE_MAX_QP = 1048576, RXE_MIN_QP_INDEX = 16, RXE_MAX_QP_INDEX =
>>> 262144,
>>> RXE_MAX_SRQ = 1048576, RXE_MIN_SRQ_INDEX = 131073,
>>> RXE_MAX_SRQ_INDEX = 262144, RXE_MAX_MR = 1048576, RXE_MAX_MW =
>>> 4096,
>>> RXE_MIN_MR_INDEX = 1, RXE_MAX_MR_INDEX = 262144,
>>> RXE_MIN_MW_INDEX = 65537,
>>> RXE_MAX_MW_INDEX = 131072, RXE_MAX_PKT_PER_ACK = 64,
>>> RXE_MAX_UNACKED_PSNS = 128, RXE_INFLIGHT_SKBS_PER_QP_HIGH = 64,
>>> RXE_INFLIGHT_SKBS_PER_QP_LOW = 16, RXE_NSEC_ARB_TIMER_DELAY = 200,
>>> RXE_VENDOR_ID = 16777215}
>>>
>>> It is possible that you are changing some values.
>>
>> I git clone the source code from
>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git__;!!ACWV5N9M2RV99hQ!efPsL9fikzm6Ob1-zsmbsAzPkfqRpxYkp9oJXF9csPOJgoei9njj8RmA5IyuH447$
>> again.
>> And the first 3 commits are as below. One is your commit.
>> "
>> 6163ad3208c8 (HEAD -> for-next) RDMA/rxe: Bump up default maximum
>> values used via uverbs
>> 8c1b4316c3fa (origin/for-next, origin/HEAD) RDMA/efa: Split hardware
>> stats to device and port stats
>> 916071185b17 MAINTAINERS: Update maintainers of HiSilicon RoCE
>> "
>>
>> And when I tried to add a new rxe0,
>> "
>> error: Invalid argument
>> "
>> And the dmesg logs are as below:
>>
>> # dmesg
>> "
>> [ 70.782302] e1000: enp0s8 NIC Link is Up 1000 Mbps Full Duplex,
>> Flow Control: RX
>> [ 70.782744] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s8: link becomes ready
>> [ 79.467652] rdma_rxe: loaded
>> [ 79.468348] rdma_rxe: not enough indices for max_elem
>> [ 79.468356] rdma_rxe: failed to add enp0s8
>> "
>>
>> Please follow my steps to reproduce this problem again.
>>
>> Zhu Yanjun
>>
>>> Shoaib
>>>
>>> Shoaib
>>>
>>> On 7/14/21 10:02 PM, Zhu Yanjun wrote:
>>>
>>> On Wed, Jul 14, 2021 at 4:36 PM Rao Shoaib <Rao.Shoaib@oracle.com>
>>> wrote:
>>>
>>> From: Rao Shoaib <rshoaib@ca-dev141.us.oracle.com>
>>>
>>> In our internal testing we have found that the
>>> current maximum are too smalls. Ideally there should
>>> be no limits but currently maximum values are reported
>>> via ibv_query_device, so we have to keep maximum values
>>> but they have been made suffiently large.
>>>
>>> Resubmitting after fixing an issue reported by test robot.
>>>
>>> Reported-by: kernel test robot <lkp@intel.com>
>>>
>>> Signed-off-by: Rao Shoaib <rshoaib@ca-dev141.us.oracle.com>
>>> ---
>>> drivers/infiniband/sw/rxe/rxe_param.h | 26
>>> ++++++++++++++------------
>>> 1 file changed, 14 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_param.h
>>> b/drivers/infiniband/sw/rxe/rxe_param.h
>>> index 742e6ec93686..092dbff890f2 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_param.h
>>> +++ b/drivers/infiniband/sw/rxe/rxe_param.h
>>> @@ -9,6 +9,8 @@
>>>
>>> #include <uapi/rdma/rdma_user_rxe.h>
>>>
>>> +#define DEFAULT_MAX_VALUE (1 << 20)
>>>
>>> Can you let me know the link in which the above value is discussed?
>>>
>>> Thanks,
>>> Zhu Yanjun
>>>
>>> +
>>> static inline enum ib_mtu rxe_mtu_int_to_enum(int mtu)
>>> {
>>> if (mtu < 256)
>>> @@ -37,7 +39,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int
>>> mtu)
>>> enum rxe_device_param {
>>> RXE_MAX_MR_SIZE = -1ull,
>>> RXE_PAGE_SIZE_CAP = 0xfffff000,
>>> - RXE_MAX_QP_WR = 0x4000,
>>> + RXE_MAX_QP_WR = DEFAULT_MAX_VALUE,
>>> RXE_DEVICE_CAP_FLAGS = IB_DEVICE_BAD_PKEY_CNTR
>>> | IB_DEVICE_BAD_QKEY_CNTR
>>> | IB_DEVICE_AUTO_PATH_MIG
>>> @@ -58,40 +60,40 @@ enum rxe_device_param {
>>> RXE_MAX_INLINE_DATA = RXE_MAX_WQE_SIZE -
>>> sizeof(struct
>>> rxe_send_wqe),
>>> RXE_MAX_SGE_RD = 32,
>>> - RXE_MAX_CQ = 16384,
>>> + RXE_MAX_CQ = DEFAULT_MAX_VALUE,
>>> RXE_MAX_LOG_CQE = 15,
>>> - RXE_MAX_PD = 0x7ffc,
>>> + RXE_MAX_PD = DEFAULT_MAX_VALUE,
>>> RXE_MAX_QP_RD_ATOM = 128,
>>> RXE_MAX_RES_RD_ATOM = 0x3f000,
>>> RXE_MAX_QP_INIT_RD_ATOM = 128,
>>> RXE_MAX_MCAST_GRP = 8192,
>>> RXE_MAX_MCAST_QP_ATTACH = 56,
>>> RXE_MAX_TOT_MCAST_QP_ATTACH = 0x70000,
>>> - RXE_MAX_AH = 100,
>>> - RXE_MAX_SRQ_WR = 0x4000,
>>> + RXE_MAX_AH = DEFAULT_MAX_VALUE,
>>> + RXE_MAX_SRQ_WR = DEFAULT_MAX_VALUE,
>>> RXE_MIN_SRQ_WR = 1,
>>> RXE_MAX_SRQ_SGE = 27,
>>> RXE_MIN_SRQ_SGE = 1,
>>> RXE_MAX_FMR_PAGE_LIST_LEN = 512,
>>> - RXE_MAX_PKEYS = 1,
>>> + RXE_MAX_PKEYS = 64,
>>> RXE_LOCAL_CA_ACK_DELAY = 15,
>>>
>>> - RXE_MAX_UCONTEXT = 512,
>>> + RXE_MAX_UCONTEXT = DEFAULT_MAX_VALUE,
>>>
>>> RXE_NUM_PORT = 1,
>>>
>>> - RXE_MAX_QP = 0x10000,
>>> + RXE_MAX_QP = DEFAULT_MAX_VALUE,
>>> RXE_MIN_QP_INDEX = 16,
>>> - RXE_MAX_QP_INDEX = 0x00020000,
>>> + RXE_MAX_QP_INDEX = 0x00040000,
>>>
>>> - RXE_MAX_SRQ = 0x00001000,
>>> + RXE_MAX_SRQ = DEFAULT_MAX_VALUE,
>>> RXE_MIN_SRQ_INDEX = 0x00020001,
>>> RXE_MAX_SRQ_INDEX = 0x00040000,
>>>
>>> - RXE_MAX_MR = 0x00001000,
>>> + RXE_MAX_MR = DEFAULT_MAX_VALUE,
>>> RXE_MAX_MW = 0x00001000,
>>> RXE_MIN_MR_INDEX = 0x00000001,
>>> - RXE_MAX_MR_INDEX = 0x00010000,
>>> + RXE_MAX_MR_INDEX = 0x00040000,
>>> RXE_MIN_MW_INDEX = 0x00010001,
>>> RXE_MAX_MW_INDEX = 0x00020000,
>>>
>>> --
>>> 2.27.0
>>>
next prev parent reply other threads:[~2021-07-18 23:00 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-14 8:35 [PATCH v2 0/1] RDMA/rxe: Bump up default maximum values used via uverbs Rao Shoaib
2021-07-14 8:35 ` [PATCH v2 1/1] " Rao Shoaib
2021-07-15 5:02 ` Zhu Yanjun
2021-07-15 16:43 ` Shoaib Rao
2021-07-16 4:07 ` Zhu Yanjun
[not found] ` <8eba2b02-36c5-e14c-3503-0e1cfeea4dd9@oracle.com>
2021-07-18 7:59 ` Zhu Yanjun
2021-07-18 19:46 ` Shoaib Rao
2021-07-18 23:00 ` Shoaib Rao [this message]
2021-07-19 2:25 ` Zhu Yanjun
2021-07-19 4:20 ` Shoaib Rao
2021-07-19 8:12 ` Zhu Yanjun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2b8a485d-d9bb-1579-613c-c32f580f3f2a@oracle.com \
--to=rao.shoaib@oracle.com \
--cc=jgg@ziepe.ca \
--cc=linux-rdma@vger.kernel.org \
--cc=zyjzyj2000@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).