Linux-RDMA Archive on lore.kernel.org
* Re: 【BugReport】ibv_srq_pingpong test bug
From: Yishai Hadas @ 2019-09-11 10:33 UTC
  To: oulijun; +Cc: dledford, roland, linux-rdma, Yishai Hadas

On 9/11/2019 10:26 AM, oulijun wrote:
> Hi, Roland Dreier and others
> 
>            I am running ibv_srq_pingpong on hip08. The test result is
> as follows:
> 
>            local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::
> 
>           local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::
> 
>            local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::
> 
>            local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::
> 
>            local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::
> 
>            local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::
> 
>            local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::
> 
>            local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::
> 
> Failed to modify QP[0] to RTR
>

Judging by the trace below, it looks like you are using RoCE, correct? If
so, you need to supply a GID index on the command line (e.g. -g 0).
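
The GRH test in the rdma_check_ah_attr() code quoted below can be modelled in
isolation. This is only a minimal user-space sketch: the sketch_* names and
the constant values are stand-ins chosen for illustration, not the kernel's
definitions, and only the one condition at quoted lines 418-421 is reproduced.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in constants for illustration; the real ones are kernel definitions. */
enum { SKETCH_AH_GRH = 1 };  /* assumed bit meaning "a GRH is present" */
enum sketch_ah_type { SKETCH_TYPE_IB = 1, SKETCH_TYPE_ROCE = 2 };

struct sketch_ah_attr {
    enum sketch_ah_type type;
    unsigned int ah_flags;
};

/*
 * Mirrors only the GRH condition from the quoted function: if the port
 * requires a GRH, or the address handle is of RoCE type, the GRH flag
 * must be set; otherwise the attribute is rejected with -EINVAL.
 */
static int sketch_check_grh(const struct sketch_ah_attr *attr,
                            bool grh_required)
{
    if ((grh_required || attr->type == SKETCH_TYPE_ROCE) &&
        !(attr->ah_flags & SKETCH_AH_GRH))
        return -EINVAL;
    return 0;
}
```

Under these assumptions a RoCE attribute without the GRH bit is rejected,
while the same attribute with the bit set passes, which is consistent with
the failure disappearing once a GID index is supplied.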

> Couldn't connect to remote QP
> 
>            I traced the failure as follows:
> 
>            When ibv_modify_qp is called, the trace is as follows:
> 
> static int rdma_check_ah_attr(struct ib_device *device,
> 409                               struct rdma_ah_attr *ah_attr)
> 410 {
> 411         if (!rdma_is_port_valid(device, ah_attr->port_num))
> 412                 return -EINVAL;
> 413         printk("[%s, %d] point!\n", __func__, __LINE__);
> 414         printk("[%s, %d] rdma_is_grh_required(device, ah_attr->port_num) = %d\n",
> 415                 __func__, __LINE__, rdma_is_grh_required(device, ah_attr->port_num));
> 416         printk("[%s, %d] ah_attr->type = %d!\n", __func__, __LINE__, ah_attr->type);
> 417         printk("[%s, %d] ah_attr->ah_flags = %d!\n", __func__, __LINE__, ah_attr->ah_flags);
> 418         if ((rdma_is_grh_required(device, ah_attr->port_num) ||
> 419              ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) &&
> 420             !(ah_attr->ah_flags & IB_AH_GRH))
> 421                 return -EINVAL;
> 422         printk("[%s, %d] point!\n", __func__, __LINE__);
> 423         if (ah_attr->grh.sgid_attr) {
> 424                 /*
> 425                  * Make sure the passed sgid_attr is consistent with the
> 426                  * parameters
> 427                  */
> 428                 if (ah_attr->grh.sgid_attr->index != ah_attr->grh.sgid_index ||
> 429                     ah_attr->grh.sgid_attr->port_num != ah_attr->port_num)
> 430                         return -EINVAL;
> 431         }
> 432         printk("[%s, %d] point!\n", __func__, __LINE__);
> 433         return 0;
> 
> The trace shows that the function fails at line 420. I don't understand
> this check, because it should pass when running in RoCE mode.
> 
> ah_attr->type is RDMA_AH_ATTR_TYPE_ROCE, so ah_attr->ah_flags should
> include IB_AH_GRH.
> 
> However, the value of ah_attr->ah_flags is 2. I think the protocol layer
> should guarantee the value of ah_attr->ah_flags.
> 
> So I suspect that the protocol layer or ibv_srq_pingpong has an
> implementation defect.
> 
> I also ran ibv_srq_pingpong on cx5, and the result is the same:
> 
> root@ubuntu-51-7:~# ibv_srq_pingpong -d mlx5_0 -p 10002
> 
>    local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::
> 
>    local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::
> 
>    local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::
> 
>    local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::
> 
>    local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::
> 
>    local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::
> 
>    local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::
> 
> Failed to modify QP[0] to RTR
> 
> Couldn't connect to remote QP
> 
> Thanks
> 
> Lijun Ou
> 



* Re: 【BugReport】ibv_srq_pingpong test bug
From: oulijun @ 2019-09-11 11:13 UTC
  To: Yishai Hadas; +Cc: dledford, roland, linux-rdma, Yishai Hadas

On 2019/9/11 18:33, Yishai Hadas wrote:
> On 9/11/2019 10:26 AM, oulijun wrote:
>> Hi, Roland Dreier and others
>>
>>            I am running ibv_srq_pingpong on hip08. The test result is as follows:
>>
>>            local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::
>>
>>           local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::
>>
>>            local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::
>>
>> Failed to modify QP[0] to RTR
>>
>
> Judging by the trace below, it looks like you are using RoCE, correct? If so, you need to supply a GID index on the command line (e.g. -g 0).
Yes. Is this a limitation of the tool? I will try it. Thanks.
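
The link between "GID ::" in the output and the missing GRH can be modelled
as follows. This is a hedged sketch and not the ibv_srq_pingpong source: all
sketch_* names are hypothetical, and it assumes the tool requests a global
route only when a GID index was given and the exchanged GID is non-zero. The
real example code should be checked before relying on this.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the verbs GID and address-handle attributes. */
struct sketch_gid {
    uint8_t raw[16];
};

struct sketch_ah {
    bool is_global;          /* true means a GRH is requested */
    int sgid_index;
    struct sketch_gid dgid;
};

static bool sketch_gid_is_zero(const struct sketch_gid *gid)
{
    static const struct sketch_gid zero; /* static storage, all bytes zero */
    return memcmp(gid, &zero, sizeof(zero)) == 0;
}

/*
 * gid_index < 0 models running without -g. In that case, or when the
 * remote GID is all zeros (printed as "::"), no global route is set,
 * so a RoCE port would later fail a "GRH required" check.
 */
static void sketch_fill_ah(struct sketch_ah *ah,
                           const struct sketch_gid *remote, int gid_index)
{
    memset(ah, 0, sizeof(*ah));
    if (gid_index >= 0 && !sketch_gid_is_zero(remote)) {
        ah->is_global = true;
        ah->sgid_index = gid_index;
        ah->dgid = *remote;
    }
}
```

In this model the behaviour is less a limitation of the tool than a default:
without a GID index nothing asks for a GRH, and RoCE cannot work without one.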
>
>> Couldn't connect to remote QP
>>
>>            I traced the failure as follows:
>>
>>            When ibv_modify_qp is called, the trace is as follows:
>>
>> static int rdma_check_ah_attr(struct ib_device *device,
>> 409                               struct rdma_ah_attr *ah_attr)
>> 410 {
>> 411         if (!rdma_is_port_valid(device, ah_attr->port_num))
>> 412                 return -EINVAL;
>> 413         printk("[%s, %d] point!\n", __func__, __LINE__);
>> 414         printk("[%s, %d] rdma_is_grh_required(device, ah_attr->port_num) = %d\n",
>> 415                 __func__, __LINE__, rdma_is_grh_required(device, ah_attr->port_num));
>> 416         printk("[%s, %d] ah_attr->type = %d!\n", __func__, __LINE__, ah_attr->type);
>> 417         printk("[%s, %d] ah_attr->ah_flags = %d!\n", __func__, __LINE__, ah_attr->ah_flags);
>> 418         if ((rdma_is_grh_required(device, ah_attr->port_num) ||
>> 419              ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) &&
>> 420             !(ah_attr->ah_flags & IB_AH_GRH))
>> 421                 return -EINVAL;
>> 422         printk("[%s, %d] point!\n", __func__, __LINE__);
>> 423         if (ah_attr->grh.sgid_attr) {
>> 424                 /*
>> 425                  * Make sure the passed sgid_attr is consistent with the
>> 426                  * parameters
>> 427                  */
>> 428                 if (ah_attr->grh.sgid_attr->index != ah_attr->grh.sgid_index ||
>> 429                     ah_attr->grh.sgid_attr->port_num != ah_attr->port_num)
>> 430                         return -EINVAL;
>> 431         }
>> 432         printk("[%s, %d] point!\n", __func__, __LINE__);
>> 433         return 0;
>>
>> The trace shows that the function fails at line 420. I don't understand this check, because it should pass when running in RoCE mode.
>>
>> ah_attr->type is RDMA_AH_ATTR_TYPE_ROCE, so ah_attr->ah_flags should include IB_AH_GRH.
>>
>> However, the value of ah_attr->ah_flags is 2. I think the protocol layer should guarantee the value of ah_attr->ah_flags.
>>
>> So I suspect that the protocol layer or ibv_srq_pingpong has an implementation defect.
>>
>> I also ran ibv_srq_pingpong on cx5, and the result is the same:
>>
>> root@ubuntu-51-7:~# ibv_srq_pingpong -d mlx5_0 -p 10002
>>
>>    local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::
>>
>>    local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::
>>
>> Failed to modify QP[0] to RTR
>>
>> Couldn't connect to remote QP
>>
>> Thanks
>>
>> Lijun Ou
>>
>
>
> .
>


