* Fail to configure NVMe-fabric over soft-RoCE
@ 2017-03-06 23:19 Youngjae Lee
  2017-03-07  9:35 ` Max Gurtovoy
  0 siblings, 1 reply; 4+ messages in thread
From: Youngjae Lee @ 2017-03-06 23:19 UTC (permalink / raw)


Hi, all

Has anyone succeeded in configuring NVMe over Fabrics with soft-RoCE (rxe)?
I'm trying it with the latest rc kernel (4.11.0-rc1), but the discover
operation (nvme-cli) on the client side fails (please see the
nvme-cli/dmesg logs below).

I'm following the instructions on this page to configure it:
https://community.mellanox.com/docs/DOC-2504
The NVMe target appears to be set up correctly on the target server side.
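
For reference, the target side can be configured through the nvmet configfs
interface roughly as follows; the backing device /dev/nvme0n1 is just a
placeholder, while the subsystem name, nsid, address and port match the
dmesg log below:

modprobe nvme-rdma                       # host side (for completeness)
modprobe nvmet-rdma                      # target side, pulls in nvmet as well
SUBSYS=/sys/kernel/config/nvmet/subsystems/test
PORT=/sys/kernel/config/nvmet/ports/1
mkdir $SUBSYS
echo 1 > $SUBSYS/attr_allow_any_host     # allow any host NQN to connect
mkdir $SUBSYS/namespaces/10
echo -n /dev/nvme0n1 > $SUBSYS/namespaces/10/device_path   # placeholder device
echo 1 > $SUBSYS/namespaces/10/enable
mkdir $PORT
echo 10.1.1.17 > $PORT/addr_traddr
echo rdma > $PORT/addr_trtype
echo 1023 > $PORT/addr_trsvcid
echo ipv4 > $PORT/addr_adrfam
ln -s $SUBSYS $PORT/subsystems/test      # linking the subsystem enables the port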

Dmesg log on the target server:
[ 5574.892787] nvmet: adding nsid 10 to subsystem test
[ 5574.897461] nvmet_rdma: enabling port 1 (10.1.1.17:1023)
[ 5612.369855] nvmet: creating controller 1 for subsystem 
nqn.2014-08.org.nvmexpress.discovery for NQN 
nqn.2014-08.org.nvmexpress:NVMf:uuid:15b61008-8a88-4d7b-b9be-66600269a9e7.
[ 5673.040744] nvmet_rdma: freeing queue 0

nvme-cli output and dmesg log on the client:
root@rxe2:~/nvme-cli# ./nvme discover -t rdma -a 10.1.1.17 -s 1023
Failed to write to /dev/nvme-fabrics: Input/output error

[  386.091648] rdma_rxe: qp#17 moved to error state
[  446.756855] nvme nvme0: Identify Controller failed (16391)
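
For completeness, the client side was brought up roughly like this, following
the same guide (the interface name eth0 is a placeholder for the NIC that
carries the 10.1.1.x network):

modprobe rdma_rxe          # soft-RoCE driver
rxe_cfg start
rxe_cfg add eth0           # bind an rxe device to the Ethernet interface
rxe_cfg status             # should list an rxe0 device on eth0
modprobe nvme-rdma         # NVMe over Fabrics RDMA host driver
./nvme discover -t rdma -a 10.1.1.17 -s 1023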

I enabled the debug messages of rdma_rxe (see the note after the trace for
how) to see what was happening, and it looks like there were some errors in
the RDMA communication during the nvme discover operation.
....
[ 8908.806021] rdma_rxe: qp#17 state = GET_REQ
[ 8908.806022] rdma_rxe: qp#17 state = CHK_PSN
[ 8908.806023] rdma_rxe: qp#17 state = CHK_OP_SEQ
[ 8908.806025] rdma_rxe: qp#17 state = CHK_OP_VALID
[ 8908.806026] rdma_rxe: qp#17 state = CHK_RESOURCE
[ 8908.806028] rdma_rxe: qp#17 state = CHK_LENGTH
[ 8908.806030] rdma_rxe: qp#17 state = CHK_RKEY
[ 8908.806033] rdma_rxe: qp#17 state = ERR_LENGTH
[ 8908.806035] rdma_rxe: qp#17 state = COMPLETE
[ 8908.806036] rdma_rxe: qp#17 state = CLEANUP
[ 8908.806037] rdma_rxe: qp#17 state = DONE
[ 8908.806039] rdma_rxe: qp#17 state = ERROR
[ 8908.806040] rdma_rxe: qp#17 moved to error state
.....
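
The debug messages above were enabled via dynamic debug; assuming
CONFIG_DYNAMIC_DEBUG and a mounted debugfs, roughly:

echo 'module rdma_rxe +p' > /sys/kernel/debug/dynamic_debug/control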

Any advice on how to resolve this issue?

Thanks.

- Youngjae Lee


* Fail to configure NVMe-fabric over soft-RoCE
  2017-03-06 23:19 Fail to configure NVMe-fabric over soft-RoCE Youngjae Lee
@ 2017-03-07  9:35 ` Max Gurtovoy
  2017-03-07 15:39   ` Youngjae Lee
  0 siblings, 1 reply; 4+ messages in thread
From: Max Gurtovoy @ 2017-03-07  9:35 UTC (permalink / raw)


adding Moni.

Youngjae Lee,

Did you run some basic RDMA tests before NVMe-oF?
This is a precondition.
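
For example, something along these lines over the rxe devices on both nodes
(the device name rxe0 and GID index 0 are assumptions):

# on the target (10.1.1.17)
rping -s -a 10.1.1.17 -v
ibv_rc_pingpong -d rxe0 -g 0

# on the client
rping -c -a 10.1.1.17 -v
ibv_rc_pingpong -d rxe0 -g 0 10.1.1.17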

Moni,
please advise.

Max.

On 3/7/2017 1:19 AM, Youngjae Lee wrote:
> Hi, all
>
> Has anyone succeeded to configure NVMe over Fabrics with soft-RoCE (rxe) ?
> I'm trying it with the latest rc kernel (4.11.0-rc1), but the discover
> operation (of nvme-cli) on the client side fails. (please see the
> attached nvme-cli/dmesg logs below..)
>
> I'm following the instructions from this page to configure it.
> https://community.mellanox.com/docs/DOC-2504
> A NVMe target seems to be perfectly set up on the target server side.
>
> Dmesg log on the target server,
> [ 5574.892787] nvmet: adding nsid 10 to subsystem test
> [ 5574.897461] nvmet_rdma: enabling port 1 (10.1.1.17:1023)
> [ 5612.369855] nvmet: creating controller 1 for subsystem
> nqn.2014-08.org.nvmexpress.discovery for NQN
> nqn.2014-08.org.nvmexpress:NVMf:uuid:15b61008-8a88-4d7b-b9be-66600269a9e7.
> [ 5673.040744] nvmet_rdma: freeing queue 0
>
> nvme-cli output and dmesg log on the client,
> root at rxe2:~/nvme-cli# ./nvme discover -t rdma -a 10.1.1.17 -s 1023
> Failed to write to /dev/nvme-fabrics: Input/output error
>
> [  386.091648] rdma_rxe: qp#17 moved to error state
> [  446.756855] nvme nvme0: Identify Controller failed (16391)
>
> I enabled debug msgs of rdma_rxe to see what happened in rdma_rxe and it
> looks like there were some errors in rdma communications during the nvme
> discover operation.
> ....
> [ 8908.806021] rdma_rxe: qp#17 state = GET_REQ
> [ 8908.806022] rdma_rxe: qp#17 state = CHK_PSN
> [ 8908.806023] rdma_rxe: qp#17 state = CHK_OP_SEQ
> [ 8908.806025] rdma_rxe: qp#17 state = CHK_OP_VALID
> [ 8908.806026] rdma_rxe: qp#17 state = CHK_RESOURCE
> [ 8908.806028] rdma_rxe: qp#17 state = CHK_LENGTH
> [ 8908.806030] rdma_rxe: qp#17 state = CHK_RKEY
> [ 8908.806033] rdma_rxe: qp#17 state = ERR_LENGTH
> [ 8908.806035] rdma_rxe: qp#17 state = COMPLETE
> [ 8908.806036] rdma_rxe: qp#17 state = CLEANUP
> [ 8908.806037] rdma_rxe: qp#17 state = DONE
> [ 8908.806039] rdma_rxe: qp#17 state = ERROR
> [ 8908.806040] rdma_rxe: qp#17 moved to error state
> .....
>
> Any advice to resolve this issue ???
>
> Thanks.
>
> - Youngjae Lee
>
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme


* Fail to configure NVMe-fabric over soft-RoCE
  2017-03-07  9:35 ` Max Gurtovoy
@ 2017-03-07 15:39   ` Youngjae Lee
  2017-03-07 15:57     ` Moni Shoua
  0 siblings, 1 reply; 4+ messages in thread
From: Youngjae Lee @ 2017-03-07 15:39 UTC (permalink / raw)


Hi, Max,


Sure, I ran some basic RDMA tests like udaddy and ibv_rc_pingpong, and they passed.

Thanks.


On 3/7/17 3:35 AM, Max Gurtovoy wrote:
> adding Moni.
>
> Youngjae Lee,
>
> did you run some basic rdma tests before NVMEoF ?
> This is a precondition.
>
> Moni,
> please advise.
>
> Max.
>
> On 3/7/2017 1:19 AM, Youngjae Lee wrote:
>> Hi, all
>>
>> Has anyone succeeded to configure NVMe over Fabrics with soft-RoCE 
>> (rxe) ?
>> I'm trying it with the latest rc kernel (4.11.0-rc1), but the discover
>> operation (of nvme-cli) on the client side fails. (please see the
>> attached nvme-cli/dmesg logs below..)
>>
>> I'm following the instructions from this page to configure it.
>> https://community.mellanox.com/docs/DOC-2504
>> A NVMe target seems to be perfectly set up on the target server side.
>>
>> Dmesg log on the target server,
>> [ 5574.892787] nvmet: adding nsid 10 to subsystem test
>> [ 5574.897461] nvmet_rdma: enabling port 1 (10.1.1.17:1023)
>> [ 5612.369855] nvmet: creating controller 1 for subsystem
>> nqn.2014-08.org.nvmexpress.discovery for NQN
>> nqn.2014-08.org.nvmexpress:NVMf:uuid:15b61008-8a88-4d7b-b9be-66600269a9e7. 
>>
>> [ 5673.040744] nvmet_rdma: freeing queue 0
>>
>> nvme-cli output and dmesg log on the client,
>> root at rxe2:~/nvme-cli# ./nvme discover -t rdma -a 10.1.1.17 -s 1023
>> Failed to write to /dev/nvme-fabrics: Input/output error
>>
>> [  386.091648] rdma_rxe: qp#17 moved to error state
>> [  446.756855] nvme nvme0: Identify Controller failed (16391)
>>
>> I enabled debug msgs of rdma_rxe to see what happened in rdma_rxe and it
>> looks like there were some errors in rdma communications during the nvme
>> discover operation.
>> ....
>> [ 8908.806021] rdma_rxe: qp#17 state = GET_REQ
>> [ 8908.806022] rdma_rxe: qp#17 state = CHK_PSN
>> [ 8908.806023] rdma_rxe: qp#17 state = CHK_OP_SEQ
>> [ 8908.806025] rdma_rxe: qp#17 state = CHK_OP_VALID
>> [ 8908.806026] rdma_rxe: qp#17 state = CHK_RESOURCE
>> [ 8908.806028] rdma_rxe: qp#17 state = CHK_LENGTH
>> [ 8908.806030] rdma_rxe: qp#17 state = CHK_RKEY
>> [ 8908.806033] rdma_rxe: qp#17 state = ERR_LENGTH
>> [ 8908.806035] rdma_rxe: qp#17 state = COMPLETE
>> [ 8908.806036] rdma_rxe: qp#17 state = CLEANUP
>> [ 8908.806037] rdma_rxe: qp#17 state = DONE
>> [ 8908.806039] rdma_rxe: qp#17 state = ERROR
>> [ 8908.806040] rdma_rxe: qp#17 moved to error state
>> .....
>>
>> Any advice to resolve this issue ???
>>
>> Thanks.
>>
>> - Youngjae Lee
>>
>>
>> _______________________________________________
>> Linux-nvme mailing list
>> Linux-nvme at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-nvme
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
>

-- 
Best Regards.

- Youngjae Lee


* Fail to configure NVMe-fabric over soft-RoCE
  2017-03-07 15:39   ` Youngjae Lee
@ 2017-03-07 15:57     ` Moni Shoua
  0 siblings, 0 replies; 4+ messages in thread
From: Moni Shoua @ 2017-03-07 15:57 UTC (permalink / raw)


Max, please take it and let me know if you see any connectivity issues.

-----Original Message-----
From: Youngjae Lee [mailto:leeyo@linux.vnet.ibm.com] 
Sent: Tuesday, March 07, 2017 5:39 PM
To: Max Gurtovoy <maxg@mellanox.com>; linux-nvme@lists.infradead.org; Moni Shoua <monis@mellanox.com>
Subject: Re: Fail to configure NVMe-fabric over soft-RoCE

Hi, Max,


Sure, I passed some basic rdma tests like udaddy and ibv_rc_pingtest....

Thanks.


On 3/7/17 3:35 AM, Max Gurtovoy wrote:
> adding Moni.
>
> Youngjae Lee,
>
> did you run some basic rdma tests before NVMEoF ?
> This is a precondition.
>
> Moni,
> please advise.
>
> Max.
>
> On 3/7/2017 1:19 AM, Youngjae Lee wrote:
>> Hi, all
>>
>> Has anyone succeeded to configure NVMe over Fabrics with soft-RoCE
>> (rxe) ?
>> I'm trying it with the latest rc kernel (4.11.0-rc1), but the 
>> discover operation (of nvme-cli) on the client side fails. (please 
>> see the attached nvme-cli/dmesg logs below..)
>>
>> I'm following the instructions from this page to configure it.
>> https://community.mellanox.com/docs/DOC-2504
>> A NVMe target seems to be perfectly set up on the target server side.
>>
>> Dmesg log on the target server,
>> [ 5574.892787] nvmet: adding nsid 10 to subsystem test
>> [ 5574.897461] nvmet_rdma: enabling port 1 (10.1.1.17:1023)
>> [ 5612.369855] nvmet: creating controller 1 for subsystem
>> nqn.2014-08.org.nvmexpress.discovery for NQN
>> nqn.2014-08.org.nvmexpress:NVMf:uuid:15b61008-8a88-4d7b-b9be-66600269a9e7.
>>
>> [ 5673.040744] nvmet_rdma: freeing queue 0
>>
>> nvme-cli output and dmesg log on the client,
>> root at rxe2:~/nvme-cli# ./nvme discover -t rdma -a 10.1.1.17 -s 1023
>> Failed to write to /dev/nvme-fabrics: Input/output error
>>
>> [  386.091648] rdma_rxe: qp#17 moved to error state
>> [  446.756855] nvme nvme0: Identify Controller failed (16391)
>>
>> I enabled debug msgs of rdma_rxe to see what happened in rdma_rxe and it
>> looks like there were some errors in rdma communications during the nvme
>> discover operation.
>> ....
>> [ 8908.806021] rdma_rxe: qp#17 state = GET_REQ
>> [ 8908.806022] rdma_rxe: qp#17 state = CHK_PSN
>> [ 8908.806023] rdma_rxe: qp#17 state = CHK_OP_SEQ
>> [ 8908.806025] rdma_rxe: qp#17 state = CHK_OP_VALID
>> [ 8908.806026] rdma_rxe: qp#17 state = CHK_RESOURCE
>> [ 8908.806028] rdma_rxe: qp#17 state = CHK_LENGTH
>> [ 8908.806030] rdma_rxe: qp#17 state = CHK_RKEY
>> [ 8908.806033] rdma_rxe: qp#17 state = ERR_LENGTH
>> [ 8908.806035] rdma_rxe: qp#17 state = COMPLETE
>> [ 8908.806036] rdma_rxe: qp#17 state = CLEANUP
>> [ 8908.806037] rdma_rxe: qp#17 state = DONE
>> [ 8908.806039] rdma_rxe: qp#17 state = ERROR
>> [ 8908.806040] rdma_rxe: qp#17 moved to error state
>> .....
>>
>> Any advice to resolve this issue ???
>>
>> Thanks.
>>
>> - Youngjae Lee
>>
>>
>> _______________________________________________
>> Linux-nvme mailing list
>> Linux-nvme at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-nvme
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
>

--
Best Regards.

- Youngjae Lee

