Hi Shahar,


Thank you for the careful reply. I understand your point well now.

In our case, we control the reactors' core placement too.


Thanks,

Shuhei





From: SPDK <spdk-bounces@lists.01.org> on behalf of Shahar Salzman <shahar.salzman@kaminario.com>
Sent: Monday, May 28, 2018 5:49 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Running initiator and target on same host
 

Hi Shuhei,


The only reason I mentioned the reactor is that I noticed he has a 4-CPU box. If 2 CPUs are allocated to reactors, there is a high probability that the initiator process will time out due to reactor polling unless it is run on another CPU.
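For example (just a sketch; <initiator command> is a placeholder for whatever he runs as the initiator), the initiator can be kept off the reactor cores with taskset:

    # with reactors on cores 2-3 (-m 0xc), pin the initiator to core 0 (or 1)
    taskset -c 0 <initiator command>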


Shahar



From: SPDK <spdk-bounces@lists.01.org> on behalf of 松本周平 / MATSUMOTO,SHUUHEI <shuhei.matsumoto.xt@hitachi.com>
Sent: Monday, May 28, 2018 4:45:45 AM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Running initiator and target on same host
 

Hi Sudheendra,


Are you saying that as soon as SPDK nvmf-tgt starts, your ssh session is dropped?

I'm afraid that if you use the same RNIC for the console (ssh) connection and for SPDK nvmf-tgt, nvmf-tgt takes control of the RNIC away from the kernel via VFIO.
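One way to check (just a sketch; the interface name and PCI address are placeholders) is to note the RNIC's PCI address while the kernel still owns it, and then see which driver is bound after nvmf-tgt starts:

    # get the RNIC's PCI address from its kernel interface (placeholder name eth0)
    ethtool -i eth0 | grep bus-info
    # after starting nvmf-tgt, check "Kernel driver in use" for that address (placeholder 0000:03:00.0)
    lspci -k -s 0000:03:00.0

If I remember correctly, ./scripts/setup.sh status in the SPDK tree also reports the current driver bindings.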


We have confirmed that this works when both the NVMe-oF initiator and target are SPDK and run on the same server.

We used a separate 1G NIC for the ssh connection.


For the reason above, we have not tried much with a kernel initiator and an SPDK target.


It is still not clear to me what your initiator and target are.

Shahar, I don't think this is related to the reactor or loopback yet, but please correct me if I'm wrong.

Thanks,
Shuhei



From: SPDK <spdk-bounces@lists.01.org> on behalf of Shahar Salzman <shahar.salzman@kaminario.com>
Sent: Sunday, May 27, 2018 9:48 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Running initiator and target on same host
 

I would like to know which RNIC you are working with.

I think that the older Mellanox cards didn't support loopback.




From: SPDK <spdk-bounces@lists.01.org> on behalf of Sudheendra Sampath <sudheendra.sampath@gmail.com>
Sent: Thursday, May 24, 2018 7:02 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Running initiator and target on same host
 
Thanks Shahar for your prompt reply.

Did you try with a single reactor?

So you mean that instead of starting the nvmf_tgt reactors on core 2 and core 3 (i.e. -m 0xc), I should just use -m 0x8 or -m 0x4, while my client continues to use 0x1.  I can try that out and let you know.
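To make sure I understand, something like this (just a sketch; the nvmf_tgt path is illustrative and <initiator command> stands for my client):

    # single reactor on core 2 only (mask 0x4), same RPC socket and memory size as before
    ./app/nvmf_tgt/nvmf_tgt -r /var/run/spdk.sock -m 0x4 -s 1024
    # keep the client on core 0 (mask 0x1)
    taskset -c 0 <initiator command>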

Can you elaborate a little more about your setup?

What else would you like to know about the setup ?

I am assuming the message "kernel: [1535128.958782] detected loopback device" you are seeing is from the nvmf_tgt output.

Thanks again for your help and support.

On Thu, May 24, 2018 at 1:20 AM, Shahar Salzman <shahar.salzman@kaminario.com> wrote:

Hi Sudheendra,


We are running this setup for our nightly NVMeF regression.

The nightly is running an old version of spdk, but I have also run this with the master branch.


When connecting I usually see this log:

kernel: [1535128.958782] detected loopback device

We are using ConnectX-4 and running a single subsystem with our bdev_user devices attached (bdev_user is currently under discussion on this mailing list and on GerritHub). We use a single reactor plus application pollers, each bound to its own CPU.

Did you try with a single reactor?
Can you elaborate a little more about your setup?

Shahar





From: SPDK <spdk-bounces@lists.01.org> on behalf of Sudheendra Sampath <sudheendra.sampath@gmail.com>
Sent: Wednesday, May 23, 2018 7:44:59 PM
To: spdk@lists.01.org
Subject: [SPDK] Running initiator and target on same host
 
I tried this in my setup and below is my configuration:

I have 4 CPUs with 1 core per socket and 1 NUMA node.

CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          1

Hugepage information:

HugePages_Total:    3824
HugePages_Free:     3312
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

I start nvmf_tgt using the following options:

-r /var/run/spdk.sock -m 0xc -s 1024
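For reference, my understanding of these flags (the nvmf_tgt path below is just illustrative):

    # -r : RPC listen socket
    # -m : reactor core mask (0xc = cores 2 and 3)
    # -s : hugepage memory size in MB
    ./app/nvmf_tgt/nvmf_tgt -r /var/run/spdk.sock -m 0xc -s 1024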

Since I am using -m 0xc, the reactors start on core 2 and core 3.  Here is the output:

[ DPDK EAL parameters: nvmf -c 0xc -m 1024 --file-prefix=spdk_pid14924 ]
EAL: Detected 4 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
app.c: 377:spdk_app_start: *NOTICE*: Total cores available: 2
reactor.c: 654:spdk_reactors_init: *NOTICE*: Occupied cpu socket mask is 0x1
reactor.c: 426:_spdk_reactor_run: *NOTICE*: Reactor started on core 3 on socket 0
reactor.c: 426:_spdk_reactor_run: *NOTICE*: Reactor started on core 2 on socket 0

I run the initiator and force it to run with core mask 0x1 (i.e. core 0).

My ssh session to the host gets disconnected.  Here is the output at the time of the disconnect:

rdma.c:1458:spdk_nvmf_rdma_accept: *INFO*: Acceptor Event: RDMA_CM_EVENT_CONNECT_REQUEST
rdma.c: 654:nvmf_rdma_connect: *INFO*: Connect Recv on fabric intf name rxe0, dev_name uverbs0
rdma.c: 658:nvmf_rdma_connect: *INFO*: Listen Id was 0x22a4b10 with verbs 0x22a3630. ListenAddr: 0x22a48d0
rdma.c: 664:nvmf_rdma_connect: *INFO*: Calculating Queue Depth
rdma.c: 669:nvmf_rdma_connect: *INFO*: Target Max Queue Depth: 128
rdma.c: 674:nvmf_rdma_connect: *INFO*: Local NIC Max Send/Recv Queue Depth: 16384 Max Read/Write Queue Depth: 128
rdma.c: 681:nvmf_rdma_connect: *INFO*: Host (Initiator) NIC Max Incoming RDMA R/W operations: 32 Max Outgoing RDMA R/W operations: 0
rdma.c: 690:nvmf_rdma_connect: *INFO*: Host Receive Queue Size: 32
rdma.c: 691:nvmf_rdma_connect: *INFO*: Host Send Queue Size: 31
rdma.c: 697:nvmf_rdma_connect: *INFO*: Final Negotiated Queue Depth: 32 R/W Depth: 32
rdma.c: 371:spdk_nvmf_rdma_qpair_initialize: *INFO*: New RDMA Connection: 0x26b0720
rdma.c: 405:spdk_nvmf_rdma_qpair_initialize: *INFO*: Command Array: 0x7f8477a07000 Length: 800 LKey: 8bc0
rdma.c: 407:spdk_nvmf_rdma_qpair_initialize: *INFO*: Completion Array: 0x7f8477a06000 Length: 200 LKey: 8c80
rdma.c: 409:spdk_nvmf_rdma_qpair_initialize: *INFO*: In Capsule Data Array: 0x7f84777fe000 Length: 20000 LKey: 8d01
rdma.c: 604:spdk_nvmf_rdma_event_accept: *INFO*: Sent back the accept
rdma.c:1458:spdk_nvmf_rdma_accept: *INFO*: Acceptor Event: RDMA_CM_EVENT_ESTABLISHED
Connection to 172.22.4.152 closed by remote host.
Connection to 172.22.4.152 closed.

Has anyone tried this in your setup? If so, any help is highly appreciated.

--
Regards

Sudheendra Sampath

Note:  I don't see any kernel panic; only the login session to the machine where I am trying this gets disconnected.

_______________________________________________
SPDK mailing list
SPDK@lists.01.org
https://lists.01.org/mailman/listinfo/spdk




--
Regards

Sudheendra Sampath