* CUDA not working with ib_write_bw
@ 2015-02-02 15:59 Steve Wise
From: Steve Wise @ 2015-02-02 15:59 UTC
To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Stephen Bates
Hey Gil,
I'm trying to test iWARP RDMA to and from GPU memory, so I built the top-of-tree perftest repo with CUDA support. My NVIDIA setup
works: I have verified it with another GPU RDMA package (donard from PMC). But with ib_write_bw, the server gets an error
registering the GPU memory with the device. The output from ib_write_bw is below. I instrumented the kernel registration path and
found that get_user_pages() returns -14 (-EFAULT) when called by ib_umem_get().
Q: Is this supposed to work with the upstream RDMA drivers? I'm using a 3.16.3 kernel.org kernel.
Thanks,
Steve
---
[root@stevo1 perftest]# ./ib_write_bw -R --use_cuda
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : cxgb4_1
 Number of qps   : 1            Transport type : IW
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Max inline data : 0[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
Waiting for client rdma_cm QP to connect
Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
initializing CUDA
There is 1 device supporting CUDA
[pid = 14124, dev = 0] device name = [Tesla K20Xm]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 131072 bytes GPU buffer
allocated GPU buffer address at 0000001304260000 pointer=0x1304260000
Couldn't allocate MR
Unable to create the resources needed by comm struct
Unable to perform rdma_client function
[root@stevo1 perftest]#
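
For readers decoding the return value mentioned above, here is a minimal sketch of how a negative kernel-style return code such as -14 maps to an errno name. It assumes a Linux host, and `decode_kernel_rc` is a hypothetical helper written for illustration; it is not part of perftest or the kernel.

```python
import errno
import os


def decode_kernel_rc(rc):
    """Map a negative kernel-style return code to (errno name, message).

    Hypothetical helper: kernel functions such as get_user_pages()
    report failure as a negated errno value, so -14 means errno 14.
    """
    if rc >= 0:
        return ("OK", "success")
    code = -rc
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return (name, os.strerror(code))


name, message = decode_kernel_rc(-14)
print(f"rc=-14 -> {name}: {message}")
```

On Linux this reports EFAULT, which matches the value seen in the instrumented ib_umem_get() path.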
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: CUDA not working with ib_write_bw
@ 2015-02-04 15:16 Steve Wise
From: Steve Wise @ 2015-02-04 15:16 UTC
To: 'Zhangfengwei', gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Stephen Bates'
I want to RDMA to and from GPU memory from a remote peer: the data arrives at the RDMA device from the wire and is DMA'd directly
into the memory of the GPU, which is a peer adapter in the same system.
> -----Original Message-----
> From: Zhangfengwei [mailto:fngw.zhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org]
> Sent: Wednesday, February 04, 2015 12:42 AM
> To: Steve Wise; gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: Re: CUDA not working with ib_write_bw
>
> Hi Steve,
>
> Do you want to transfer the data to the GPU buffer directly? I guess DMA cannot do this all by itself.
>
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Steve Wise
> Sent: February 2, 2015 23:59
> To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: CUDA not working with ib_write_bw
>
> Hey Gil,
>
> I'm trying to test iWARP RDMA <-> GPU memory and I compiled CUDA into the top-o-tree perftest repo. My Nvidia setup is working
> because I have verified it with another gpu rdma package (donard from pmc). But when using ib_write_bw the server gets an error
> registering the gpu memory with the device. Below is the output from ib_write_bw. I instrumented the kernel registration path
> and I find that get_user_pages() is returning -14 (-EFAULT) when called by ib_umem_get().
>
> Q: Is this supposed to work with the upstream RDMA drivers? I'm using a 3.16.3 kernel.org kernel.
>
> Thanks,
>
> Steve
> ---
>
> [root@stevo1 perftest]# ./ib_write_bw -R --use_cuda
>
> ************************************
> * Waiting for client to connect... *
> ************************************
> ---------------------------------------------------------------------------------------
>                     RDMA_Write BW Test
>  Dual-port       : OFF          Device         : cxgb4_1
>  Number of qps   : 1            Transport type : IW
>  Connection type : RC           Using SRQ      : OFF
>  CQ Moderation   : 100
>  Mtu             : 1024[B]
>  Link type       : Ethernet
>  Gid index       : 0
>  Max inline data : 0[B]
>  rdma_cm QPs     : ON
>  Data ex. method : rdma_cm
> ---------------------------------------------------------------------------------------
> Waiting for client rdma_cm QP to connect
> Please run the same command with the IB/RoCE interface IP
> ---------------------------------------------------------------------------------------
> initializing CUDA
> There is 1 device supporting CUDA
> [pid = 14124, dev = 0] device name = [Tesla K20Xm]
> creating CUDA Ctx
> making it the current CUDA Ctx
> cuMemAlloc() of a 131072 bytes GPU buffer
> allocated GPU buffer address at 0000001304260000 pointer=0x1304260000
> Couldn't allocate MR
> Unable to create the resources needed by comm struct
> Unable to perform rdma_client function
> [root@stevo1 perftest]#