* CUDA not working with ib_write_bw
@ 2015-02-02 15:59 Steve Wise
From: Steve Wise @ 2015-02-02 15:59 UTC
To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Stephen Bates

Hey Gil,

I'm trying to test iWARP RDMA <-> GPU memory and I compiled CUDA into the
top-o-tree perftest repo.  My Nvidia setup is working because I have
verified it with another gpu rdma package (donard from pmc).  But when
using ib_write_bw the server gets an error registering the gpu memory with
the device.  Below is the output from ib_write_bw.  I instrumented the
kernel registration path and I find that get_user_pages() is returning
-14 (-EFAULT) when called by ib_umem_get().

Q: Is this supposed to work with the upstream RDMA drivers?  I'm using a
3.16.3 kernel.org kernel.

Thanks,

Steve
---

[root@stevo1 perftest]# ./ib_write_bw -R --use_cuda

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : cxgb4_1
 Number of qps   : 1            Transport type : IW
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Max inline data : 0[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 Waiting for client rdma_cm QP to connect
 Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
initializing CUDA
There is 1 device supporting CUDA
[pid = 14124, dev = 0] device name = [Tesla K20Xm]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 131072 bytes GPU buffer
allocated GPU buffer address at 0000001304260000 pointer=0x1304260000
Couldn't allocate MR
Unable to create the resources needed by comm struct
Unable to perform rdma_client function
[root@stevo1 perftest]#

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
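For readers following the -14 value reported above: kernel functions such as get_user_pages() conventionally signal failure by returning a negated errno value, and 14 is EFAULT ("Bad address") on Linux. The sketch below only illustrates that decoding step. The helpers kret_name() and format_kret() are hypothetical names made up for this example; they are not part of perftest, libibverbs, or the kernel. One plausible reading, which this thread does not confirm, is that the address returned by cuMemAlloc() is not backed by ordinary host pages that get_user_pages() can pin.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a kernel-style return value (0 or a negated
 * errno) to a symbolic name. Only a few common codes are covered. */
static const char *kret_name(long ret)
{
    if (ret >= 0)
        return "success";
    switch (-ret) {
    case EFAULT: return "EFAULT";  /* bad address */
    case ENOMEM: return "ENOMEM";  /* out of memory */
    case EINVAL: return "EINVAL";  /* invalid argument */
    case EPERM:  return "EPERM";   /* operation not permitted */
    default:     return "unknown";
    }
}

/* Hypothetical helper: format "<fn> returned <ret> (-<NAME>)" into buf,
 * mirroring the way the failure is described in the message above.
 * Returns the value returned by snprintf(). */
static int format_kret(char *buf, size_t len, const char *fn, long ret)
{
    if (ret >= 0)
        return snprintf(buf, len, "%s returned %ld (success)", fn, ret);
    return snprintf(buf, len, "%s returned %ld (-%s)", fn, ret,
                    kret_name(ret));
}
```

Calling format_kret(buf, sizeof(buf), "get_user_pages()", -14) fills buf with the same wording Steve uses, and strerror(EFAULT) from <string.h> gives the human-readable description of the code.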
* RE: CUDA not working with ib_write_bw
@ 2015-02-04 15:16 Steve Wise
From: Steve Wise @ 2015-02-04 15:16 UTC
To: 'Zhangfengwei', gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Stephen Bates'

I want to RDMA to/from gpu memory from a remote peer.  So the data arrives
at the RDMA device from the wire, and is DMA'd directly to GPU memory of
the peer adapter in the system.

> -----Original Message-----
> From: Zhangfengwei [mailto:fngw.zhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org]
> Sent: Wednesday, February 04, 2015 12:42 AM
> To: Steve Wise; gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: RE: CUDA not working with ib_write_bw
>
> Hi Steve,
>
> Do you want to transfer the data to the gpu buffer directly ? I guess the DMA seems not to do this all by itself.
>
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Steve Wise
> Sent: February 2, 2015 23:59
> To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: CUDA not working with ib_write_bw
>
> Hey Gil,
>
> I'm trying to test iWARP RDMA <-> GPU memory and I compiled CUDA into the top-o-tree perftest repo. My Nvidia setup is working
> because I have verified it with another gpu rdma package (donard from pmc). But when using ib_write_bw the server gets an error
> registering the gpu memory with the device. Below is the output from ib_write_bw. I instrumented the kernel registration path and I find
> that get_user_pages() is returning -14 (-EFAULT) when called by ib_umem_get().
>
> Q: Is this supposed to work with the upstream RDMA drivers? I'm using a 3.16.3 kernel.org kernel.
>
> Thanks,
>
> Steve
> ---
>
> [root@stevo1 perftest]# ./ib_write_bw -R --use_cuda
>
> ************************************
> * Waiting for client to connect... *
> ************************************
> ---------------------------------------------------------------------------------------
> RDMA_Write BW Test
> Dual-port : OFF Device : cxgb4_1
> Number of qps : 1 Transport type : IW
> Connection type : RC Using SRQ : OFF
> CQ Moderation : 100
> Mtu : 1024[B]
> Link type : Ethernet
> Gid index : 0
> Max inline data : 0[B]
> rdma_cm QPs : ON
> Data ex. method : rdma_cm
> ---------------------------------------------------------------------------------------
> Waiting for client rdma_cm QP to connect Please run the same command with the IB/RoCE interface IP
> ---------------------------------------------------------------------------------------
> initializing CUDA
> There is 1 device supporting CUDA
> [pid = 14124, dev = 0] device name = [Tesla K20Xm] creating CUDA Ctx making it the current CUDA Ctx
> cuMemAlloc() of a 131072 bytes GPU buffer allocated GPU buffer address at 0000001304260000 pointer=0x1304260000 Couldn't allocate
> MR Unable to create the resources needed by comm struct Unable to perform rdma_client function
> [root@stevo1 perftest]#