* CUDA not working with ib_write_bw
@ 2015-02-02 15:59 Steve Wise
From: Steve Wise @ 2015-02-02 15:59 UTC
  To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Stephen Bates

Hey Gil,

I'm trying to test iWARP RDMA <-> GPU memory, and I built the top-of-tree perftest repo with CUDA support.  My NVIDIA setup is working:
I have verified it with another GPU RDMA package (donard from PMC).  But when using ib_write_bw, the server gets an error
registering the GPU memory with the device.  Below is the output from ib_write_bw.  I instrumented the kernel registration path and
found that get_user_pages() returns -14 (-EFAULT) when called by ib_umem_get().

Q:  Is this supposed to work with the upstream RDMA drivers?   I'm using a 3.16.3 kernel.org kernel.

Thanks,

Steve
---

[root@stevo1 perftest]# ./ib_write_bw -R --use_cuda

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : cxgb4_1
 Number of qps   : 1            Transport type : IW
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Max inline data : 0[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 Waiting for client rdma_cm QP to connect
 Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
initializing CUDA
There is 1 device supporting CUDA
[pid = 14124, dev = 0] device name = [Tesla K20Xm]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 131072 bytes GPU buffer
allocated GPU buffer address at 0000001304260000 pointer=0x1304260000
Couldn't allocate MR
 Unable to create the resources needed by comm struct
Unable to perform rdma_client function
[root@stevo1 perftest]#

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: CUDA not working with ib_write_bw
From: Steve Wise @ 2015-02-04 15:16 UTC
  To: 'Zhangfengwei', gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Stephen Bates'

I want to RDMA to/from GPU memory from a remote peer: the data arrives at the RDMA device from the wire and is DMA'd directly
to the memory of the peer GPU adapter in the same system.



> -----Original Message-----
> From: Zhangfengwei [mailto:fngw.zhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org]
> Sent: Wednesday, February 04, 2015 12:42 AM
> To: Steve Wise; gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: RE: CUDA not working with ib_write_bw
> 
> Hi Steve,
> 
> Do you want to transfer the data to the GPU buffer directly? I suspect the DMA cannot do this all by itself.
> 
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Steve Wise
> Sent: February 2, 2015 23:59
> To: gilr-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Stephen Bates
> Subject: CUDA not working with ib_write_bw
> 
> Hey Gil,
> 
> I'm trying to test iWARP RDMA <-> GPU memory and I compiled CUDA into the top-o-tree perftest repo.  My Nvidia setup is working
> because I have verified it with another gpu rdma package (donard from pmc).  But when using ib_write_bw the server gets an error
> registering the gpu memory with the device.  Below is the output from ib_write_bw.  I instrumented the kernel registration path and
> I find that get_user_pages() is returning -14 (-EFAULT) when called by ib_umem_get().
> 
> Q:  Is this supposed to work with the upstream RDMA drivers?   I'm using a 3.16.3 kernel.org kernel.
> 
> Thanks,
> 
> Steve
> ---
> 
> [root@stevo1 perftest]# ./ib_write_bw -R --use_cuda
> 
> ************************************
> * Waiting for client to connect... *
> ************************************
> ---------------------------------------------------------------------------------------
>                     RDMA_Write BW Test
>  Dual-port       : OFF          Device         : cxgb4_1
>  Number of qps   : 1            Transport type : IW
>  Connection type : RC           Using SRQ      : OFF
>  CQ Moderation   : 100
>  Mtu             : 1024[B]
>  Link type       : Ethernet
>  Gid index       : 0
>  Max inline data : 0[B]
>  rdma_cm QPs     : ON
>  Data ex. method : rdma_cm
> ---------------------------------------------------------------------------------------
>  Waiting for client rdma_cm QP to connect
>  Please run the same command with the IB/RoCE interface IP
> ---------------------------------------------------------------------------------------
> initializing CUDA
> There is 1 device supporting CUDA
> [pid = 14124, dev = 0] device name = [Tesla K20Xm]
> creating CUDA Ctx
> making it the current CUDA Ctx
> cuMemAlloc() of a 131072 bytes GPU buffer
> allocated GPU buffer address at 0000001304260000 pointer=0x1304260000
> Couldn't allocate MR
>  Unable to create the resources needed by comm struct
> Unable to perform rdma_client function
> [root@stevo1 perftest]#
> 
