* Sharing MR Between Multiple Connections
From: Christopher Mitchell @ 2012-11-14  4:36 UTC
  To: linux-rdma@vger.kernel.org

Hi,

I am working on building an InfiniBand application with a server that
can handle many simultaneous clients. The server exposes a chunk of
memory that each of the clients can read via RDMA. I was previously
creating a new MR on the server for each client (and of course in that
connection's PD). However, under stress testing, I realized that
ibv_reg_mr() started failing after I had simultaneously registered the
same area enough times to cover 20.0 GB. I presume that the problem
is reaching some pinning limit, although ulimit reports "unlimited"
for all relevant limits. I tried creating a single global PD and a
single MR to be shared among the multiple connections, but
rdma_create_qp() fails with an invalid argument when I try to do that.
I therefore deduce that the PD specified in rdma_create_qp() must
correspond to an active connection, not simply be created by opening a
device.
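
For reference, here is roughly the per-connection setup I had been
using (a sketch with illustrative names, not my exact code):

    /* Per-client setup: every connection gets its own PD, and the
     * same server buffer is registered once per client. This is what
     * runs into the pinning/registration limit under load. */
    struct ibv_pd *pd = ibv_alloc_pd(conn_id->verbs);
    struct ibv_mr *mr = ibv_reg_mr(pd, shared_buf, shared_len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr)
        perror("ibv_reg_mr"); /* starts failing past ~20 GB pinned */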

Long question short: is there any way I can share the same MR among
multiple clients, so that my shared memory region is limited to N
bytes instead of N/C (for C clients) bytes?

Thanks in advance,
Christopher

end

* Fwd: Sharing MR Between Multiple Connections
From: Christopher Mitchell @ 2012-11-14  5:47 UTC
  To: linux-rdma@vger.kernel.org

[snip -- the body is identical to the original message above]

* Re: Sharing MR Between Multiple Connections
From: Christopher Mitchell @ 2012-11-14  6:09 UTC
  To: linux-rdma@vger.kernel.org

Please forgive my accidental faux pas of a double post. I had
incorrectly assumed that using a non-subscribed email address would
keep my message from getting through.

On Wed, Nov 14, 2012 at 12:47 AM, Christopher Mitchell wrote:
> Hi,
>
> [snip]
>
> Thanks in advance,
> Christopher
>
> end

* Re: Sharing MR Between Multiple Connections
From: Yann Droneaud @ 2012-11-14  8:27 UTC
  To: Christopher Mitchell; +Cc: linux-rdma@vger.kernel.org, Yann Droneaud

On Tuesday, November 13, 2012 at 23:36 -0500, Christopher Mitchell wrote:
> Hi,
> 
> I am working on building an InfiniBand application with a server that
> can handle many simultaneous clients. The server exposes a chunk of
> memory that each of the clients can read via RDMA. I was previously
> creating a new MR on the server for each client (and of course in that
> connection's PD). However, under stress testing, I realized that
> ibv_reg_mr() started failing after I had simultaneously registered the
> same area enough times to cover 20.0 GB. I presume that the problem
> is reaching some pinning limit, although ulimit reports "unlimited"
> for all relevant limits.

There's a limit on the number of memory regions that can be registered
per HCA.

See ibv_query_device(): struct ibv_device_attr, field max_mr.
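
For example, a quick check could look like this (a sketch; it assumes
a bound rdma_cm_id named id, so id->verbs is the device context):

    struct ibv_device_attr dev_attr;

    if (ibv_query_device(id->verbs, &dev_attr))
        perror("ibv_query_device");
    else
        printf("max_mr = %d, max_mr_size = %llu\n", dev_attr.max_mr,
               (unsigned long long) dev_attr.max_mr_size);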

>  I tried creating a single global PD and a
> single MR to be shared among the multiple connections, but
> rdma_create_qp() fails with an invalid argument when I try to do that.
> I therefore deduce that the PD specified in rdma_create_qp() must
> correspond to an active connection, not simply be created by opening a
> device.
> 
> Long question short: is there any way I can share the same MR among
> multiple clients, so that my shared memory region is limited to N
> bytes instead of N/C (for C clients) bytes?
> 

If each rdma_cm_id descriptor is tied to the same ibv_context, you might
be able to share one memory pool registered with a single ibv_reg_mr().
This is quite common, since you are likely to have only one HCA in your
system. Get the first context after binding your listening rdma_cm_id.
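
A rough sketch of that approach (untested; addr, buf, and len are
placeholders):

    struct rdma_event_channel *ec;
    struct rdma_cm_id *listen_id;
    struct ibv_pd *pd;
    struct ibv_mr *mr;

    ec = rdma_create_event_channel();
    rdma_create_id(ec, &listen_id, NULL, RDMA_PS_TCP);

    /* Bind to a specific local address so the id is associated with
     * a device; listen_id->verbs is then the shared ibv_context. */
    rdma_bind_addr(listen_id, (struct sockaddr *) &addr);

    pd = ibv_alloc_pd(listen_id->verbs);
    mr = ibv_reg_mr(pd, buf, len,
                    IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ);

    /* Pass this one pd to rdma_create_qp() for every accepted
     * connection, and hand every client the same mr->rkey. */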

Regards

-- 
Yann Droneaud
OPTEYA



* Re: Sharing MR Between Multiple Connections
From: Atchley, Scott @ 2012-11-14 13:40 UTC
  To: Christopher Mitchell; +Cc: linux-rdma@vger.kernel.org

On Nov 13, 2012, at 11:36 PM, Christopher Mitchell wrote:

> Hi,
> 
> I am working on building an InfiniBand application with a server that
> can handle many simultaneous clients. The server exposes a chunk of
> memory that each of the clients can read via RDMA. I was previously
> creating a new MR on the server for each client (and of course in that
> connection's PD). However, under stress testing, I realized that
> ibv_reg_mr() started failing after I had simultaneously registered the
> same area enough times to cover 20.0 GB. I presume that the problem
> is reaching some pinning limit, although ulimit reports "unlimited"
> for all relevant limits. I tried creating a single global PD and a
> single MR to be shared among the multiple connections, but
> rdma_create_qp() fails with an invalid argument when I try to do that.
> I therefore deduce that the PD specified in rdma_create_qp() must
> correspond to an active connection, not simply be created by opening a
> device.
> 
> Long question short: is there any way I can share the same MR among
> multiple clients, so that my shared memory region is limited to N
> bytes instead of N/C (for C clients) bytes?

Christopher,

Yes, it is possible. You have to use the same PD for all QPs/connections. We do this in CCI when using the Verbs transport.
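
The per-connection half looks something like this (a simplified sketch
of the idea, not the actual CCI source; shared_pd, shared_cq, and
conn_id are placeholders):

    /* On each connect request, create the new QP in the one shared
     * PD instead of allocating a per-connection PD. */
    struct ibv_qp_init_attr qp_attr;

    memset(&qp_attr, 0, sizeof(qp_attr));
    qp_attr.send_cq = shared_cq;
    qp_attr.recv_cq = shared_cq;
    qp_attr.cap.max_send_wr = qp_attr.cap.max_recv_wr = 16;
    qp_attr.cap.max_send_sge = qp_attr.cap.max_recv_sge = 1;
    qp_attr.qp_type = IBV_QPT_RC;

    if (rdma_create_qp(conn_id, shared_pd, &qp_attr))
        perror("rdma_create_qp");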

Scott

* RE: Sharing MR Between Multiple Connections
From: Hefty, Sean @ 2012-11-14 17:13 UTC
  To: Christopher Mitchell, linux-rdma@vger.kernel.org

> I am working on building an InfiniBand application with a server that
> can handle many simultaneous clients. The server exposes a chunk of
> memory that each of the clients can read via RDMA. I was previously
> creating a new MR on the server for each client (and of course in that
> connection's PD). However, under stress testing, I realized that
> ibv_reg_mr() started failing after I had simultaneously registered the
> same area enough times to cover 20.0 GB. I presume that the problem
> is reaching some pinning limit, although ulimit reports "unlimited"
> for all relevant limits. I tried creating a single global PD and a
> single MR to be shared among the multiple connections, but
> rdma_create_qp() fails with an invalid argument when I try to do that.
> I therefore deduce that the PD specified in rdma_create_qp() must
> correspond to an active connection, not simply be created by opening a
> device.

The rdma_cm will automatically allocate one PD per RDMA device.  You
can share this PD among multiple connections.  To use this PD, pass
NULL as the PD argument to rdma_create_qp().  The rdma_cm_id will
reference the shared PD.
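
In sketch form, with the usual qp_attr setup (cq and the cap sizes are
placeholders):

    /* NULL PD: the rdma_cm supplies its internal per-device PD,
     * shared by every rdma_cm_id on the same device. */
    if (rdma_create_qp(id, NULL, &qp_attr))
        perror("rdma_create_qp");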
 
> Long question short: is there any way I can share the same MR among
> multiple clients, so that my shared memory region is limited to N
> bytes instead of N/C (for C clients) bytes?

Yes, but if there are multiple devices being used, then you'll need to track those registrations separately.

- Sean

* Re: Sharing MR Between Multiple Connections
From: Christopher Mitchell @ 2012-11-14 18:05 UTC
  Cc: linux-rdma@vger.kernel.org

On Wed, Nov 14, 2012 at 12:13 PM, Hefty, Sean wrote:

> The rdma_cm will automatically allocate one PD per RDMA device.  You
> can share this PD among multiple connections.  To use this PD, pass
> NULL as the PD argument to rdma_create_qp().  The rdma_cm_id will
> reference the shared PD.

That's great to know; thanks Sean (and everyone else who confirmed
that a shared PD and possibly CQ is the way to go). Is there a way to
reference this shared PD in ibv_reg_mr() as well? It seems that using
NULL there too is not the solution, and I'm having difficulty tracing
back from struct rdma_cm_id to find out where that shared PD is
stored. I'd prefer to use this solution for simplicity's sake, so any
additional details would be greatly appreciated. For what it's worth,
I tried passing NULL to rdma_create_qp() as you suggested, and found
it to work admirably.

>> Long question short: is there any way I can share the same MR among
>> multiple clients, so that my shared memory region is limited to N
>> bytes instead of N/C (for C clients) bytes?
>
> Yes, but if there are multiple devices being used, then you'll need to track those registrations separately.

That's reasonable.

Thanks,
Christopher

* RE: Sharing MR Between Multiple Connections
From: Hefty, Sean @ 2012-11-14 18:19 UTC
  To: Christopher Mitchell; +Cc: linux-rdma@vger.kernel.org

> > The rdma_cm will automatically allocate one PD per RDMA device.  You
> > can share this PD among multiple connections.  To use this PD, pass
> > NULL as the PD argument to rdma_create_qp().  The rdma_cm_id will
> > reference the shared PD.
> 
> That's great to know; thanks Sean (and everyone else who confirmed
> that a shared PD and possibly CQ is the way to go). Is there a way to
> reference this shared PD in ibv_reg_mr() as well? It seems that using
> NULL there too is not the solution, and I'm having difficulty tracing
> back from struct rdma_cm_id to find out where that shared PD is
> stored. I'd prefer to use this solution for simplicity's sake, so any
> additional details would be greatly appreciated. For what it's worth,
> I tried passing NULL to rdma_create_qp() as you suggested, and found
> it to work admirably.

struct rdma_cm_id *id;

/* id->pd is the PD that the rdma_cm allocated for the id's device --
 * the same PD used when rdma_create_qp() was called with a NULL PD. */
ibv_reg_mr(id->pd, ...);

* Re: Sharing MR Between Multiple Connections
From: Christopher Mitchell @ 2012-11-15  0:06 UTC
  To: linux-rdma@vger.kernel.org

On Wed, Nov 14, 2012 at 1:19 PM, Hefty, Sean wrote:
>> > The rdma_cm will automatically allocate one PD per RDMA device.  You
>> > can share this PD among multiple connections.  To use this PD, pass
>> > NULL as the PD argument to rdma_create_qp().  The rdma_cm_id will
>> > reference the shared PD.
>>
>> That's great to know; thanks Sean (and everyone else who confirmed
>> that a shared PD and possibly CQ is the way to go). Is there a way to
>> reference this shared PD in ibv_reg_mr() as well? It seems that using
>> NULL there too is not the solution, and I'm having difficulty tracing
>> back from struct rdma_cm_id to find out where that shared PD is
>> stored. I'd prefer to use this solution for simplicity's sake, so any
>> additional details would be greatly appreciated. For what it's worth,
>> I tried passing NULL to rdma_create_qp() as you suggested, and found
>> it to work admirably.
>
> struct rdma_cm_id *id;
>
> ibv_reg_mr(id->pd, ...);

It actually turned out to be id->qp->pd for me for some reason; the
rdma_cm_id struct on my machine has no pd member. But other than that,
all is working great. Thanks again.
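
In case it helps anyone finding this thread later, the sequence that
works for me looks roughly like this (a sketch; qp_attr, buf, and len
are placeholders):

    /* Create the QP with the rdma_cm's internal PD... */
    rdma_create_qp(id, NULL, &qp_attr);

    /* ...then reach that shared PD through the QP, since this
     * rdma_cm_id has no pd member to read it from directly. */
    struct ibv_mr *mr = ibv_reg_mr(id->qp->pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ);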

Christopher
