* IPoIB multiqueue support?
@ 2010-05-10 21:08 Christoph Lameter
       [not found] ` <alpine.DEB.2.00.1005101607580.17916-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2010-05-10 21:08 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

I see that some IB NICs can do multiqueue in Ethernet mode.

Is there any work on multiqueue support for IPoIB going on?

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: IPoIB multiqueue support?
       [not found] ` <alpine.DEB.2.00.1005101607580.17916-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
@ 2010-05-10 21:12   ` Roland Dreier
       [not found]     ` <adawrvbxxyd.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Roland Dreier @ 2010-05-10 21:12 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

 > Is there any work on multiqueue support for IPoIB going on?

No, although one could view connected mode as an even better place to
start, since you already get perfect classification by remote peer for
free.

 - R.
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

* Re: IPoIB multiqueue support?
       [not found]     ` <adawrvbxxyd.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-05-11 13:53       ` Christoph Lameter
       [not found]         ` <alpine.DEB.2.00.1005110851530.1500-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2010-05-11 13:53 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, 10 May 2010, Roland Dreier wrote:

>  > Is there any work on multiqueue support for IPoIB going on?
>
> No, although one could view connected mode as an even better place to
> start, since you already get perfect classification by remote peer for
> free.

I am mostly interested in multicast traffic. Connected mode is not
relevant to that usage scenario.





* Re: IPoIB multiqueue support?
       [not found]         ` <alpine.DEB.2.00.1005110851530.1500-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
@ 2010-05-11 15:56           ` Roland Dreier
       [not found]             ` <adawrvawhyc.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Roland Dreier @ 2010-05-11 15:56 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

 > I am mostly interested in multicast traffic. Connected mode is not
 > relevant to that usage scenario.

As I said, I don't think anyone is working on it.  However it wouldn't
be that hard to get something pretty good for multicast, since the
InfiniBand multicast join mechanism would let you have essentially a
perfect filter for steering individual multicast groups to whichever QP
(ring) you wanted to.

Of course you could also implement the equivalent thing in userspace and
probably get even better performance.

 - R.
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

* Re: IPoIB multiqueue support?
       [not found]             ` <adawrvawhyc.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-05-11 20:17               ` Christoph Lameter
       [not found]                 ` <alpine.DEB.2.00.1005111516500.1500-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2010-05-11 20:17 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, 11 May 2010, Roland Dreier wrote:

>  > I am mostly interested in multicast traffic. Connected mode is not
>  > relevant to that usage scenario.
>
> As I said, I don't think anyone is working on it.  However it wouldn't
> be that hard to get something pretty good for multicast, since the
> InfiniBand multicast join mechanism would let you have essentially a
> perfect filter for steering individual multicast groups to whichever QP
> (ring) you wanted to.

Right but then would each individual QP need its own IP address?

> Of course you could also implement the equivalent thing in userspace and
> probably get even better performance.

Start a QP listening to IPoIB mc traffic?


* Re: IPoIB multiqueue support?
       [not found]                 ` <alpine.DEB.2.00.1005111516500.1500-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
@ 2010-05-11 20:43                   ` Jason Gunthorpe
       [not found]                     ` <20100511204358.GQ15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-05-11 21:52                   ` Roland Dreier
  1 sibling, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2010-05-11 20:43 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, May 11, 2010 at 03:17:36PM -0500, Christoph Lameter wrote:
> On Tue, 11 May 2010, Roland Dreier wrote:
> 
> >  > I am mostly interested in multicast traffic. Connected mode is not
> >  > relevant to that usage scenario.
> >
> > As I said, I don't think anyone is working on it.  However it wouldn't
> > be that hard to get something pretty good for multicast, since the
> > InfiniBand multicast join mechanism would let you have essentially a
> > perfect filter for steering individual multicast groups to whichever QP
> > (ring) you wanted to.
> 
> Right but then would each individual QP need its own IP address?

I think Roland means that each IP multicast address is mapped into an
IB multicast GID, and you can bind a QP to a set of MGIDs. Right now
the driver binds all MGIDs to the rx QP and basically ignores the
MGID on receive.
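
For reference, the IP-multicast-to-MGID mapping described above is specified in RFC 4391 for IPv4. The following is a minimal Python sketch of that mapping, assuming the default P_Key 0xffff and link-local scope; the function name is hypothetical and is not part of the IPoIB driver or any library.

```python
import ipaddress

IPV4_SIGNATURE = 0x401B  # RFC 4391 signature identifying IPv4 over InfiniBand


def ipv4_mcast_to_mgid(group, pkey=0xFFFF, scope=0x2):
    """Return the 16-byte IB multicast GID for an IPv4 multicast group.

    Layout (RFC 4391): 0xFF | flags+scope | signature | P_Key | group ID,
    where the low 28 bits of the IPv4 address occupy the low 28 bits of the
    MGID. The pkey and scope defaults are assumptions for illustration.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")

    mgid = bytearray(16)
    mgid[0] = 0xFF
    mgid[1] = 0x10 | (scope & 0x0F)            # flags = 1 (transient), then scope
    mgid[2:4] = IPV4_SIGNATURE.to_bytes(2, "big")
    mgid[4:6] = (pkey & 0xFFFF).to_bytes(2, "big")
    mgid[12:16] = (int(addr) & 0x0FFFFFFF).to_bytes(4, "big")
    return bytes(mgid)


if __name__ == "__main__":
    print(ipv4_mcast_to_mgid("239.1.2.3").hex())
```

Because the mapping is a pure function of the group address and the P_Key, every host on the partition derives the same MGID for the same IP multicast group.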

To go multi-queue you'd create multiple QPs and spread the MGID binds
amongst them.
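
A rough sketch of what "spreading the MGID binds" could look like, in Python. This only computes an assignment of MGIDs to receive QPs; the round-robin policy and the function name are assumptions made for illustration, and nothing like this exists in the driver today. A real implementation would then attach each QP to its MGIDs through the verbs multicast-attach call.

```python
def spread_mgids(mgids, num_qps):
    """Assign each MGID to exactly one of num_qps receive QPs, round-robin.

    mgids:   iterable of 16-byte MGIDs (duplicates are ignored)
    num_qps: number of receive QPs (rings) to spread the binds across
    Returns a dict mapping QP index -> list of MGIDs bound to that QP.
    """
    if num_qps < 1:
        raise ValueError("need at least one QP")

    binds = {qp: [] for qp in range(num_qps)}
    # Sort so the assignment is deterministic for a given set of groups.
    for i, mgid in enumerate(sorted(set(mgids))):
        binds[i % num_qps].append(mgid)
    return binds


if __name__ == "__main__":
    groups = [bytes([0xFF, 0x12]) + bytes(13) + bytes([n]) for n in range(5)]
    for qp, bound in spread_mgids(groups, 2).items():
        print(qp, [g.hex() for g in bound])
```

Since each MGID is bound to one QP only, the HCA's multicast delivery does the classification and no software demultiplexing by group is needed on receive.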

> > Of course you could also implement the equivalent thing in userspace and
> > probably get even better performance.
> 
> Start a QP listening to IPoIB mc traffic?

Yes, but the downside is that if you rely on the kernel to do the group
join then the HCA will send the packet to user space and the kernel QP.

Some of the weird features in the RDMA CM seem to be for supporting
this..

Jason

* Re: IPoIB multiqueue support?
       [not found]                     ` <20100511204358.GQ15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-05-11 20:50                       ` Christoph Lameter
       [not found]                         ` <alpine.DEB.2.00.1005111548450.8388-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2010-05-11 20:50 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, 11 May 2010, Jason Gunthorpe wrote:

> > Right but then would each individual QP need its own IP address?
>
> I think Roland means that each IP multicast address is mapped into an
> IB multicast GID, and you can bind a QP to a set of MGIDs. Right now
> the driver binds all MGIDs to the rx QP and basically ignores the
> MGID on receive.

Aha.

> To go multi-queue you'd create multiple QPs and spread the MGID binds
> amongst them.

It would be best to bind them to the QP of the local processor (assuming
that the process continues to run on that processor).

What about unicast traffic? One QP gets all unicast?

> Yes, but the downside is that if you rely on the kernel to do the group
> join then the HCA will send the packet to user space and the kernel QP.
>
> Some of the weird features in the RDMA CM seem to be for supporting
> this..

The UMCAST flag can stop the kernel from processing the IGMP reply.


* Re: IPoIB multiqueue support?
       [not found]                         ` <alpine.DEB.2.00.1005111548450.8388-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
@ 2010-05-11 20:58                           ` Jason Gunthorpe
       [not found]                             ` <20100511205808.GS15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2010-05-11 20:58 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, May 11, 2010 at 03:50:35PM -0500, Christoph Lameter wrote:

> > To go multi-queue you'd create multiple QPs and spread the MGID binds
> > amongst them.
> 
> It would be best to bind them to the QP of the local processor (assuming
> that the process continues to run on that processor).

Yes
 
> What about unicast traffic? One QP gets all unicast?

IIRC, there are some HCA-specific features for spreading traffic
amongst QPs using a hash of the IP/TCP/UDP headers. I don't know
anything about them though.
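
To illustrate the general idea of header-hash steering (not any particular HCA's implementation, which is not described here), here is a small Python sketch. It uses CRC32 purely as a stand-in hash; real adapters typically use their own hash function and configuration, and the function name is hypothetical.

```python
import ipaddress
import struct
import zlib


def pick_qp(src_ip, dst_ip, proto, src_port, dst_port, num_qps):
    """Map an IPv4 flow 5-tuple to a QP index in [0, num_qps).

    CRC32 is an assumption chosen for simplicity; it stands in for whatever
    hash a given adapter implements.
    """
    if num_qps < 1:
        raise ValueError("need at least one QP")
    key = struct.pack(
        "!4s4sBHH",
        ipaddress.IPv4Address(src_ip).packed,
        ipaddress.IPv4Address(dst_ip).packed,
        proto,
        src_port,
        dst_port,
    )
    return zlib.crc32(key) % num_qps


if __name__ == "__main__":
    # Protocol 17 is UDP; all packets of this flow map to the same QP.
    print(pick_qp("10.0.0.1", "10.0.0.2", 17, 5000, 6000, 4))
```

The property that matters is that every packet of a flow hashes to the same QP, so per-flow ordering is preserved while different flows spread across rings.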

Within the standard IB functionality the best you could do is to
create a whack of QPs and then return different QPNs in your ARP
replies. Though this is very limited and probably not worth doing.
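
The ARP trick works because the IPoIB link-layer address carried in ARP is 20 octets per RFC 4391: one reserved/flags octet, a 24-bit QPN, and the 128-bit port GID. A minimal Python sketch of building such an address and choosing a QPN per requesting peer follows; the per-peer CRC32 selection policy and both function names are assumptions for illustration only.

```python
import zlib


def ipoib_hw_addr(qpn, gid, flags=0):
    """Build a 20-byte IPoIB link-layer address: flags | 24-bit QPN | GID."""
    if not 0 <= qpn < (1 << 24):
        raise ValueError("QPN must fit in 24 bits")
    if len(gid) != 16:
        raise ValueError("GID must be 16 bytes")
    return bytes([flags & 0xFF]) + qpn.to_bytes(3, "big") + bytes(gid)


def qpn_for_peer(peer_gid, qpns):
    """Pick which local QPN to advertise to a given peer (hypothetical policy)."""
    if not qpns:
        raise ValueError("no QPs available")
    return qpns[zlib.crc32(bytes(peer_gid)) % len(qpns)]


if __name__ == "__main__":
    local_gid = bytes.fromhex("fe80000000000000" "0002c90300001234")
    peer_gid = bytes.fromhex("fe80000000000000" "0002c90300005678")
    qpn = qpn_for_peer(peer_gid, [0x100, 0x101, 0x102, 0x103])
    print(ipoib_hw_addr(qpn, local_gid).hex())
```

This shows the limitation as well: the split is per remote peer, fixed at ARP resolution time, so a single busy peer still lands on one QP.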
 
> > Yes, but the downside is that if you rely on the kernel to do the group
> > join then the HCA will send the packet to user space and the kernel QP.
> >
> > Some of the weird features in the RDMA CM seem to be for supporting
> > this..
> 
> The UMCAST flag can stop the kernel from processing the IGMP reply.

I'm not talking about IGMP, but the IB version of IGMP, the kernel
joins the group in IB land and also attaches the IPOIB QP. This can
all be faked out in userspace, but it isn't entirely straightforward.

Jason

* Re: IPoIB multiqueue support?
       [not found]                 ` <alpine.DEB.2.00.1005111516500.1500-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
  2010-05-11 20:43                   ` Jason Gunthorpe
@ 2010-05-11 21:52                   ` Roland Dreier
  1 sibling, 0 replies; 10+ messages in thread
From: Roland Dreier @ 2010-05-11 21:52 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

 > > As I said, I don't think anyone is working on it.  However it wouldn't
 > > be that hard to get something pretty good for multicast, since the
 > > InfiniBand multicast join mechanism would let you have essentially a
 > > perfect filter for steering individual multicast groups to whichever QP
 > > (ring) you wanted to.

 > Right but then would each individual QP need its own IP address?

No, that's the beauty of multicast -- you just join the multicast group
for a given IP address, and you get the traffic.  Doing multiqueue for
unicast traffic would require some form of flow steering (RSS) in the
adapter (which some have).

 > > Of course you could also implement the equivalent thing in userspace and
 > > probably get even better performance.

 > Start a QP listening to IPoIB mc traffic?

Yes, I believe there have been several proprietary userspace libraries
that essentially do UDP IPoIB multicast in userspace.  There are a few
funky hooks in the IPoIB driver related to that IIRC.

 - R.
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

* Re: IPoIB multiqueue support?
       [not found]                             ` <20100511205808.GS15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-05-11 22:11                               ` Christoph Lameter
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2010-05-11 22:11 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, 11 May 2010, Jason Gunthorpe wrote:

> > The UMCAST flag can stop the kernel from processing the IGMP reply.
>
> I'm not talking about IGMP, but the IB version of IGMP, the kernel
> joins the group in IB land and also attaches the IPOIB QP. This can
> all be faked out in userspace, but it isn't entirely straightforward.

Yes, vendors use the UMCAST flag to avoid this.


