* Re: QoS in local SA entity
From: Or Gerlitz @ 2009-11-05 12:07 UTC
To: Sean Hefty, Roland Dreier; +Cc: linux-rdma, Jason Gunthorpe

> I think this really needs to be discussed wrt the implementation of the
> entity providing the path records.

Fair enough, let's do it then...

> I think what's needed is a way for the SA to distribute QoS information
> to the end nodes, so that the decisions can be made locally. If someone
> wants some sort of dynamic QoS management and is happy using a small
> cluster, then they can disable any local SA entities and contact the SA
> directly.

I believe we can also take a middle way, where the SA isn't contacted
directly with a path query for each resolution, but rather "indirectly",
e.g. using a dedicated multicast-based protocol.

> In the case of ACM, the pkey is embedded in the MGID. 'Something' could
> tell the SA to create ACM multicast groups using a specific SL for a
> given MGID or pkey in the join request. That SL would be distributed to
> the end nodes when they joined their groups.

So assuming ACM supports AF_INET, using a network stack route lookup on
the destination address / rdma_bind on the source address, etc. as we
discussed, ACM can use the rdma-cm to resolve the pkey, then use this
pkey in the MGID, and management software could tell the SA to use a
specific SL for MGIDs on this partition. Next, ACM can use this SL in
the path it generates for the IB connection. Makes sense?

> The entity that provides the path records cannot depend on calling into
> the librdmacm. The dependency needs to go the other way.

I understand that you want as few dependencies as possible, but I
believe that my suggestion doesn't contradict your design but rather
enhances it.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: QoS in local SA entity
From: Sean Hefty @ 2009-11-05 16:40 UTC
To: 'Or Gerlitz', Roland Dreier; +Cc: linux-rdma, Jason Gunthorpe

>I believe we can go also on a middle way, where the SA isn't contacted
>directly using path query for each resolution, but rather "indirectly"
>e.g using a dedicated multicast based protocol.

Yes - I wasn't trying to limit how the SA could 'distribute' QoS
information to the end nodes. ACM will obtain QoS information from the SA
when it joins its multicast groups.

>So assuming ACM supports AF_INET, using network stack route lookup on
>the destination address / rdma_bind on the source address, etc as we
>discussed, ACM can use the rdma-cm to resolve the pkey, then use this
>pkey in the MGID and a management software could tell the SA to use a
>specific SL for MGIDs on this partition. Next, ACM can use this SL in
>the path it generates for the IB connection, makes sense?

ACM is intended to be a service that's used by the librdmacm to resolve
address mappings and routes. Trying to have ACM use the librdmacm ends up
with a circular dependency. That's the part I'm trying to avoid.

ACM uses address mappings as defined in an address configuration file
(IP -> device, port, pkey). The address file can be created using the
provided ib_acme utility, which uses the current system configuration (in
an ugly way, but it works). I think this provides QoS behavior similar to
what you're describing.

At some future point, the ib_acm service can be merged with ib_acme to
respond to dynamic changes in ipoib address mappings, but that's a
non-trivial amount of work and involves changes to the ACM multicast
groups.
- Sean
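
To make the (IP -> device, port, pkey) mapping above concrete, here is a
minimal C sketch of reading one entry of such an address file. The line
layout used here (whitespace-separated address, device, port, pkey in
hex) and the names acm_addr_entry and parse_addr_line are assumptions
made for illustration only; they are not taken from the ib_acm sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical in-memory form of one address-file entry. */
struct acm_addr_entry {
    char ip[64];      /* local IP address, kept as text */
    char device[32];  /* device name, e.g. "mlx4_0" (illustrative) */
    unsigned port;    /* port number on the device */
    unsigned pkey;    /* partition key, parsed from hex */
};

/*
 * Parse one line of the ASSUMED form "<ip> <device> <port> <pkey-hex>".
 * Returns 0 on success, -1 for blank lines, '#' comments, or bad input.
 */
int parse_addr_line(const char *line, struct acm_addr_entry *out)
{
    char pkey_text[16];
    char *end;
    unsigned long value;

    assert(line != NULL && out != NULL);

    /* Skip leading blanks, then ignore empty lines and comments. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '\0' || *line == '\n' || *line == '#')
        return -1;

    /* All four fields must be present. */
    if (sscanf(line, "%63s %31s %u %15s",
               out->ip, out->device, &out->port, pkey_text) != 4)
        return -1;

    /* A pkey is 16 bits wide; reject trailing junk and larger values. */
    value = strtoul(pkey_text, &end, 16);
    if (*end != '\0' || value > 0xffff)
        return -1;
    out->pkey = (unsigned)value;
    return 0;
}
```

A service holding a table of such entries would know which local
addresses it may answer for, which is the role the address file plays in
the description above.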
* Re: QoS in local SA entity
From: Or Gerlitz @ 2009-11-08 6:25 UTC
To: Sean Hefty; +Cc: linux-rdma

Sean Hefty wrote:
> I wasn't trying to limit how the SA could 'distribute' QoS information
> to the end nodes. ACM will obtain QoS information from the SA when it
> joins its multicast groups

Excellent... still, this depends on how the ACM MGIDs are constructed;
I'll take a look at the code.

> ACM is intended to be a service that's used by the librdmacm to resolve
> address mappings and routes. Trying to have ACM use the librdmacm ends
> up with a circular dependency. That's the part I'm trying to avoid.

Fair enough. I believe that my suggestion is doable also without a
circular dependency, e.g. as you indicated below or with a fairly small
enhancement of librdmacm, see next.

> ACM uses address mappings as defined in an address configuration file
> (IP -> device, port, pkey). The address file can be created using the
> provided ib_acme utility, which uses the current system configuration
> (in an ugly way, but it works). I think this provides QoS behavior
> similar to what you're describing

I assume you are referring to an IP local to the system where ACM runs,
correct? This would work well for applications calling rdma_bind and/or
rdma_resolve_addr while specifying a source address. To also support
applications which do neither of these two, that is, which call
rdma_resolve_addr with a destination address only, I suggest enhancing
the librdmacm-calling-ACM flow to resolve the source address using a
route lookup from user space. Next, librdmacm can issue rdma_bind on
behalf of this ID, and you have the <device, port, pkey> triplet at
hand, so now the ACM call can be made from librdmacm.

Writing this, I realized that this had better (should) be done also for
apps calling rdma_resolve_addr with the source IP specified. This way
you have a unified flow for the ACM use in librdmacm for any of apps
A, B, C below:

A.1 rdma_bind(src=X)
A.2 rdma_resolve_addr(src=null, dst=Y)

B.1 rdma_resolve_addr(src=null, dst=Y)

C.1 rdma_resolve_addr(src=X, dst=Y)

where the librdmacm calling-ACM flow is

L1. compute source address
L2. issue kernel rdma_bind to source address and resolve
    <device, port, pkey>
L3. issue ACM address (DGID) resolution call using
    (<device, port, pkey>, dest-ip)

Makes sense? If yes, what's the need for the address configuration file?

Or.
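
The L1/L2/L3 flow proposed above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions: the
resolve_ops hooks, the local_endpoint struct, and the demo_* stand-ins
are hypothetical names invented for this sketch, and none of them are
librdmacm or ACM APIs. Real code would back the hooks with a routing
table lookup, the kernel rdma-cm bind, and the DGID provider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical <device, port, pkey> triplet obtained by binding. */
struct local_endpoint {
    char device[32];
    unsigned port;
    unsigned pkey;
};

/* Hypothetical hooks standing in for the real lookup, bind and provider. */
struct resolve_ops {
    int (*route_lookup)(const char *dst_ip, char *src_ip, size_t len);
    int (*bind_src)(const char *src_ip, struct local_endpoint *ep);
    int (*resolve_dgid)(const struct local_endpoint *ep, const char *dst_ip,
                        unsigned char dgid[16]);
};

/* One flow for apps A, B and C: src_ip is NULL when the app gave none. */
int unified_resolve(const struct resolve_ops *ops, const char *src_ip,
                    const char *dst_ip, unsigned char dgid[16])
{
    char src[64];
    struct local_endpoint ep;

    assert(ops != NULL && dst_ip != NULL && dgid != NULL);

    /* L1: compute the source address unless the app supplied one. */
    if (src_ip != NULL) {
        int n = snprintf(src, sizeof(src), "%s", src_ip);
        if (n < 0 || (size_t)n >= sizeof(src))
            return -1;
    } else if (ops->route_lookup(dst_ip, src, sizeof(src)) != 0) {
        return -1;
    }

    /* L2: bind to the source and learn <device, port, pkey>. */
    if (ops->bind_src(src, &ep) != 0)
        return -1;

    /* L3: ask the DGID provider using the triplet and the dest IP. */
    return ops->resolve_dgid(&ep, dst_ip, dgid);
}

/* Stand-in hooks with fixed answers so the sketch runs on its own. */
static int demo_route_lookup(const char *dst_ip, char *src_ip, size_t len)
{
    int n = snprintf(src_ip, len, "%s", "192.0.2.10");
    (void)dst_ip;
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

static int demo_bind_src(const char *src_ip, struct local_endpoint *ep)
{
    (void)src_ip;
    snprintf(ep->device, sizeof(ep->device), "%s", "dev0");
    ep->port = 1;
    ep->pkey = 0xffff;
    return 0;
}

static int demo_resolve_dgid(const struct local_endpoint *ep,
                             const char *dst_ip, unsigned char dgid[16])
{
    (void)dst_ip;
    memset(dgid, 0, 16);
    dgid[0] = 0xfe;                      /* made-up GID prefix bytes */
    dgid[1] = 0x80;
    dgid[15] = (unsigned char)ep->port;  /* made-up last byte */
    return 0;
}

const struct resolve_ops demo_ops = {
    demo_route_lookup, demo_bind_src, demo_resolve_dgid
};
```

Cases A and C would pass the bound or specified source as src_ip, while
case B would pass NULL and rely on the route lookup in step L1.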
* Re: QoS in local SA entity
From: Jason Gunthorpe @ 2009-11-09 0:56 UTC
To: Or Gerlitz; +Cc: Sean Hefty, linux-rdma

On Sun, Nov 08, 2009 at 08:25:55AM +0200, Or Gerlitz wrote:
> >ACM is intended to be a service that's used by the librdmacm to resolve
> >address mappings and routes. Trying to have ACM use the librdmacm ends
> >up with a circular dependency. That's the part I'm trying to avoid.
>
> Fair enough, I believe that my suggestion is doable also without
> circular dependency, e.g as you indicated below or with a fairly small
> enhancement of librdmacm, see next

The entire point of the rdma_getaddrinfo + AF_IB is to avoid hacking up
librdmacm for every address lookup/cache scheme someone invents.

The desired flow would be:

  rdma_getaddrinfo("User-Specified-Host-String",
                   "User-Specified-Port-String",
                   &hints, &res);

  // Server flow (hints.af_flags |= AI_PASSIVE)
  rdma_bind(res[0].bind_addr);
  rdma_listen(res[0].listen_addr);

  // Client flow
  rdma_bind(res[0].bind_addr); // Optional
  rdma_resolve_addr2(res[0].bind_addr, res[0].dest_addr,
                     res[0].extra_info);

And under rdma_getaddrinfo we could have any number of modules, like
glibc does.

Well written apps should already be using normal getaddrinfo, so we
can design an upgrade to rdma_getaddrinfo to be very minor, source
wise. Un-upgraded apps don't get the new functionality.

Jason
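
The idea of having "any number of modules" under a single lookup call
can be pictured as a chain of resolvers tried in order, loosely in the
spirit of how glibc stacks name service backends. The C sketch below is
an illustration only; lookup_module, chain_lookup and the two demo
modules are hypothetical names, not part of librdmacm or of any
proposed rdma_getaddrinfo implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup result. */
struct lookup_result {
    const char *module;  /* name of the module that produced the answer */
    char dest[64];       /* resolved destination, kept as text here */
};

/* A module returns 0 if it can answer for this host string, else -1. */
typedef int (*lookup_fn)(const char *host, struct lookup_result *res);

struct lookup_module {
    const char *name;
    lookup_fn lookup;
};

/* Try each module in order; the first one that answers wins. */
int chain_lookup(const struct lookup_module *mods, size_t count,
                 const char *host, struct lookup_result *res)
{
    assert(mods != NULL && host != NULL && res != NULL);

    for (size_t i = 0; i < count; i++) {
        if (mods[i].lookup(host, res) == 0) {
            res->module = mods[i].name;
            return 0;
        }
    }
    return -1;
}

/* Demo module 1: a tiny static cache that only knows "node1". */
static int cache_lookup(const char *host, struct lookup_result *res)
{
    if (strcmp(host, "node1") != 0)
        return -1;
    snprintf(res->dest, sizeof(res->dest), "%s", "192.0.2.1");
    return 0;
}

/* Demo module 2: a catch-all that passes the host string through. */
static int passthrough_lookup(const char *host, struct lookup_result *res)
{
    int n = snprintf(res->dest, sizeof(res->dest), "%s", host);
    return (n < 0 || (size_t)n >= sizeof(res->dest)) ? -1 : 0;
}

const struct lookup_module demo_modules[] = {
    { "cache", cache_lookup },
    { "passthrough", passthrough_lookup },
};
```

With such a table, adding a new lookup or cache scheme means adding one
entry rather than changing the callers.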
* Re: QoS in local SA entity
From: Or Gerlitz @ 2009-11-09 7:44 UTC
To: Jason Gunthorpe; +Cc: Sean Hefty, linux-rdma

Jason Gunthorpe wrote:
> The entire point of the rdma_getaddrinfo + AF_IB is to avoid hacking up
> librdmacm for every address lookup/cache scheme someone invents

The simple point I am trying to make is that rdma_getaddrinfo + AF_INET
is doable, is simple, and is needed to keep up the essence of the
rdma-cm. I don't see how AF_IB buys anything for anyone, but if you want
to push it, then as long as AF_INET comes first and remains the most
supported/interoperable option, present and future, go ahead and add
your bits.

As you indicated, the route lookup I was mentioning could be done in
rdma_getaddrinfo, sure, with &res including both source and destination
addresses. No rdma_resolve_addr2 is needed: the one that exists now
already takes a source address. I don't see that extra info is needed
for an AF_INET address that was resolved with rdma_getaddrinfo; is this
AF_IB specific?

I don't see why the app should bother calling rdma_getaddrinfo; it can
be done by librdmacm, with rdma_getaddrinfo having multiple modules as
you suggested. I am in favor of the approach suggested by Sean, of
librdmacm either doing its native flow or, under an environment
variable, doing an alternative flow. Your suggestion not to have the
second flow tightly coupled with ACM, e.g. by using the get_addrinfo
abstraction and friends, makes sense (yes!).

Or.
* Re: QoS in local SA entity
From: Jason Gunthorpe @ 2009-11-09 8:08 UTC
To: Or Gerlitz; +Cc: Sean Hefty, linux-rdma

On Mon, Nov 09, 2009 at 09:44:31AM +0200, Or Gerlitz wrote:
> No rdma_resolve_addr2 is needed the one that exists now has
> source addresses specified, I don't see that extra info is needed for
> AF_INET that was resolved with rdma_getaddrinfo is this AF_IB specific?

The extra info in rdma_resolve_addr2 carries the IB specific path
information from the rdma_getaddrinfo module to the kernel for the
address pair.

The entire purpose of AF_IB is to let user space tell the kernel it does
not want a kernel side ND and PR query; instead, user space will provide
all the information.

Think of it this way: ACM takes over the entire process of what AF_INET
does in the kernel. AF_INET talks directly to the IB CM module in the
kernel. Thus, it also makes sense that ACM would need to talk to IB CM
directly as well. AF_IB is that direct connection.

> I don't see why the app should bother on calling rdma_getaddrinfo, it
> can be done by librdmacm with rdma_getaddrinfo having multiple modules
> as you suggested. I am in favor of the approach suggested by Sean of
> librdmacm either doing its native flow or under environment variable
> doing an alternative flow, where your suggestion not to have the 2nd
> flow being tightly coupled with ACM, e.g through using get_addrinfo
> abstraction and friends makes sense (yes!)

I don't entirely understand this paragraph, but the point of a string
based rdma_getaddrinfo is exactly the same point as for IP - strings may
have different meaning and may encode richer information than a simple
sock addr (eg normal getaddrinfo can determine AF_INET, AF_INET6, and
AF_UNIX depending on the form of the string).

For instance it might make sense to trigger/disable the ACM method with
a special string based indicator.

Jason
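
As a small illustration of a string selecting the address family, the
following C sketch asks the standard POSIX getaddrinfo() which family it
infers from a numeric host string. It only demonstrates the AF_INET and
AF_INET6 cases with AI_NUMERICHOST (so no name service is consulted);
family_of is a helper name made up for this sketch, and nothing here
involves rdma_getaddrinfo or ACM.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Return the address family getaddrinfo() infers from a numeric host
 * string, or -1 if the string is not a numeric address.
 */
int family_of(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int family;

    assert(host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* let the string decide */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socktype */
    hints.ai_flags = AI_NUMERICHOST;  /* parse only, never resolve names */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    family = res->ai_family;
    freeaddrinfo(res);
    return family;
}
```

A string based indicator for a lookup method could, in principle, be
handled the same way: the form of the string is inspected first and the
result decides which code path is taken.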
* Re: QoS in local SA entity
From: Or Gerlitz @ 2009-11-10 5:51 UTC
To: Jason Gunthorpe; +Cc: Sean Hefty, linux-rdma

Jason Gunthorpe wrote:
> The extra info in rdma_resolve_addr2 carries the IB specific path
> information from the rdma_getaddrinfo module to the kernel for the
> address pair. The entire purpose of AF_IB is to let user space tell the
> kernel it does not want a kernel side ND and PR query, instead user
> space will provide all the information.

The kernel patches posted by Sean replace the ND/PR flow with a two-step
process: first specifying a DGID to the kernel, then specifying a PATH.
My suggestion is to have a librdmacm-initiated bind before sending the
DGID to the kernel. This way AF_INET would be supported perfectly, under
the slight limitation that the source address <device, port, pkey> tuple
would be chosen by route lookup and not by the neigh->dev that was
resolved by the kernel ND. This applies only when the modified flow of
librdmacm is taken (e.g. under user specification with an environment
variable, etc.).

--If-- on top of that you want to add AF_IB, we may be able to do that,
but I don't see why the whole thing should be made for AF_IB only.

> Think of it this way, ACM takes over the entire process of what AF_INET
> does in the kernel. AF_INET talks directly to the IB CM module in the
> kernel. Thus, it also makes sense that ACM would need to talk to IB CM
> directly as well. AF_IB is that direct connection.

I don't agree we must state it this way. I see ACM as an alternative way
for AF_INET to resolve ND/PR.

Or.
* RE: QoS in local SA entity
From: Sean Hefty @ 2009-11-09 18:38 UTC
To: 'Or Gerlitz'; +Cc: linux-rdma

>L1. compute source address
>L2. issue kernel rdma_bind to source address and resolve <device, port,
>pkey>
>L3. issue ACM address (DGID) resolution call using (<device, port,
>pkey>, dest-ip)
>
>makes sense? if yes, what's the need in the address configuration file?

Here is where we're at today:

rdma_resolve_addr:
- Source sends a multicast request to destination IP
- Destination performs a path record query
- Destination sends a response with IP to DGID mapping

rdma_resolve_route:
- Source performs a path record query

The current implementation of ACM converts this to:

** Source sends a multicast request to destination IP
** Destination sends a response with IP to DGID mapping
- Path record is constructed from multicast group information

ACM needs to know what the local addresses are, so it can respond to
requests for those addresses.

- Sean
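
The last step, "Path record is constructed from multicast group
information", is where an SL assigned to the group would end up in the
connection's path. The C sketch below illustrates that idea only: the
structs are hypothetical, heavily reduced stand-ins (no LIDs, hop limit,
or other fields), the field selection is an assumption, and the code is
not taken from ACM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of what a node learns when it joins a group. */
struct mcast_group_info {
    uint16_t pkey;  /* partition of the group */
    uint8_t  sl;    /* service level assigned to the group */
    uint8_t  mtu;   /* encoded MTU of the group */
    uint8_t  rate;  /* encoded rate of the group */
};

/* Hypothetical, reduced path record. */
struct path_rec {
    uint8_t  sgid[16];
    uint8_t  dgid[16];
    uint16_t pkey;
    uint8_t  sl;
    uint8_t  mtu;
    uint8_t  rate;
};

/*
 * Build a path for (sgid, dgid) by reusing the parameters of the
 * multicast group over which the address was resolved. No path record
 * query to the SA is issued in this sketch.
 */
void build_path_from_group(const struct mcast_group_info *grp,
                           const uint8_t sgid[16], const uint8_t dgid[16],
                           struct path_rec *path)
{
    assert(grp != NULL && sgid != NULL && dgid != NULL && path != NULL);

    memset(path, 0, sizeof(*path));
    memcpy(path->sgid, sgid, 16);
    memcpy(path->dgid, dgid, 16);

    /* QoS-relevant fields are inherited from the group. */
    path->pkey = grp->pkey;
    path->sl   = grp->sl;
    path->mtu  = grp->mtu;
    path->rate = grp->rate;
}
```

Under this reading, changing the SL that the SA hands out for a group
changes the SL of every path later built from that group.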
* Re: QoS in local SA entity
From: Or Gerlitz @ 2009-11-10 5:29 UTC
To: Sean Hefty; +Cc: linux-rdma

Sean Hefty wrote:
> [...] The current implementation of ACM converts this to:
> ** Source sends a multicast request to destination IP
> ** Destination sends a response with IP to DGID mapping
> - Path record is constructed from multicast group information
> ACM needs to know what the local addresses are, so it can respond to
> requests for those addresses

Okay, got it. Still, how do you see my suggestion on the unified/modified
librdmacm flow (L1/L2/L3 in my email), which would be taken when working
against a "DGID/Route" provider such as ACM?

Or.