From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kyle Mestery (kmestery)"
Subject: Re: [RFC v4] Add TCP encap_rcv hook (repost)
Date: Wed, 25 Apr 2012 13:36:46 +0000
Message-ID:
References: <20120423.161313.1582195533832554777.davem@davemloft.net>
 <20120423.170817.1103719420692884446.davem@davemloft.net>
 <20120423223255.GG580@verge.net.au>
 <20120424022514.GB5357@verge.net.au>
 <807AC914-2F33-46C7-99DC-E2F8F0F97531@cisco.com>
 <20120425083925.GB6661@verge.net.au>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: "" , "" , "" , "" , "" , "" , David Miller
To: Simon Horman
Return-path:
In-Reply-To: <20120425083925.GB6661-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Content-Language: en-US
Content-ID:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: dev-bounces-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
Errors-To: dev-bounces-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
List-Id: netdev.vger.kernel.org

On Apr 25, 2012, at 3:39 AM, Simon Horman wrote:

> On Tue, Apr 24, 2012 at 04:02:41PM +0000, Kyle Mestery (kmestery) wrote:
>> On Apr 23, 2012, at 9:25 PM, Simon Horman wrote:
>>> On Mon, Apr 23, 2012 at 03:59:24PM -0700, Jesse Gross wrote:
>>>> On Mon, Apr 23, 2012 at 3:32 PM, Simon Horman wrote:
>>>>> On Mon, Apr 23, 2012 at 02:38:07PM -0700, Jesse Gross wrote:
>>>>>> On Mon, Apr 23, 2012 at 2:08 PM, David Miller wrote:
>>>>>>> From: Jesse Gross
>>>>>>> Date: Mon, 23 Apr 2012 13:53:42 -0700
>>>>>>>
>>>>>>>> On Mon, Apr 23, 2012 at 1:13 PM, David Miller wrote:
>>>>>>>>> From: Jesse Gross
>>>>>>>>> Date: Mon, 23 Apr 2012 13:08:49 -0700
>>>>>>>>>
>>>>>>>>>> Assuming that the TCP stack generates large TSO frames on transmit
>>>>>>>>>> (which could be the local stack; something sent by a VM; or packets
>>>>>>>>>> received, coalesced by GRO and then encapsulated by STT) then you can
>>>>>>>>>> just prepend the STT header (possibly slightly adjusting things like
>>>>>>>>>> requested MSS, number of segments, etc. slightly). After that it's
>>>>>>>>>> possible to just output the resulting frame through the IP stack like
>>>>>>>>>> all tunnels do today.
>>>>>>>>>
>>>>>>>>> Which seems to potentially suggest a stronger intergration of the STT
>>>>>>>>> tunnel transmit path into our IP stack rather than the approach Simon
>>>>>>>>> is taking
>>>>>>>>
>>>>>>>> Did you have something in mind?
>>>>>>>
>>>>>>> A normal bonafide tunnel netdevice driver like GRE instead of the
>>>>>>> openvswitch approach Simon is using.
>>>>>>
>>>>>> Ahh, yes, that I agree with. Independent of this, there's work being
>>>>>> done to make it so that OVS can use the normal in-tree tunneling code
>>>>>> and not need its own. Once that's done I expect that STT will follow
>>>>>> the same model.
>>>>>
>>>>> Hi Jesse,
>>>>>
>>>>> I am wondering how firm the plans to on allowing OVS to use in-tree tunnel
>>>>> code are. I'm happy to move my efforts over to an in-tree STT implementation
>>>>> but ultimately I would like to get STT running in conjunction with OVS.
>>>>
>>>> I would say that it's a firm goal but the implementation probably
>>>> still has a ways to go. Kyle Mestery (CC'ed) has volunteered to work
>>>> on this in support of adding VXLAN, which needs some additional
>>>> flexibility that this approach would also provide. You might want to
>>>> talk to him to see if there are ways that you guys can work together
>>>> on it if you are interested. Having better integration with upstream
>>>> tunneling is definitely a step that OVS needs to make and sooner would
>>>> be better than later.
>>>
>>> Hi Jesse, Hi Kyle,
>>>
>>> that sounds like an excellent plan.
>>>
>>> Kyle, do you have any thoughts on how we might best work together on this?
>>> Perhaps there are some patches floating around that I could take a look at?
>>>
>>
>> Hi Simon:
>>
>> The VXLAN work has been slow going for me at this point. What I have works, but is far from complete. It's available here:
>>
>> https://github.com/mestery/ovs-vxlan/tree/vxlan
>>
>> This is based on a fairly recent version of OVS. I'm currently working to allow tunnels to be flow-based rather than port-based, as they currently exist.
>> As Jesse may have mentioned, doing this allows us to move most tunnel state into user space. The outer header can now be part of the flow lookup and can
>> be passed to user space, so things like multicast learning for VXLAN become possible.
>>
>> With regards to working together, ping me off-list and we can work something out, I'm very much in favor of this!
>
> Hi Kyle,
>
> the component that is of most interest to me is enabling OVS to use in-tree
> tunnelling code - as it seems that makes most sense for an implementation
> of STT. I have taken a brief look over your vxlan work and it isn't clear
> to me if it is moving towards being an in-tree implementation. Moreover,
> I'm a rather unclear on what changes need to be made to OVS in order for
> in-tree tunneling to be used.
>
> My recollection is that OVS did make use of in-tree tunnelling code
> but this was removed in favour of the current implementation for various
> reasons (performance being one IIRC). I gather that revisiting in-tree
> tunnelling won't revisit the previous set of problems. But I'm unclear how.
>
> Jesse, is it possible for you to describe that in a little detail
> or point me to some information?

Simon:

The changes I have in there now are a first step toward adding support for flow-based tunneling, in the case of VXLAN. Once we do that, we can remove (if we want) the existing port-based tunneling code. I was planning this as a first step. I would also like to better understand from Jesse the direction with regard to moving to in-tree tunneling. I assume the changes Jesse and I talked about a few months back around flow-based tunneling will still be compatible with in-tree tunneling as well.

Thanks,
Kyle