* skb_checksum_setup() placement in pv-ops vs. legacy kernel
@ 2010-12-03 10:52 Jan Beulich
  2010-12-03 11:12 ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-12-03 10:52 UTC (permalink / raw)
  To: Ian Campbell, Jeremy Fitzhardinge; +Cc: xen-devel

Ian, Jeremy,

knowing pretty little about networking, it nevertheless seems to me
that the different placement of skb_checksum_setup() (in the receive
paths of pv-ops vs in various transmit paths in legacy) poses a
compatibility problem (nothing done on either side if sending from
pv-ops to legacy, and done on both ends when sending from legacy
to pv-ops). Am I overlooking something here?

Thanks, Jan


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 10:52 skb_checksum_setup() placement in pv-ops vs. legacy kernel Jan Beulich
@ 2010-12-03 11:12 ` Ian Campbell
  2010-12-03 11:51   ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2010-12-03 11:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jeremy Fitzhardinge, xen-devel

On Fri, 2010-12-03 at 10:52 +0000, Jan Beulich wrote:
> Ian, Jeremy,
> 
> knowing pretty little about networking, it nevertheless seems to me
> that the different placement of skb_checksum_setup() (in the receive
> paths of pv-ops vs in various transmit paths in legacy) poses a
> compatibility problem (nothing done on either side if sending from
> pv-ops to legacy, and done on both ends when sending from legacy
> to pv-ops). Am I overlooking something here?

Possibly confusion due to the backwards naming convention in netback?

The pvops dom0 side calls skb_checksum_setup in net_tx_submit which
(counter-intuitively) is the function which receives the skb from the
guest and passes it up to the dom0 network stack (i.e. it handles guest
tx).

Since we call skb_checksum_setup on the ingress path, all skbs in the
domain 0 network stack always have their checksum fields correctly
initialised and there is never anything to be done when transmitting
out the other side, either to another domU or to a physical device.
It therefore doesn't matter which kernel the domU is running.

On legacy dom0 skb_checksum_setup is called on the generic transmit
path, so skbs in the domain 0 network stack can have uninitialised
checksum fields but this is always fixed up before passing back down to
either netback (called the rx path in netback parlance) or a physical
device. This can (and has) caused trouble in the past where networking
subsystems are interested in the checksum fields before egress, e.g. we
needed to do fixup in various netfilter code paths etc.
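
The difference between the two placements can be sketched with a toy
model. This is illustrative Python, not kernel code: the three stage
names and the `trace` helper are invented for the example, and the only
thing modelled is at which point the checksum fields become valid.

```python
# Toy model (not kernel code): a packet crosses dom0 in three stages.
STAGES = ("ingress", "stack", "egress")


def trace(setup_stage):
    """Return {stage: bool} saying whether the checksum fields are valid
    by the end of each stage, given where the setup call is placed."""
    if setup_stage not in STAGES:
        raise ValueError(f"unknown stage: {setup_stage}")
    valid = False
    seen = {}
    for stage in STAGES:
        if stage == setup_stage:
            valid = True  # stands in for the skb_checksum_setup call
        seen[stage] = valid
    return seen


pvops = trace("ingress")  # setup when the skb arrives from the guest
legacy = trace("egress")  # setup on the generic transmit path

print("pvops :", pvops)
print("legacy:", legacy)
```

In the legacy placement the "stack" stage, where netfilter and similar
subsystems run, still sees fields that are not valid, which is the
source of the fixups mentioned above.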

Ian.


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 11:12 ` Ian Campbell
@ 2010-12-03 11:51   ` Jan Beulich
  2010-12-03 12:06     ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-12-03 11:51 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jeremy Fitzhardinge, xen-devel

>>> On 03.12.10 at 12:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Fri, 2010-12-03 at 10:52 +0000, Jan Beulich wrote:
>> Ian, Jeremy,
>> 
>> knowing pretty little about networking, it nevertheless seems to me
>> that the different placement of skb_checksum_setup() (in the receive
>> paths of pv-ops vs in various transmit paths in legacy) poses a
>> compatibility problem (nothing done on either side if sending from
>> pv-ops to legacy, and done on both ends when sending from legacy
>> to pv-ops). Am I overlooking something here?
> 
> Possibly confusion due to the backwards naming convention in netback?

No - note that I wrote it specifically this way in the original mail.

> The pvops dom0 side calls skb_checksum_setup in net_tx_submit which
> (counter-intuitively) is the function which receives the skb from the
> guest and passes it up to the dom0 network stack (i.e. it handles guest
> tx).
> 
> Since we call skb_checksum_setup on the ingress path all skbs in the
> domain 0 network stack always have their checksum fields correctly
> initialised and there is never anything to be done when transmitting
> out the other side, either to another domU or to a physical
> device, and therefore it doesn't matter which kernel the domU is
> running.
> 
> On legacy dom0 skb_checksum_setup is called on the generic transmit
> path, so skbs in the domain 0 network stack can have uninitialised
> checksum fields but this is always fixed up before passing back down to
> either netback (called the rx path in netback parlance) or a physical
> device. This can (and has) caused trouble in the past where networking
> subsystems are interested in the checksum fields before egress, e.g. we
> needed to do fixup in various netfilter code paths etc.

Yes, I can see the benefit of doing it the pv-ops way. The question is
what happens for a transmission from pv-ops (frontend or backend -
nothing done in the transmit path) to legacy (again frontend or
backend - nothing done in the receive path). Secondary question was
whether the duplicated effort on transmission the other way around
may be a (performance) issue.

Jan


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 11:51   ` Jan Beulich
@ 2010-12-03 12:06     ` Ian Campbell
  2010-12-03 12:24       ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2010-12-03 12:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jeremy Fitzhardinge, xen-devel

On Fri, 2010-12-03 at 11:51 +0000, Jan Beulich wrote:
> >>> On 03.12.10 at 12:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > On Fri, 2010-12-03 at 10:52 +0000, Jan Beulich wrote:
> >> Ian, Jeremy,
> >> 
> >> knowing pretty little about networking, it nevertheless seems to me
> >> that the different placement of skb_checksum_setup() (in the receive
> >> paths of pv-ops vs in various transmit paths in legacy) poses a
> >> compatibility problem (nothing done on either side if sending from
> >> pv-ops to legacy, and done on both ends when sending from legacy
> >> to pv-ops). Am I overlooking something here?
> > 
> > Possibly confusion due to the backwards naming convention in netback?
> 
> No - note that I wrote it specifically this way in the original mail.
> 
> > The pvops dom0 side calls skb_checksum_setup in net_tx_submit which
> > (counter-intuitively) is the function which receives the skb from the
> > guest and passes it up to the dom0 network stack (i.e. it handles guest
> > tx).
> > 
> > Since we call skb_checksum_setup on the ingress path all skbs in the
> > domain 0 network stack always have their checksum fields correctly
> > initialised and there is never anything to be done when transmitting
> > out the other side, either to another domU or to a physical
> > device, and therefore it doesn't matter which kernel the domU is
> > running.
> > 
> > On legacy dom0 skb_checksum_setup is called on the generic transmit
> > path, so skbs in the domain 0 network stack can have uninitialised
> > checksum fields but this is always fixed up before passing back down to
> > either netback (called the rx path in netback parlance) or a physical
> > device. This can (and has) caused trouble in the past where networking
> > subsystems are interested in the checksum fields before egress, e.g. we
> > needed to do fixup in various netfilter code paths etc.
> 
> Yes, I can see the benefit of doing it the pv-ops way. The question is
> what happens for a transmission from pv-ops (frontend or backend -
> nothing done in the transmit path) to legacy (again frontend or
> backend - nothing done in the receive path).

You mean a packet flowing pvops-domU -> pvops-dom0 -> legacy?

In this case the dom0 kernel does the necessary setup at (*) in the
pvops-domU -> (*) pvops-dom0 hop so there is nothing to do on the
pvops-dom0 (*) -> legacy hop.

If the legacy kernel forwards the packet further it will have to do the
setup on its egress path; this is the same whether dom0 is pvops or
legacy.

> Secondary question was whether the duplicated effort on transmission the other way around
> may be a (performance) issue.

You mean the legacy (*) -> (*) pvops-dom0 -> pvops-domU case?

In that case the setup is done at the two (*)'s but it is not really
"duplicated" as such, since it is in the context of two separate skbs.
If the dom0 were legacy then the second one would still happen, but on
the egress path.

I have a feeling I'm not understanding what your concern is correctly.
If the above isn't what you mean can you give an example of the path of
the packet and when the setup is (not) occurring.

Ian.


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 12:06     ` Ian Campbell
@ 2010-12-03 12:24       ` Jan Beulich
  2010-12-03 13:18         ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-12-03 12:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jeremy Fitzhardinge, xen-devel

>>> On 03.12.10 at 13:06, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> You mean a packet flowing pvops-domU -> pvops-dom0 -> legacy?

I was actually just thinking in terms of a simple pair (see below).

> In this case the dom0 kernel does the necessary setup at (*) in the
> pvops-domU -> (*) pvops-dom0 hop so there is nothing to do on the
> pvops-dom0 (*) ->legacy hop.
> 
> If the legacy kernel forwards the packet further it will have to do the
> setup on its egress path, this is the same if dom0 is pvops or legacy.
> 
>> Secondary question was whether the duplicated effort on transmission the
>> other way around may be a (performance) issue.
> 
> You mean the legacy (*) -> (*) pvops-dom0 -> pvops-domU case?
> 
> In that case the setup is done at the two (*)'s but it is not really
> "duplicated" as such since it is in the context of two separate skbs. if
> the dom0 was legacy then the second one would still happen but on the
> egress path.
> 
> I have a feeling I'm not understanding what your concern is correctly.
> If the above isn't what you mean can you give an example of the path of
> the packet and when the setup is (not) occurring.

pv-ops-{front,back}end -> legacy-{back,front} (for example a
pv-ops DomU sending a packet to (not through) a legacy Dom0,
or pv-ops Dom0 sending to legacy DomU). Of course, if the
packet fully passes the backend domain's stack, it will have
undergone the setup at least once (either on its way into or
out of that stack).

Jan


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 12:24       ` Jan Beulich
@ 2010-12-03 13:18         ` Ian Campbell
  2010-12-07 12:24           ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2010-12-03 13:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jeremy Fitzhardinge, xen-devel

On Fri, 2010-12-03 at 12:24 +0000, Jan Beulich wrote:
> >>> On 03.12.10 at 13:06, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > You mean a packet flowing pvops-domU -> pvops-dom0 -> legacy?
> 
> I was actually just thinking in terms of a simple pair (see below).
> 
> > In this case the dom0 kernel does the necessary setup at (*) in the
> > pvops-domU -> (*) pvops-dom0 hop so there is nothing to do on the
> > pvops-dom0 (*) ->legacy hop.
> > 
> > If the legacy kernel forwards the packet further it will have to do the
> > setup on its egress path, this is the same if dom0 is pvops or legacy.
> > 
> >> Secondary question was whether the duplicated effort on transmission the
> >> other way around may be a (performance) issue.
> > 
> > You mean the legacy (*) -> (*) pvops-dom0 -> pvops-domU case?
> > 
> > In that case the setup is done at the two (*)'s but it is not really
> > "duplicated" as such since it is in the context of two separate skbs. if
> > the dom0 was legacy then the second one would still happen but on the
> > egress path.
> > 
> > I have a feeling I'm not understanding what your concern is correctly.
> > If the above isn't what you mean can you give an example of the path of
> > the packet and when the setup is (not) occurring.
> 
> pv-ops-{front,back}end -> legacy-{back,front} (for example a
> pv-ops DomU sending a packet to (not through) a legacy Dom0,

The setup which is done in skb_checksum_setup is internal to the guest's
skb data structure and doesn't cross the pv interface boundary. The
fields which it sets up are just offsets to the checksum field in the
packet; it doesn't actually manipulate the content of the packet or
impact what goes into the ring until/unless the guest does TSO or
something similar, in which case the kernel needs to make sure the
fields are set up first.
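
As a rough illustration of "just offsets", here is a toy Python model.
It is a sketch, not the kernel function: `ToySkb`, `toy_checksum_setup`
and `make_ipv4` are hypothetical names, the packet is assumed to be
plain IPv4 starting at the IP header, and `csum_start` is measured from
the start of that packet (in the kernel it is relative to the skb
buffer). The only facts relied on are the standard header layouts: the
checksum sits at byte 16 of a TCP header and byte 6 of a UDP header.

```python
from dataclasses import dataclass
from typing import Optional

IPPROTO_TCP = 6
IPPROTO_UDP = 17

# Byte offset of the checksum field inside each transport header.
CSUM_OFFSET = {IPPROTO_TCP: 16, IPPROTO_UDP: 6}


@dataclass
class ToySkb:
    data: bytes                        # IPv4 packet, starting at the IP header
    csum_start: Optional[int] = None   # where the checksummed region begins
    csum_offset: Optional[int] = None  # checksum field, relative to csum_start


def toy_checksum_setup(skb: ToySkb) -> None:
    """Record where the transport checksum lives; never touch skb.data."""
    ihl = (skb.data[0] & 0x0F) * 4     # IPv4 header length in bytes
    proto = skb.data[9]                # IPv4 protocol field
    if proto not in CSUM_OFFSET:
        raise ValueError(f"unsupported protocol {proto}")
    skb.csum_start = ihl
    skb.csum_offset = CSUM_OFFSET[proto]


def make_ipv4(proto: int, payload: bytes) -> bytes:
    """Hypothetical helper: minimal 20-byte IPv4 header plus payload."""
    return bytes([0x45] + [0] * 8 + [proto] + [0] * 10) + payload


skb = ToySkb(make_ipv4(IPPROTO_TCP, bytes(20)))
toy_checksum_setup(skb)
print(skb.csum_start, skb.csum_offset)
```

The packet bytes are identical before and after the call; only the
metadata on the skb changes, which is why nothing here crosses the pv
interface boundary.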

So in the pvops-front->legacy-back case the legacy dom0 is already happy
with having skbs with invalid checksum fields floating around in its
stack since it sees the exact same thing in the
legacy-front->legacy-back case.

If it gets to a point where it needs the fields to be valid (either to
forward on or if in some case it matters for local delivery) then it
still has to do the necessary setup at that point.

> or pv-ops Dom0 sending to legacy DomU).

Same here, the legacy kernel knows it needs to set up the skb checksum
fields before it uses them.

> Of course, if the
> packet fully passes the backend domain's stack, it will have
> undergone the setup at least once (either on its way into or
> out of that stack).

Ian.


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-03 13:18         ` Ian Campbell
@ 2010-12-07 12:24           ` Jan Beulich
  2010-12-07 13:29             ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-12-07 12:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jeremy Fitzhardinge, xen-devel

>>> On 03.12.10 at 14:18, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> The setup which is done in skb_checksum_setup is internal to the guest's
> skb data structure and doesn't cross the pv interface boundary. The
> fields which it sets up are just offsets to the checksum field in the
> packet, it doesn't actually manipulate the content of the packet or
> impact what goes into the ring until/unless the guest does TSO or
> something similar in which case the kernel needs to make sure the fields
> are setup first.

Okay, that makes it much easier to change the behavior then.

What I'm then not understanding is who the consumer of this
data is, and why it wasn't done the receive path way from the
beginning. Were there issues with the no longer used loopback
driver? Or did kernel networking infrastructure change (if so,
it'd be nice to know when and what)?

Thanks for bearing with me,
Jan


* Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
  2010-12-07 12:24           ` Jan Beulich
@ 2010-12-07 13:29             ` Ian Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2010-12-07 13:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jeremy Fitzhardinge, xen-devel

On Tue, 2010-12-07 at 12:24 +0000, Jan Beulich wrote:
> >>> On 03.12.10 at 14:18, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > The setup which is done in skb_checksum_setup is internal to the guest's
> > skb data structure and doesn't cross the pv interface boundary. The
> > fields which it sets up are just offsets to the checksum field in the
> > packet, it doesn't actually manipulate the content of the packet or
> > impact what goes into the ring until/unless the guest does TSO or
> > something similar in which case the kernel needs to make sure the fields
> > are setup first.
> 
> Okay, that makes it much easier to change the behavior then.
> 
> What I'm then not understanding is who the consumer of this
> data is,

The physical NIC driver can use it as part of setting up its descriptors
for transmit with TSO. I think the software TSO/GSO egress paths use it
too in skb_checksum_help().
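
A software consumer of those two offsets might look roughly like the
following toy Python. This is a sketch of the general idea only, not
the kernel's skb_checksum_help(): `toy_checksum_help` and
`internet_checksum` are illustrative names, and the checksum field is
assumed to start out as zero, whereas a real stack would normally have
seeded it with the pseudo-header sum.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def toy_checksum_help(packet: bytes, csum_start: int, csum_offset: int) -> bytes:
    """Checksum packet[csum_start:] and store the result at
    csum_start + csum_offset. Hypothetical stand-in, not the kernel code."""
    pos = csum_start + csum_offset
    if pos + 2 > len(packet):
        raise ValueError("checksum field lies outside the packet")
    csum = internet_checksum(packet[csum_start:])
    return packet[:pos] + csum.to_bytes(2, "big") + packet[pos + 2:]


# 20-byte IPv4 header (contents irrelevant here), then an 8-byte UDP header
# whose checksum field (bytes 6-7) is zero, then an odd-length payload.
ip_header = bytes([0x45] + [0] * 19)
udp_header = bytes([0x30, 0x39, 0x00, 0x35, 0x00, 0x0D, 0x00, 0x00])
packet = ip_header + udp_header + b"hello"

done = toy_checksum_help(packet, csum_start=20, csum_offset=6)
# Re-running the checksum over the completed region gives 0 when consistent.
print(internet_checksum(done[20:]))
```

Without valid csum_start and csum_offset values such a routine has no
way to know which bytes to sum or where to store the result, which is
why the setup has to happen before any consumer of this kind runs.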

> and why it wasn't done the receive path way from the
> beginning. Were there issues with the no longer used loopback
> driver? Or did kernel networking infrastructure change (if so,
> it'd be nice to know when and what)?

I don't really know the answer to this, it's a little before my time.

Perhaps it was simply a desire to defer work as long as possible in the
hopes that it won't be necessary for some reason? e.g. the skb gets
dropped and not delivered. Doesn't seem terribly compelling to me --
perhaps someone else remembers that far back.

A bunch of stuff relating to the CHECKSUM_* values changed at some point
after 2.6.18 but I don't know if that had any impact on this aspect of
things.

Ian.


