All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] bonding: reset queue mapping prior to transmission to physical device
@ 2011-06-02 18:03 Neil Horman
  2011-06-02 18:35 ` Ben Hutchings
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-02 18:03 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Jay Vosburgh, Andy Gospodarek, David S. Miller

The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb->queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue (one that was reserved for
specific traffic classes for instance)

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/bonding/bond_main.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 17b4dd9..812c70c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4200,6 +4200,11 @@ static inline int bond_slave_override(struct bonding *bond,
 	/* If the slave isn't UP, use default transmit policy. */
 	if (slave && slave->queue_id && IS_UP(slave->dev) &&
 	    (slave->link == BOND_LINK_UP)) {
+		/*
+ 		 * Reset the queue mapping to allow the underlying slave	
+ 		 * the chance to re-select an appropriate tx queue
+ 		 */
+		skb_set_queue_mapping(skb, 0);
 		res = bond_dev_queue_xmit(bond, skb, slave->dev);
 	}
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 18:03 [PATCH] bonding: reset queue mapping prior to transmission to physical device Neil Horman
@ 2011-06-02 18:35 ` Ben Hutchings
  2011-06-02 18:56   ` Neil Horman
  2011-06-02 19:59   ` David Miller
  2011-06-02 20:07 ` David Miller
  2011-06-03 13:26 ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Neil Horman
  2 siblings, 2 replies; 33+ messages in thread
From: Ben Hutchings @ 2011-06-02 18:35 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> The bonding driver is multiqueue enabled, in which each queue represents a slave
> to enable optional steering of output frames to given slaves against the default
> output policy.  However, it needs to reset the skb->queue_mapping prior to
> queuing to the physical device or the physical slave (if it is multiqueue) could
> wind up transmitting on an unintended tx queue (one that was reserved for
> specific traffic classes for instance)
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: "David S. Miller" <davem@davemloft.net>
> ---
>  drivers/net/bonding/bond_main.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 17b4dd9..812c70c 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4200,6 +4200,11 @@ static inline int bond_slave_override(struct bonding *bond,
>  	/* If the slave isn't UP, use default transmit policy. */
>  	if (slave && slave->queue_id && IS_UP(slave->dev) &&
>  	    (slave->link == BOND_LINK_UP)) {
> +		/*
> + 		 * Reset the queue mapping to allow the underlying slave	
> + 		 * the chance to re-select an appropriate tx queue
> + 		 */
> +		skb_set_queue_mapping(skb, 0);
>  		res = bond_dev_queue_xmit(bond, skb, slave->dev);
>  	}
>  

So far as I can see, this has no effect, because dev_queue_xmit() always
sets queue_mapping (in dev_pick_tx()).

What is the problem you're seeing?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 18:35 ` Ben Hutchings
@ 2011-06-02 18:56   ` Neil Horman
  2011-06-02 19:09     ` Ben Hutchings
  2011-06-02 19:59   ` David Miller
  1 sibling, 1 reply; 33+ messages in thread
From: Neil Horman @ 2011-06-02 18:56 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
> On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > to enable optional steering of output frames to given slaves against the default
> > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > queuing to the physical device or the physical slave (if it is multiqueue) could
> > wind up transmitting on an unintended tx queue (one that was reserved for
> > specific traffic classes for instance)
> >
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Jay Vosburgh <fubar@us.ibm.com>
> > CC: Andy Gospodarek <andy@greyhouse.net>
> > CC: "David S. Miller" <davem@davemloft.net>
> > ---
> >  drivers/net/bonding/bond_main.c |    5 +++++
> >  1 files changed, 5 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 17b4dd9..812c70c 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -4200,6 +4200,11 @@ static inline int bond_slave_override(struct bonding *bond,
> >  	/* If the slave isn't UP, use default transmit policy. */
> >  	if (slave && slave->queue_id && IS_UP(slave->dev) &&
> >  	    (slave->link == BOND_LINK_UP)) {
> > +		/*
> > + 		 * Reset the queue mapping to allow the underlying slave	
> > + 		 * the chance to re-select an appropriate tx queue
> > + 		 */
> > +		skb_set_queue_mapping(skb, 0);
> >  		res = bond_dev_queue_xmit(bond, skb, slave->dev);
> >  	}
> >  
> 
> So far as I can see, this has no effect, because dev_queue_xmit() always
> sets queue_mapping (in dev_pick_tx()).
> 
it resets the queue mapping exactly as you would expect it to.  bonding is a
multiqueue enabled device and selects a potentially non-zero queue based on the
output of bond_select_queue.

> What is the problem you're seeing?
> 
The problem is exctly that.  dev_pick_tx() on the bond device sets the
queue_mapping as per the result of bond_select_queue (the ndo_select_queue
method for the bonding driver).  The implementation there is based on the use of
tc with bonding, so that output slaves can be selected for certain types of
traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
the bonding driver queues the frame to the underlying slave.  This denies the
slave (if its also a multiqueue device) the opportunity to reselect the queue
properly, because of this call path:

bond_queue_xmit
 dev_queue_xmit(slave_dev)
  dev_pick_tx()
   skb_tx_hash()
    __skb_tx_hash()

__skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
based on what the bonding driver chose using its own internal logic.  Since
bonding uses the multiqueue infrastructure to do slave output selection without
any regard for slave output queue selection, it seems to me we should really
reset the queue mapping to zero so the slave device can pick its own tx queue.

Neil

> Ben.
> 
> -- 
> Ben Hutchings, Senior Software Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 18:56   ` Neil Horman
@ 2011-06-02 19:09     ` Ben Hutchings
  2011-06-02 19:46       ` Neil Horman
  0 siblings, 1 reply; 33+ messages in thread
From: Ben Hutchings @ 2011-06-02 19:09 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, 2011-06-02 at 14:56 -0400, Neil Horman wrote:
> On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
> > On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > to enable optional steering of output frames to given slaves against the default
> > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > wind up transmitting on an unintended tx queue (one that was reserved for
> > > specific traffic classes for instance)
[...]
> > So far as I can see, this has no effect, because dev_queue_xmit() always
> > sets queue_mapping (in dev_pick_tx()).
> > 
> it resets the queue mapping exactly as you would expect it to.  bonding is a
> multiqueue enabled device and selects a potentially non-zero queue based on the
> output of bond_select_queue.
> 
> > What is the problem you're seeing?
> > 
> The problem is exctly that.  dev_pick_tx() on the bond device sets the
> queue_mapping as per the result of bond_select_queue (the ndo_select_queue
> method for the bonding driver).  The implementation there is based on the use of
> tc with bonding, so that output slaves can be selected for certain types of
> traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
> the bonding driver queues the frame to the underlying slave.  This denies the
> slave (if its also a multiqueue device) the opportunity to reselect the queue
> properly, because of this call path:
> 
> bond_queue_xmit
>  dev_queue_xmit(slave_dev)
>   dev_pick_tx()
>    skb_tx_hash()
>     __skb_tx_hash()
> 
> __skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
> based on what the bonding driver chose using its own internal logic.  Since
> bonding uses the multiqueue infrastructure to do slave output selection without
> any regard for slave output queue selection, it seems to me we should really
> reset the queue mapping to zero so the slave device can pick its own tx queue.

So you're effectively clearing the *RX queue* number (as this is before
dev_pick_tx()) in order to influence TX queue selection.

Here, the bonding device seems to be behaving as a forwarding device.
If TX queue selection can go wrong for certain combinations of queue
configuration when forwarding, then this is a problem for IP forwarding
and bridging as well, isn't it?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 19:09     ` Ben Hutchings
@ 2011-06-02 19:46       ` Neil Horman
  2011-06-02 19:52         ` Nicolas de Pesloüan
                           ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-02 19:46 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, Jun 02, 2011 at 08:09:49PM +0100, Ben Hutchings wrote:
> On Thu, 2011-06-02 at 14:56 -0400, Neil Horman wrote:
> > On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
> > > On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> > > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > > to enable optional steering of output frames to given slaves against the default
> > > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > > wind up transmitting on an unintended tx queue (one that was reserved for
> > > > specific traffic classes for instance)
> [...]
> > > So far as I can see, this has no effect, because dev_queue_xmit() always
> > > sets queue_mapping (in dev_pick_tx()).
> > > 
> > it resets the queue mapping exactly as you would expect it to.  bonding is a
> > multiqueue enabled device and selects a potentially non-zero queue based on the
> > output of bond_select_queue.
> > 
> > > What is the problem you're seeing?
> > > 
> > The problem is exctly that.  dev_pick_tx() on the bond device sets the
> > queue_mapping as per the result of bond_select_queue (the ndo_select_queue
> > method for the bonding driver).  The implementation there is based on the use of
> > tc with bonding, so that output slaves can be selected for certain types of
> > traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
> > the bonding driver queues the frame to the underlying slave.  This denies the
> > slave (if its also a multiqueue device) the opportunity to reselect the queue
> > properly, because of this call path:
> > 
> > bond_queue_xmit
> >  dev_queue_xmit(slave_dev)
> >   dev_pick_tx()
> >    skb_tx_hash()
> >     __skb_tx_hash()
> > 
> > __skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
> > based on what the bonding driver chose using its own internal logic.  Since
> > bonding uses the multiqueue infrastructure to do slave output selection without
> > any regard for slave output queue selection, it seems to me we should really
> > reset the queue mapping to zero so the slave device can pick its own tx queue.
> 
> So you're effectively clearing the *RX queue* number (as this is before
> dev_pick_tx()) in order to influence TX queue selection.
> 
No, you're not seeing it properly.  In bonding (as with all stacked devices) we
make two passes through dev_pick_tx.

1) The first time we call dev_pick_tx is when the network stack calls
dev_queue_xmit on the bond device.  here we call ndo_select_queue, which calls
bond_select_queue.  This method tells the stack which queue to enqueue the skb
on for the bond device.  We can use tc's skbedit action to select a particular
queue on the bond device for various classes of traffic, and those queues
correspond to individual slave devices.

2) the second time we call dev_pick_tx is when the bonding bond_queue_xmit
routine (which is the bonding drivers ndo_start_xmit method, called after the
frist dev_pick_tx), calls dev_queue_xmit, passing the slave net_device pointer).
In this case we do whatever the slave device has configured (either reset the
queue to zero, or call the slaves ndo_select queue method).

What I'm fixing is the fact that the bonding drivers queue_mappnig value is
'leaking' down to the slave.

Lets say, for example we're bonding two multiqueue devices, and have the bonding
driver configured to send all traffic to 192.168.1.1 over the first slave (which
we can accomplish using an appropriate tc rule on the bond device, setting
frames with that dest ip to have a skb->queue_mapping of 1).

In the above example, frames bound for 192.168.1.1 have skb->queue_mapping set
to 1 when they enter bond_queue_xmit.  bond_queue_xmit, calls
dev_queue_xmit(slave, skb), which calls dev_pick_tx(slave, skb).  If we (for
simplicity's sake) assume the slave has no ndo_select_queue method, dev_pick_tx
calls __skb_tx_hash to determine the output queue.  But the bonding driver
already set skb->queue_mapping to 1 (because it wanted this frame output on the
first slave, not because it wanted to transmit the frame on the slaves tx queue
1).  Nevertheless, __skb_tx_hash() sees that skb_rx_queue_recorded returns true
here and as a result, calls skb_rx_get_queue, which returns 0.  Because of that
we wind up transmitting on hardware queue 0 all the time, which is not at all
what we want.  What we want is for the bonding driver to be able to use
queue_mapping for its own purposes, and then allow the physical device to make
its own output queue selection independently.  To do this, queue_mapping needs
to be zero, prior to queuenig the skb to the slave, which is exactly what this
patch does.

> Here, the bonding device seems to be behaving as a forwarding device.
> If TX queue selection can go wrong for certain combinations of queue
> configuration when forwarding, then this is a problem for IP forwarding
> and bridging as well, isn't it?
> 
Potentially, yes.  I only fixed this because I was looking at bonding and its
queue_mapping behavior, and saw that this needed fixing.  Bridging and IP
forwarding should also likely clear the queue mapping in the forwarding path
somewhere to avoid selecting an output tx queue that is a function of whatever
queue and device it arrived on during ingress.  I've not yet looked to see if
thats already being done.

Neil

> Ben.
> 
> -- 
> Ben Hutchings, Senior Software Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 19:46       ` Neil Horman
@ 2011-06-02 19:52         ` Nicolas de Pesloüan
  2011-06-02 20:04         ` David Miller
  2011-06-02 20:13         ` Ben Hutchings
  2 siblings, 0 replies; 33+ messages in thread
From: Nicolas de Pesloüan @ 2011-06-02 19:52 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ben Hutchings, netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

Le 02/06/2011 21:46, Neil Horman a écrit :
> On Thu, Jun 02, 2011 at 08:09:49PM +0100, Ben Hutchings wrote:
>> On Thu, 2011-06-02 at 14:56 -0400, Neil Horman wrote:
>>> On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
>>>> On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
>>>>> The bonding driver is multiqueue enabled, in which each queue represents a slave
>>>>> to enable optional steering of output frames to given slaves against the default
>>>>> output policy.  However, it needs to reset the skb->queue_mapping prior to
>>>>> queuing to the physical device or the physical slave (if it is multiqueue) could
>>>>> wind up transmitting on an unintended tx queue (one that was reserved for
>>>>> specific traffic classes for instance)
>> [...]
>>>> So far as I can see, this has no effect, because dev_queue_xmit() always
>>>> sets queue_mapping (in dev_pick_tx()).
>>>>
>>> it resets the queue mapping exactly as you would expect it to.  bonding is a
>>> multiqueue enabled device and selects a potentially non-zero queue based on the
>>> output of bond_select_queue.
>>>
>>>> What is the problem you're seeing?
>>>>
>>> The problem is exctly that.  dev_pick_tx() on the bond device sets the
>>> queue_mapping as per the result of bond_select_queue (the ndo_select_queue
>>> method for the bonding driver).  The implementation there is based on the use of
>>> tc with bonding, so that output slaves can be selected for certain types of
>>> traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
>>> the bonding driver queues the frame to the underlying slave.  This denies the
>>> slave (if its also a multiqueue device) the opportunity to reselect the queue
>>> properly, because of this call path:
>>>
>>> bond_queue_xmit
>>>   dev_queue_xmit(slave_dev)
>>>    dev_pick_tx()
>>>     skb_tx_hash()
>>>      __skb_tx_hash()
>>>
>>> __skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
>>> based on what the bonding driver chose using its own internal logic.  Since
>>> bonding uses the multiqueue infrastructure to do slave output selection without
>>> any regard for slave output queue selection, it seems to me we should really
>>> reset the queue mapping to zero so the slave device can pick its own tx queue.
>>
>> So you're effectively clearing the *RX queue* number (as this is before
>> dev_pick_tx()) in order to influence TX queue selection.
>>
> No, you're not seeing it properly.  In bonding (as with all stacked devices) we
> make two passes through dev_pick_tx.
>
> 1) The first time we call dev_pick_tx is when the network stack calls
> dev_queue_xmit on the bond device.  here we call ndo_select_queue, which calls
> bond_select_queue.  This method tells the stack which queue to enqueue the skb
> on for the bond device.  We can use tc's skbedit action to select a particular
> queue on the bond device for various classes of traffic, and those queues
> correspond to individual slave devices.
>
> 2) the second time we call dev_pick_tx is when the bonding bond_queue_xmit
> routine (which is the bonding drivers ndo_start_xmit method, called after the
> frist dev_pick_tx), calls dev_queue_xmit, passing the slave net_device pointer).
> In this case we do whatever the slave device has configured (either reset the
> queue to zero, or call the slaves ndo_select queue method).
>
> What I'm fixing is the fact that the bonding drivers queue_mappnig value is
> 'leaking' down to the slave.
>
> Lets say, for example we're bonding two multiqueue devices, and have the bonding
> driver configured to send all traffic to 192.168.1.1 over the first slave (which
> we can accomplish using an appropriate tc rule on the bond device, setting
> frames with that dest ip to have a skb->queue_mapping of 1).
>
> In the above example, frames bound for 192.168.1.1 have skb->queue_mapping set
> to 1 when they enter bond_queue_xmit.  bond_queue_xmit, calls
> dev_queue_xmit(slave, skb), which calls dev_pick_tx(slave, skb).  If we (for
> simplicity's sake) assume the slave has no ndo_select_queue method, dev_pick_tx
> calls __skb_tx_hash to determine the output queue.  But the bonding driver
> already set skb->queue_mapping to 1 (because it wanted this frame output on the
> first slave, not because it wanted to transmit the frame on the slaves tx queue
> 1).  Nevertheless, __skb_tx_hash() sees that skb_rx_queue_recorded returns true
> here and as a result, calls skb_rx_get_queue, which returns 0.  Because of that
> we wind up transmitting on hardware queue 0 all the time, which is not at all
> what we want.  What we want is for the bonding driver to be able to use
> queue_mapping for its own purposes, and then allow the physical device to make
> its own output queue selection independently.  To do this, queue_mapping needs
> to be zero, prior to queuenig the skb to the slave, which is exactly what this
> patch does.

This sounds good to me.

Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>

>
>> Here, the bonding device seems to be behaving as a forwarding device.
>> If TX queue selection can go wrong for certain combinations of queue
>> configuration when forwarding, then this is a problem for IP forwarding
>> and bridging as well, isn't it?
>>
> Potentially, yes.  I only fixed this because I was looking at bonding and its
> queue_mapping behavior, and saw that this needed fixing.  Bridging and IP
> forwarding should also likely clear the queue mapping in the forwarding path
> somewhere to avoid selecting an output tx queue that is a function of whatever
> queue and device it arrived on during ingress.  I've not yet looked to see if
> thats already being done.
>
> Neil

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 18:35 ` Ben Hutchings
  2011-06-02 18:56   ` Neil Horman
@ 2011-06-02 19:59   ` David Miller
  1 sibling, 0 replies; 33+ messages in thread
From: David Miller @ 2011-06-02 19:59 UTC (permalink / raw)
  To: bhutchings; +Cc: nhorman, netdev, fubar, andy

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 02 Jun 2011 19:35:53 +0100

> So far as I can see, this has no effect, because dev_queue_xmit() always
> sets queue_mapping (in dev_pick_tx()).

__skb_tx_hash() uses any hash value recorded in the SKB before trying
to manually it.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 19:46       ` Neil Horman
  2011-06-02 19:52         ` Nicolas de Pesloüan
@ 2011-06-02 20:04         ` David Miller
  2011-06-02 20:13           ` Nicolas de Pesloüan
  2011-06-02 20:13         ` Ben Hutchings
  2 siblings, 1 reply; 33+ messages in thread
From: David Miller @ 2011-06-02 20:04 UTC (permalink / raw)
  To: nhorman; +Cc: bhutchings, netdev, fubar, andy

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 2 Jun 2011 15:46:21 -0400

> Potentially, yes.  I only fixed this because I was looking at bonding and its
> queue_mapping behavior, and saw that this needed fixing.  Bridging and IP
> forwarding should also likely clear the queue mapping in the forwarding path
> somewhere to avoid selecting an output tx queue that is a function of whatever
> queue and device it arrived on during ingress.  I've not yet looked to see if
> thats already being done.

No we do not do this, intentionally.

That way the input classification and queue selection is mirrored
on the transmit side, which we absolutely want to happen.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 18:03 [PATCH] bonding: reset queue mapping prior to transmission to physical device Neil Horman
  2011-06-02 18:35 ` Ben Hutchings
@ 2011-06-02 20:07 ` David Miller
  2011-06-02 20:22   ` Nicolas de Pesloüan
  2011-06-03  1:04   ` Neil Horman
  2011-06-03 13:26 ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Neil Horman
  2 siblings, 2 replies; 33+ messages in thread
From: David Miller @ 2011-06-02 20:07 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, fubar, andy

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu,  2 Jun 2011 14:03:19 -0400

> The bonding driver is multiqueue enabled, in which each queue represents a slave
> to enable optional steering of output frames to given slaves against the default
> output policy.  However, it needs to reset the skb->queue_mapping prior to
> queuing to the physical device or the physical slave (if it is multiqueue) could
> wind up transmitting on an unintended tx queue (one that was reserved for
> specific traffic classes for instance)
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Since, as I mentioned, the idea when we are forwarding and bridging is that
we use the input receive classification to influence the spread on transmit,
I think things like this bonding case should remember the rxhash setting
before they override it and then restore that value right before invoking
dev_queue_xmit().

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 19:46       ` Neil Horman
  2011-06-02 19:52         ` Nicolas de Pesloüan
  2011-06-02 20:04         ` David Miller
@ 2011-06-02 20:13         ` Ben Hutchings
  2011-06-03  1:16           ` Neil Horman
  2 siblings, 1 reply; 33+ messages in thread
From: Ben Hutchings @ 2011-06-02 20:13 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, 2011-06-02 at 15:46 -0400, Neil Horman wrote:
> On Thu, Jun 02, 2011 at 08:09:49PM +0100, Ben Hutchings wrote:
> > On Thu, 2011-06-02 at 14:56 -0400, Neil Horman wrote:
> > > On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
> > > > On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> > > > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > > > to enable optional steering of output frames to given slaves against the default
> > > > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > > > wind up transmitting on an unintended tx queue (one that was reserved for
> > > > > specific traffic classes for instance)
> > [...]
> > > > So far as I can see, this has no effect, because dev_queue_xmit() always
> > > > sets queue_mapping (in dev_pick_tx()).
> > > > 
> > > it resets the queue mapping exactly as you would expect it to.  bonding is a
> > > multiqueue enabled device and selects a potentially non-zero queue based on the
> > > output of bond_select_queue.
> > > 
> > > > What is the problem you're seeing?
> > > > 
> > > The problem is exctly that.  dev_pick_tx() on the bond device sets the
> > > queue_mapping as per the result of bond_select_queue (the ndo_select_queue
> > > method for the bonding driver).  The implementation there is based on the use of
> > > tc with bonding, so that output slaves can be selected for certain types of
> > > traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
> > > the bonding driver queues the frame to the underlying slave.  This denies the
> > > slave (if its also a multiqueue device) the opportunity to reselect the queue
> > > properly, because of this call path:
> > > 
> > > bond_queue_xmit
> > >  dev_queue_xmit(slave_dev)
> > >   dev_pick_tx()
> > >    skb_tx_hash()
> > >     __skb_tx_hash()
> > > 
> > > __skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
> > > based on what the bonding driver chose using its own internal logic.  Since
> > > bonding uses the multiqueue infrastructure to do slave output selection without
> > > any regard for slave output queue selection, it seems to me we should really
> > > reset the queue mapping to zero so the slave device can pick its own tx queue.
> > 
> > So you're effectively clearing the *RX queue* number (as this is before
> > dev_pick_tx()) in order to influence TX queue selection.
> > 
> No, you're not seeing it properly.  In bonding (as with all stacked devices) we
> make two passes through dev_pick_tx.

Only if the stacked device has its own software queues and schedulers;
e.g. VLAN devices don't.

> 1) The first time we call dev_pick_tx is when the network stack calls
> dev_queue_xmit on the bond device.  here we call ndo_select_queue, which calls
> bond_select_queue.  This method tells the stack which queue to enqueue the skb
> on for the bond device.  We can use tc's skbedit action to select a particular
> queue on the bond device for various classes of traffic, and those queues
> correspond to individual slave devices.
> 
> 2) the second time we call dev_pick_tx is when the bonding bond_queue_xmit
> routine (which is the bonding drivers ndo_start_xmit method, called after the
> frist dev_pick_tx), calls dev_queue_xmit, passing the slave net_device pointer).
> In this case we do whatever the slave device has configured (either reset the
> queue to zero, or call the slaves ndo_select queue method).
> 
> What I'm fixing is the fact that the bonding drivers queue_mappnig value is
> 'leaking' down to the slave.

And this is because dev_queue_xmit() assumes that it's a record of the
RX queue number.

The differing interpretation of queue_mapping for RX and TX is annoying
and I think we should change the initial value of queue_mapping to -1.
But that's a separate issue.

> Lets say, for example we're bonding two multiqueue devices, and have the bonding
> driver configured to send all traffic to 192.168.1.1 over the first slave (which
> we can accomplish using an appropriate tc rule on the bond device, setting
> frames with that dest ip to have a skb->queue_mapping of 1).
> 
> In the above example, frames bound for 192.168.1.1 have skb->queue_mapping set
> to 1 when they enter bond_queue_xmit.  bond_queue_xmit, calls
> dev_queue_xmit(slave, skb), which calls dev_pick_tx(slave, skb).  If we (for
> simplicity's sake) assume the slave has no ndo_select_queue method, dev_pick_tx
> calls __skb_tx_hash to determine the output queue.  But the bonding driver
> already set skb->queue_mapping to 1 (because it wanted this frame output on the
> first slave, not because it wanted to transmit the frame on the slaves tx queue
> 1).  Nevertheless, __skb_tx_hash() sees that skb_rx_queue_recorded returns true
> here and as a result, calls skb_rx_get_queue, which returns 0.  Because of that
> we wind up transmitting on hardware queue 0 all the time, which is not at all
> what we want.  What we want is for the bonding driver to be able to use
> queue_mapping for its own purposes, and then allow the physical device to make
> its own output queue selection independently.
[...]

I think I understand - the bonding device is effectively single-queue
and shouldn't record an RX queue number.  But I think you should define
and use skb_clear_rx_queue() to set queue_mapping=0, rather than abusing
skb_set_queue_mapping() which is meant to take a TX queue number.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:04         ` David Miller
@ 2011-06-02 20:13           ` Nicolas de Pesloüan
  2011-06-02 20:46             ` David Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas de Pesloüan @ 2011-06-02 20:13 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, bhutchings, netdev, fubar, andy

Le 02/06/2011 22:04, David Miller a écrit :
> From: Neil Horman<nhorman@tuxdriver.com>
> Date: Thu, 2 Jun 2011 15:46:21 -0400
>
>> Potentially, yes.  I only fixed this because I was looking at bonding and its
>> queue_mapping behavior, and saw that this needed fixing.  Bridging and IP
>> forwarding should also likely clear the queue mapping in the forwarding path
>> somewhere to avoid selecting an output tx queue that is a function of whatever
>> queue and device it arrived on during ingress.  I've not yet looked to see if
>> thats already being done.
>
> No we do not do this, intentionally.
>
> That way the input classification and queue selection is mirrored
> on the transmit side, which we absolutely want to happen.

Can you confirm that this is the expected behavior for IP forwarding and bridge but not for bonding?

To be more precise, due to the way bonding use queue mapping for slave selection, it is desirable to 
clear the mapping before sending to the slave, because the meaning of the mapping for the slave 
interface might be really different from the meaning for the bonding interface. Arguably, this is 
the mapping usage in bonding which is "different" from other usages, but...

	Nicolas.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:07 ` David Miller
@ 2011-06-02 20:22   ` Nicolas de Pesloüan
  2011-06-03  1:04   ` Neil Horman
  1 sibling, 0 replies; 33+ messages in thread
From: Nicolas de Pesloüan @ 2011-06-02 20:22 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev, fubar, andy

Le 02/06/2011 22:07, David Miller a écrit :
> From: Neil Horman<nhorman@tuxdriver.com>
> Date: Thu,  2 Jun 2011 14:03:19 -0400
>
>> The bonding driver is multiqueue enabled, in which each queue represents a slave
>> to enable optional steering of output frames to given slaves against the default
>> output policy.  However, it needs to reset the skb->queue_mapping prior to
>> queuing to the physical device or the physical slave (if it is multiqueue) could
>> wind up transmitting on an unintended tx queue (one that was reserved for
>> specific traffic classes for instance)
>>
>> Signed-off-by: Neil Horman<nhorman@tuxdriver.com>
>
> Since, as I mentioned, the idea when we are forwarding and bridging is that
> we use the input receive classification to influence the spread on transmit,
> I think things like this bonding case should remember the rxhash setting
> before they override it and then restore that value right before invoking
> dev_queue_xmit().

Ok, now I understand. Maybe, using queue mapping for special slave selection wasn't such a good idea 
at the very beginning, because it pollutes the RX mapping that is expected to be propagated up to 
TX. Restoring the original value before invoking dev_queue_xmit() would fix this, but I'm not sure 
it is the cleanest way to do it.

	Nicolas.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:13           ` Nicolas de Pesloüan
@ 2011-06-02 20:46             ` David Miller
  2011-06-02 20:51               ` Ben Hutchings
  0 siblings, 1 reply; 33+ messages in thread
From: David Miller @ 2011-06-02 20:46 UTC (permalink / raw)
  To: nicolas.2p.debian; +Cc: nhorman, bhutchings, netdev, fubar, andy

From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Date: Thu, 02 Jun 2011 22:13:57 +0200

> To be more precise, due to the way bonding use queue mapping for slave
> selection, it is desirable to clear the mapping before sending to the
> slave, because the meaning of the mapping for the slave interface
> might be really different from the meaning for the bonding
> interface. Arguably, this is the mapping usage in bonding which is
> "different" from other usages, but...

This just confirms my reasoning behind why I wanted to discourage
drivers from providing explicit ->ndo_select_queue() methods unless
absolutely necessary.

Information now gets lost in cases like this bonding issue.

Bonding should definitely, as I suggested, remember the original
rxhash value and restore it when sending to the slave.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:46             ` David Miller
@ 2011-06-02 20:51               ` Ben Hutchings
  2011-06-02 21:10                 ` David Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Ben Hutchings @ 2011-06-02 20:51 UTC (permalink / raw)
  To: David Miller; +Cc: nicolas.2p.debian, nhorman, netdev, fubar, andy

On Thu, 2011-06-02 at 13:46 -0700, David Miller wrote:
> From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
> Date: Thu, 02 Jun 2011 22:13:57 +0200
> 
> > To be more precise, due to the way bonding use queue mapping for slave
> > selection, it is desirable to clear the mapping before sending to the
> > slave, because the meaning of the mapping for the slave interface
> > might be really different from the meaning for the bonding
> > interface. Arguably, this is the mapping usage in bonding which is
> > "different" from other usages, but...
> 
> This just confirms my reasoning behind why I wanted to discourage
> drivers from providing explicit ->ndo_select_queue() methods unless
> absolutely necessary.
> 
> Information now gets lost in cases like this bonding issue.
> 
> Bonding should definitely, as I suggested, remember the original
> rxhash value and restore it when sending to the slave.

Surely RX queue (queue_mapping), not RX hash (rxhash, which is unchanged
anyway AFAIK).

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:51               ` Ben Hutchings
@ 2011-06-02 21:10                 ` David Miller
  0 siblings, 0 replies; 33+ messages in thread
From: David Miller @ 2011-06-02 21:10 UTC (permalink / raw)
  To: bhutchings; +Cc: nicolas.2p.debian, nhorman, netdev, fubar, andy

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 02 Jun 2011 21:51:46 +0100

> On Thu, 2011-06-02 at 13:46 -0700, David Miller wrote:
>> From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
>> Date: Thu, 02 Jun 2011 22:13:57 +0200
>> 
>> > To be more precise, due to the way bonding use queue mapping for slave
>> > selection, it is desirable to clear the mapping before sending to the
>> > slave, because the meaning of the mapping for the slave interface
>> > might be really different from the meaning for the bonding
>> > interface. Arguably, this is the mapping usage in bonding which is
>> > "different" from other usages, but...
>> 
>> This just confirms my reasoning behind why I wanted to discourage
>> drivers from providing explicit ->ndo_select_queue() methods unless
>> absolutely necessary.
>> 
>> Information now gets lost in cases like this bonding issue.
>> 
>> Bonding should definitely, as I suggested, remember the original
>> rxhash value and restore it when sending to the slave.
> 
> Surely RX queue (queue_mapping), not RX hash (rxhash, which is unchanged
> anyway AFAIK).

Right.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:07 ` David Miller
  2011-06-02 20:22   ` Nicolas de Pesloüan
@ 2011-06-03  1:04   ` Neil Horman
  1 sibling, 0 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03  1:04 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, fubar, andy

On Thu, Jun 02, 2011 at 01:07:10PM -0700, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Thu,  2 Jun 2011 14:03:19 -0400
> 
> > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > to enable optional steering of output frames to given slaves against the default
> > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > queuing to the physical device or the physical slave (if it is multiqueue) could
> > wind up transmitting on an unintended tx queue (one that was reserved for
> > specific traffic classes for instance)
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Since, as I mentioned, the idea when we are forwarding and bridging is that
> we use the input receive classification to influence the spread on transmit,
> I think things like this bonding case should remember the rxhash setting
> before they override it and then restore that value right before invoking
> dev_queue_xmit().
> 
Ok, I can respin the patch to do that.  I'll handle it in the AM.

Thanks
Neil


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device
  2011-06-02 20:13         ` Ben Hutchings
@ 2011-06-03  1:16           ` Neil Horman
  0 siblings, 0 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03  1:16 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Thu, Jun 02, 2011 at 09:13:24PM +0100, Ben Hutchings wrote:
> On Thu, 2011-06-02 at 15:46 -0400, Neil Horman wrote:
> > On Thu, Jun 02, 2011 at 08:09:49PM +0100, Ben Hutchings wrote:
> > > On Thu, 2011-06-02 at 14:56 -0400, Neil Horman wrote:
> > > > On Thu, Jun 02, 2011 at 07:35:53PM +0100, Ben Hutchings wrote:
> > > > > On Thu, 2011-06-02 at 14:03 -0400, Neil Horman wrote:
> > > > > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > > > > to enable optional steering of output frames to given slaves against the default
> > > > > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > > > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > > > > wind up transmitting on an unintended tx queue (one that was reserved for
> > > > > > specific traffic classes for instance)
> > > [...]
> > > > > So far as I can see, this has no effect, because dev_queue_xmit() always
> > > > > sets queue_mapping (in dev_pick_tx()).
> > > > > 
> > > > it resets the queue mapping exactly as you would expect it to.  bonding is a
> > > > multiqueue enabled device and selects a potentially non-zero queue based on the
> > > > output of bond_select_queue.
> > > > 
> > > > > What is the problem you're seeing?
> > > > > 
> > > > The problem is exctly that.  dev_pick_tx() on the bond device sets the
> > > > queue_mapping as per the result of bond_select_queue (the ndo_select_queue
> > > > method for the bonding driver).  The implementation there is based on the use of
> > > > tc with bonding, so that output slaves can be selected for certain types of
> > > > traffic.  But when that mechanism is used, skb->queue_mapping is preserved when
> > > > the bonding driver queues the frame to the underlying slave.  This denies the
> > > > slave (if its also a multiqueue device) the opportunity to reselect the queue
> > > > properly, because of this call path:
> > > > 
> > > > bond_queue_xmit
> > > >  dev_queue_xmit(slave_dev)
> > > >   dev_pick_tx()
> > > >    skb_tx_hash()
> > > >     __skb_tx_hash()
> > > > 
> > > > __skb_tx_hash sees that skb_queue_recorded returns true, and assigns a hardware queue mapping
> > > > based on what the bonding driver chose using its own internal logic.  Since
> > > > bonding uses the multiqueue infrastructure to do slave output selection without
> > > > any regard for slave output queue selection, it seems to me we should really
> > > > reset the queue mapping to zero so the slave device can pick its own tx queue.
> > > 
> > > So you're effectively clearing the *RX queue* number (as this is before
> > > dev_pick_tx()) in order to influence TX queue selection.
> > > 
> > No, you're not seeing it properly.  In bonding (as with all stacked devices) we
> > make two passes through dev_pick_tx.
> 
> Only if the stacked device has its own software queues and schedulers;
> e.g. VLAN devices don't.
> 
Every net_device has a default fifo qdisc, and as such, each net_device when
called with dev_queue_xmit calls dev_pick_tx, and that includes vlans.  It just
so happens that vlans are a single queue device and come out of dev_pick_tx with
a queue_mapping of 0.  Thats all moot of course, since we're talking about
bonding.

> > 1) The first time we call dev_pick_tx is when the network stack calls
> > dev_queue_xmit on the bond device.  here we call ndo_select_queue, which calls
> > bond_select_queue.  This method tells the stack which queue to enqueue the skb
> > on for the bond device.  We can use tc's skbedit action to select a particular
> > queue on the bond device for various classes of traffic, and those queues
> > correspond to individual slave devices.
> > 
> > 2) the second time we call dev_pick_tx is when the bonding bond_queue_xmit
> > routine (which is the bonding drivers ndo_start_xmit method, called after the
> > frist dev_pick_tx), calls dev_queue_xmit, passing the slave net_device pointer).
> > In this case we do whatever the slave device has configured (either reset the
> > queue to zero, or call the slaves ndo_select queue method).
> > 
> > What I'm fixing is the fact that the bonding drivers queue_mappnig value is
> > 'leaking' down to the slave.
> 
> And this is because dev_queue_xmit() assumes that it's a record of the
> RX queue number.
> 
Yes, correct.  stacked devices provide the opportuninty for queue_mapping to no
longer be that rx queue number, but rather a tx queue mapping.  This is a
potentential problem with all stacked devices, but bonding is the only thing
showing it right now because its the only stacked device that can set a non-zero
queue_mapping value.

> The differing interpretation of queue_mapping for RX and TX is annoying
> and I think we should change the initial value of queue_mapping to -1.
> But that's a separate issue.
> 
Agreed.

> > Lets say, for example we're bonding two multiqueue devices, and have the bonding
> > driver configured to send all traffic to 192.168.1.1 over the first slave (which
> > we can accomplish using an appropriate tc rule on the bond device, setting
> > frames with that dest ip to have a skb->queue_mapping of 1).
> > 
> > In the above example, frames bound for 192.168.1.1 have skb->queue_mapping set
> > to 1 when they enter bond_queue_xmit.  bond_queue_xmit, calls
> > dev_queue_xmit(slave, skb), which calls dev_pick_tx(slave, skb).  If we (for
> > simplicity's sake) assume the slave has no ndo_select_queue method, dev_pick_tx
> > calls __skb_tx_hash to determine the output queue.  But the bonding driver
> > already set skb->queue_mapping to 1 (because it wanted this frame output on the
> > first slave, not because it wanted to transmit the frame on the slaves tx queue
> > 1).  Nevertheless, __skb_tx_hash() sees that skb_rx_queue_recorded returns true
> > here and as a result, calls skb_rx_get_queue, which returns 0.  Because of that
> > we wind up transmitting on hardware queue 0 all the time, which is not at all
> > what we want.  What we want is for the bonding driver to be able to use
> > queue_mapping for its own purposes, and then allow the physical device to make
> > its own output queue selection independently.
> [...]
> 
> I think I understand - the bonding device is effectively single-queue
> and shouldn't record an RX queue number.  But I think you should define
Well, no, its a multiqueue device, it just happens to treat slaves as queues,
rather than internal hardware objects.  But I suppose you're saying the same
thing that dave is, in that we should restore the origional queue_mapping,
rather than clearing it to preserve the input semantics when we reach the
physical device.  I can do that.

Neil

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2)
  2011-06-02 18:03 [PATCH] bonding: reset queue mapping prior to transmission to physical device Neil Horman
  2011-06-02 18:35 ` Ben Hutchings
  2011-06-02 20:07 ` David Miller
@ 2011-06-03 13:26 ` Neil Horman
  2011-06-03 14:43   ` Ben Hutchings
  2011-06-03 14:59   ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Vitalii Demianets
  2 siblings, 2 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03 13:26 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Jay Vosburgh, Andy Gospodarek, David S. Miller

The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb->queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/bonding/bond_main.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 17b4dd9..8761d86 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 {
 	skb->dev = slave_dev;
 	skb->priority = 1;
+	/*
+	 *restore the origional mapping
+	 */
+	skb_set_queue_mapping(skb, (u16)skb->cb[0]);
+
 	if (unlikely(netpoll_tx_running(slave_dev)))
 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
 	else
@@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 	 */
 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
 
+	/*
+ 	 * Save the origional txq to restore before passing to the driver
+ 	 */
+	skb->cb[0] = txq;
+
 	if (unlikely(txq >= dev->real_num_tx_queues)) {
 		do {
 			txq -= dev->real_num_tx_queues;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2)
  2011-06-03 13:26 ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Neil Horman
@ 2011-06-03 14:43   ` Ben Hutchings
  2011-06-03 17:32     ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Neil Horman
  2011-06-03 14:59   ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Vitalii Demianets
  1 sibling, 1 reply; 33+ messages in thread
From: Ben Hutchings @ 2011-06-03 14:43 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, 2011-06-03 at 09:26 -0400, Neil Horman wrote:
> The bonding driver is multiqueue enabled, in which each queue represents a slave
> to enable optional steering of output frames to given slaves against the default
> output policy.  However, it needs to reset the skb->queue_mapping prior to
> queuing to the physical device or the physical slave (if it is multiqueue) could
> wind up transmitting on an unintended tx queue
> 
> Change Notes:
> v2) Based on first pass review, updated the patch to restore the origional queue
> mapping that was found in bond_select_queue, rather than simply resetting to
> zero.  This preserves the value of queue_mapping when it was set on receive in
> the forwarding case which is desireable.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: "David S. Miller" <davem@davemloft.net>
> ---
>  drivers/net/bonding/bond_main.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 17b4dd9..8761d86 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
>  {
>  	skb->dev = slave_dev;
>  	skb->priority = 1;
> +	/*
> +	 *restore the origional mapping
> +	 */
> +	skb_set_queue_mapping(skb, (u16)skb->cb[0]);
> +
>  	if (unlikely(netpoll_tx_running(slave_dev)))
>  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
>  	else
> @@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
>  	 */
>  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
>  
> +	/*
> + 	 * Save the origional txq to restore before passing to the driver
> + 	 */
> +	skb->cb[0] = txq;
> +

This is throwing away the top 8 bits.  It's unlikely to hurt in
practice, but queue numbers are currently 16-bit and we already have
hardware with >256 queues (even though we don't currently expose them
all to the kernel).

Please use ((u16 *)skb->cb)[0].

Ben.

>  	if (unlikely(txq >= dev->real_num_tx_queues)) {
>  		do {
>  			txq -= dev->real_num_tx_queues;

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2)
  2011-06-03 13:26 ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Neil Horman
  2011-06-03 14:43   ` Ben Hutchings
@ 2011-06-03 14:59   ` Vitalii Demianets
  1 sibling, 0 replies; 33+ messages in thread
From: Vitalii Demianets @ 2011-06-03 14:59 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Friday 03 June 2011 16:26:47 Neil Horman wrote:
[...]
> +	/*
> +	 *restore the origional mapping
 Style and spelling nits, shouldn't it be:
* Restore the original mapping
> +	 */
> +	skb_set_queue_mapping(skb, (u16)skb->cb[0]);
As already pointed by Ben, it should be ((u16 *)skb->cb)[0]
[...]
> +	/*
> + 	 * Save the origional txq to restore before passing to the driver
Spelling: original
> + 	 */
> +	skb->cb[0] = txq;
As already pointed by Ben, it should be ((u16 *)skb->cb)[0]

-- 
Vitalii Demianets

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 14:43   ` Ben Hutchings
@ 2011-06-03 17:32     ` Neil Horman
  2011-06-03 17:59       ` Ben Hutchings
  2011-06-03 18:06       ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Jay Vosburgh
  0 siblings, 2 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03 17:32 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Jay Vosburgh, Andy Gospodarek, David S. Miller

The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb->queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

v3) Fixed spelling an casting error in skb->cb

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/bonding/bond_main.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 17b4dd9..dbb1048 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 {
 	skb->dev = slave_dev;
 	skb->priority = 1;
+	/*
+	 *restore the origional mapping
+	 */
+	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
+
 	if (unlikely(netpoll_tx_running(slave_dev)))
 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
 	else
@@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 	 */
 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
 
+	/*
+ 	 * Save the original txq to restore before passing to the driver
+ 	 */
+	((u16 *)skb->cb)[0] = txq;
+
 	if (unlikely(txq >= dev->real_num_tx_queues)) {
 		do {
 			txq -= dev->real_num_tx_queues;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 17:32     ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Neil Horman
@ 2011-06-03 17:59       ` Ben Hutchings
  2011-06-03 18:36         ` Neil Horman
  2011-06-03 19:24         ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4) Neil Horman
  2011-06-03 18:06       ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Jay Vosburgh
  1 sibling, 2 replies; 33+ messages in thread
From: Ben Hutchings @ 2011-06-03 17:59 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, 2011-06-03 at 13:32 -0400, Neil Horman wrote:
> The bonding driver is multiqueue enabled, in which each queue represents a slave
> to enable optional steering of output frames to given slaves against the default
> output policy.  However, it needs to reset the skb->queue_mapping prior to
> queuing to the physical device or the physical slave (if it is multiqueue) could
> wind up transmitting on an unintended tx queue
> 
> Change Notes:
> v2) Based on first pass review, updated the patch to restore the origional queue
> mapping that was found in bond_select_queue, rather than simply resetting to
> zero.  This preserves the value of queue_mapping when it was set on receive in
> the forwarding case which is desireable.
> 
> v3) Fixed spelling an casting error in skb->cb
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: "David S. Miller" <davem@davemloft.net>
> ---
>  drivers/net/bonding/bond_main.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 17b4dd9..dbb1048 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
>  {
>  	skb->dev = slave_dev;
>  	skb->priority = 1;
> +	/*
> +	 *restore the origional mapping
> +	 */
> +	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
> +
>  	if (unlikely(netpoll_tx_running(slave_dev)))
>  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
>  	else
> @@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
>  	 */
>  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
>  
> +	/*
> + 	 * Save the original txq to restore before passing to the driver
> + 	 */
> +	((u16 *)skb->cb)[0] = txq;
> +
>  	if (unlikely(txq >= dev->real_num_tx_queues)) {
>  		do {
>  			txq -= dev->real_num_tx_queues;

I should have read this more thoroughly before - I'm afraid this is
still not quite right as skb_get_rx_queue() will subtract 1.  You need
to save skb_get_queue_mapping() rather than txq.

But skb_{get,set}_queue_mapping() seem to be meant to deal with TX queue
numbers so maybe it's better to save and restore skb->queue_mapping
directly.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 17:32     ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Neil Horman
  2011-06-03 17:59       ` Ben Hutchings
@ 2011-06-03 18:06       ` Jay Vosburgh
  1 sibling, 0 replies; 33+ messages in thread
From: Jay Vosburgh @ 2011-06-03 18:06 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Andy Gospodarek, David S. Miller

Neil Horman <nhorman@tuxdriver.com> wrote:

>The bonding driver is multiqueue enabled, in which each queue represents a slave
>to enable optional steering of output frames to given slaves against the default
>output policy.  However, it needs to reset the skb->queue_mapping prior to
>queuing to the physical device or the physical slave (if it is multiqueue) could
>wind up transmitting on an unintended tx queue
>
>Change Notes:
>v2) Based on first pass review, updated the patch to restore the origional queue
>mapping that was found in bond_select_queue, rather than simply resetting to
>zero.  This preserves the value of queue_mapping when it was set on receive in
>the forwarding case which is desireable.
>
>v3) Fixed spelling an casting error in skb->cb
>
>Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: "David S. Miller" <davem@davemloft.net>
>---
> drivers/net/bonding/bond_main.c |   10 ++++++++++
> 1 files changed, 10 insertions(+), 0 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 17b4dd9..dbb1048 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> {
> 	skb->dev = slave_dev;
> 	skb->priority = 1;
>+	/*
>+	 *restore the origional mapping
>+	 */

	I don't think this comment (or the one below) really adds much.
If you think they're necessary, then (a) it's "original," and (b)
they'll probably fit on one line.

	Or, alternately, make one of them (probably the one below) big
enough to actually explain what's going on, especially if (as Ben says)
you need to stash queue_mapping directly.

	-J

>+	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
>+
> 	if (unlikely(netpoll_tx_running(slave_dev)))
> 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
> 	else
>@@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> 	 */
> 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
>
>+	/*
>+ 	 * Save the original txq to restore before passing to the driver
>+ 	 */
>+	((u16 *)skb->cb)[0] = txq;
>+
> 	if (unlikely(txq >= dev->real_num_tx_queues)) {
> 		do {
> 			txq -= dev->real_num_tx_queues;
>-- 
>1.7.3.4

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 17:59       ` Ben Hutchings
@ 2011-06-03 18:36         ` Neil Horman
  2011-06-03 19:12           ` Ben Hutchings
  2011-06-03 19:24         ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4) Neil Horman
  1 sibling, 1 reply; 33+ messages in thread
From: Neil Horman @ 2011-06-03 18:36 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, Jun 03, 2011 at 06:59:38PM +0100, Ben Hutchings wrote:
> On Fri, 2011-06-03 at 13:32 -0400, Neil Horman wrote:
> > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > to enable optional steering of output frames to given slaves against the default
> > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > queuing to the physical device or the physical slave (if it is multiqueue) could
> > wind up transmitting on an unintended tx queue
> > 
> > Change Notes:
> > v2) Based on first pass review, updated the patch to restore the origional queue
> > mapping that was found in bond_select_queue, rather than simply resetting to
> > zero.  This preserves the value of queue_mapping when it was set on receive in
> > the forwarding case which is desireable.
> > 
> > v3) Fixed spelling an casting error in skb->cb
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Jay Vosburgh <fubar@us.ibm.com>
> > CC: Andy Gospodarek <andy@greyhouse.net>
> > CC: "David S. Miller" <davem@davemloft.net>
> > ---
> >  drivers/net/bonding/bond_main.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 17b4dd9..dbb1048 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> >  {
> >  	skb->dev = slave_dev;
> >  	skb->priority = 1;
> > +	/*
> > +	 *restore the origional mapping
> > +	 */
> > +	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
> > +
> >  	if (unlikely(netpoll_tx_running(slave_dev)))
> >  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
> >  	else
> > @@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> >  	 */
> >  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
> >  
> > +	/*
> > + 	 * Save the original txq to restore before passing to the driver
> > + 	 */
> > +	((u16 *)skb->cb)[0] = txq;
> > +
> >  	if (unlikely(txq >= dev->real_num_tx_queues)) {
> >  		do {
> >  			txq -= dev->real_num_tx_queues;
> 
> I should have read this more thoroughly before - I'm afraid this is
> still not quite right as skb_get_rx_queue() will subtract 1.  You need
> to save skb_get_queue_mapping() rather than txq.
> 
I agree that we should use skb_get_queue_mapping here, but looking at this begs
the question now, is the queue value here a 1 based or a zero based value?   The
way I see it, skb->queue_mapping can be set from one of two paths:

1) from the ingress rx_path of another interface, in which case queue_mapping
will be 1 based value which we should subtract one from

2) from a local source, after having passed through a tc filter with an skbedit
action attached to it that set the queue mapping

In either case I agree, we should save and restore the raw value, rather than
the adjusted one, but it begs the question in the greater scheme, is the raw
value  1 based or zero based?  


Thats probably discussion for another work item/patch, but it bears
consideration
Neil


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 18:36         ` Neil Horman
@ 2011-06-03 19:12           ` Ben Hutchings
  2011-06-03 19:23             ` Neil Horman
  0 siblings, 1 reply; 33+ messages in thread
From: Ben Hutchings @ 2011-06-03 19:12 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, 2011-06-03 at 14:36 -0400, Neil Horman wrote:
> On Fri, Jun 03, 2011 at 06:59:38PM +0100, Ben Hutchings wrote:
> > On Fri, 2011-06-03 at 13:32 -0400, Neil Horman wrote:
> > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > to enable optional steering of output frames to given slaves against the default
> > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > wind up transmitting on an unintended tx queue
> > > 
> > > Change Notes:
> > > v2) Based on first pass review, updated the patch to restore the origional queue
> > > mapping that was found in bond_select_queue, rather than simply resetting to
> > > zero.  This preserves the value of queue_mapping when it was set on receive in
> > > the forwarding case which is desireable.
> > > 
> > > v3) Fixed spelling an casting error in skb->cb
> > > 
> > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > CC: Jay Vosburgh <fubar@us.ibm.com>
> > > CC: Andy Gospodarek <andy@greyhouse.net>
> > > CC: "David S. Miller" <davem@davemloft.net>
> > > ---
> > >  drivers/net/bonding/bond_main.c |   10 ++++++++++
> > >  1 files changed, 10 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > > index 17b4dd9..dbb1048 100644
> > > --- a/drivers/net/bonding/bond_main.c
> > > +++ b/drivers/net/bonding/bond_main.c
> > > @@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> > >  {
> > >  	skb->dev = slave_dev;
> > >  	skb->priority = 1;
> > > +	/*
> > > +	 *restore the origional mapping
> > > +	 */
> > > +	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
> > > +
> > >  	if (unlikely(netpoll_tx_running(slave_dev)))
> > >  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
> > >  	else
> > > @@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> > >  	 */
> > >  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
> > >  
> > > +	/*
> > > + 	 * Save the original txq to restore before passing to the driver
> > > + 	 */
> > > +	((u16 *)skb->cb)[0] = txq;
> > > +
> > >  	if (unlikely(txq >= dev->real_num_tx_queues)) {
> > >  		do {
> > >  			txq -= dev->real_num_tx_queues;
> > 
> > I should have read this more thoroughly before - I'm afraid this is
> > still not quite right as skb_get_rx_queue() will subtract 1.  You need
> > to save skb_get_queue_mapping() rather than txq.
> > 
> I agree that we should use skb_get_queue_mapping here, but looking at this begs
> the question now, is the queue value here a 1 based or a zero based value?

My understanding is that queue_mapping is supposed to be:
1. Before and including dev_pick_tx(): optionally-recorded RX queue
index (1-based)
2. After dev_pick_tx(): selected TX queue index (0-based)

But in every stacked device case 2 leads back into case 1 and
queue_mapping has to be converted at some point.

ndo_select_queue() is called within dev_pick_tx(), so case 1 applies.

> The way I see it, skb->queue_mapping can be set from one of two paths:
> 
> 1) from the ingress rx_path of another interface, in which case queue_mapping
> will be 1 based value which we should subtract one from
> 
> 2) from a local source, after having passed through a tc filter with an skbedit
> action attached to it that set the queue mapping

This implies that skbedit is called in case 1 and is recording a fake RX
queue index with an off-by-one error.

> In either case I agree, we should save and restore the raw value, rather than
> the adjusted one, but it begs the question in the greater scheme, is the raw
> value  1 based or zero based?  
> 
> Thats probably discussion for another work item/patch, but it bears
> consideration

Absolutely.

We really have to get rid of 1-based indexing, so that the only
difference is that in case 2 we know:

    0 <= skb->queue_mapping < skb->dev->real_num_tx_queues

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3)
  2011-06-03 19:12           ` Ben Hutchings
@ 2011-06-03 19:23             ` Neil Horman
  0 siblings, 0 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03 19:23 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, Jun 03, 2011 at 08:12:51PM +0100, Ben Hutchings wrote:
> On Fri, 2011-06-03 at 14:36 -0400, Neil Horman wrote:
> > On Fri, Jun 03, 2011 at 06:59:38PM +0100, Ben Hutchings wrote:
> > > On Fri, 2011-06-03 at 13:32 -0400, Neil Horman wrote:
> > > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > > to enable optional steering of output frames to given slaves against the default
> > > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > > wind up transmitting on an unintended tx queue
> > > > 
> > > > Change Notes:
> > > > v2) Based on first pass review, updated the patch to restore the origional queue
> > > > mapping that was found in bond_select_queue, rather than simply resetting to
> > > > zero.  This preserves the value of queue_mapping when it was set on receive in
> > > > the forwarding case which is desireable.
> > > > 
> > > > v3) Fixed spelling an casting error in skb->cb
> > > > 
> > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > > CC: Jay Vosburgh <fubar@us.ibm.com>
> > > > CC: Andy Gospodarek <andy@greyhouse.net>
> > > > CC: "David S. Miller" <davem@davemloft.net>
> > > > ---
> > > >  drivers/net/bonding/bond_main.c |   10 ++++++++++
> > > >  1 files changed, 10 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > > > index 17b4dd9..dbb1048 100644
> > > > --- a/drivers/net/bonding/bond_main.c
> > > > +++ b/drivers/net/bonding/bond_main.c
> > > > @@ -400,6 +400,11 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> > > >  {
> > > >  	skb->dev = slave_dev;
> > > >  	skb->priority = 1;
> > > > +	/*
> > > > +	 *restore the origional mapping
> > > > +	 */
> > > > +	skb_set_queue_mapping(skb, ((u16 *)skb->cb)[0]);
> > > > +
> > > >  	if (unlikely(netpoll_tx_running(slave_dev)))
> > > >  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
> > > >  	else
> > > > @@ -4216,6 +4221,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> > > >  	 */
> > > >  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
> > > >  
> > > > +	/*
> > > > + 	 * Save the original txq to restore before passing to the driver
> > > > + 	 */
> > > > +	((u16 *)skb->cb)[0] = txq;
> > > > +
> > > >  	if (unlikely(txq >= dev->real_num_tx_queues)) {
> > > >  		do {
> > > >  			txq -= dev->real_num_tx_queues;
> > > 
> > > I should have read this more thoroughly before - I'm afraid this is
> > > still not quite right as skb_get_rx_queue() will subtract 1.  You need
> > > to save skb_get_queue_mapping() rather than txq.
> > > 
> > I agree that we should use skb_get_queue_mapping here, but looking at this begs
> > the question now, is the queue value here a 1 based or a zero based value?
> 
> My understanding is that queue_mapping is supposed to be:
> 1. Before and including dev_pick_tx(): optionally-recorded RX queue
> index (1-based)
> 2. After dev_pick_tx(): selected TX queue index (0-based)
> 
> But in every stacked device case 2 leads back into case 1 and
> queue_mapping has to be converted at some point.
> 
> ndo_select_queue() is called within dev_pick_tx(), so case 1 applies.
> 
> > The way I see it, skb->queue_mapping can be set from one of two paths:
> > 
> > 1) from the ingress rx_path of another interface, in which case queue_mapping
> > will be 1 based value which we should subtract one from
> > 
> > 2) from a local source, after having passed through a tc filter with an skbedit
> > action attached to it that set the queue mapping
> 
> This implies that skbedit is called in case 1 and is recording a fake RX
> queue index with an off-by-one error.
> 
Yes, it does, and that just seems wrong to me, in that it requires users of tc
to understand that queue_mapping is a one based value, but can be set to zero.
It implies that tc users have to have implicit knoweldge of how the driver model
works, that seems wrong.

> > In either case I agree, we should save and restore the raw value, rather than
> > the adjusted one, but it begs the question in the greater scheme, is the raw
> > value  1 based or zero based?  
> > 
> > Thats probably discussion for another work item/patch, but it bears
> > consideration
> 
> Absolutely.
> 
> We really have to get rid of 1-based indexing, so that the only
> difference is that in case 2 we know:
> 
>     0 <= skb->queue_mapping < skb->dev->real_num_tx_queues
> 
Agreed.  Not quite sure of the best way to do that, but I'll put some thought
into it.
Neil

> Ben.
> 
> -- 
> Ben Hutchings, Senior Software Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4)
  2011-06-03 17:59       ` Ben Hutchings
  2011-06-03 18:36         ` Neil Horman
@ 2011-06-03 19:24         ` Neil Horman
  2011-06-03 19:48           ` Eric Dumazet
  2011-06-03 20:35           ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5) Neil Horman
  1 sibling, 2 replies; 33+ messages in thread
From: Neil Horman @ 2011-06-03 19:24 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Jay Vosburgh, Andy Gospodarek, David S. Miller

The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb->queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

v3) Fixed spelling an casting error in skb->cb

v4) fixed to store raw queue_mapping to avoid double decrement

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/bonding/bond_main.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 17b4dd9..76adf27 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -400,6 +400,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 {
 	skb->dev = slave_dev;
 	skb->priority = 1;
+
+	skb->queue_mapping = ((u16 *)skb->cb)[0];
+
 	if (unlikely(netpoll_tx_running(slave_dev)))
 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
 	else
@@ -4216,6 +4219,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 	 */
 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
 
+	/*
+ 	 * Save the original txq to restore before passing to the driver
+ 	 */
+	((u16 *)skb->cb)[0] = skb->queue_mapping;
+
 	if (unlikely(txq >= dev->real_num_tx_queues)) {
 		do {
 			txq -= dev->real_num_tx_queues;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4)
  2011-06-03 19:24         ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4) Neil Horman
@ 2011-06-03 19:48           ` Eric Dumazet
  2011-06-03 19:57             ` Neil Horman
  2011-06-03 20:35           ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5) Neil Horman
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Dumazet @ 2011-06-03 19:48 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

Le vendredi 03 juin 2011 à 15:24 -0400, Neil Horman a écrit :
> The bonding driver is multiqueue enabled, in which each queue represents a slave
> to enable optional steering of output frames to given slaves against the default
> output policy.  However, it needs to reset the skb->queue_mapping prior to
> queuing to the physical device or the physical slave (if it is multiqueue) could
> wind up transmitting on an unintended tx queue
> 
> Change Notes:
> v2) Based on first pass review, updated the patch to restore the origional queue
> mapping that was found in bond_select_queue, rather than simply resetting to
> zero.  This preserves the value of queue_mapping when it was set on receive in
> the forwarding case which is desireable.
> 
> v3) Fixed spelling an casting error in skb->cb
> 
> v4) fixed to store raw queue_mapping to avoid double decrement
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: "David S. Miller" <davem@davemloft.net>
> ---
>  drivers/net/bonding/bond_main.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 17b4dd9..76adf27 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -400,6 +400,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
>  {
>  	skb->dev = slave_dev;
>  	skb->priority = 1;
> +
> +	skb->queue_mapping = ((u16 *)skb->cb)[0];

Please dont do that. Use a helper.

For example :

#define bond_queue_mapping(skb) (*(unsigned int *)((skb)->cb))


	skb->queue_mapping = bond_queue_mapping(skb);
	

> +
>  	if (unlikely(netpoll_tx_running(slave_dev)))
>  		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
>  	else
> @@ -4216,6 +4219,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
>  	 */
>  	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
>  
> +	/*
> + 	 * Save the original txq to restore before passing to the driver
> + 	 */
> +	((u16 *)skb->cb)[0] = skb->queue_mapping;
	
	bond_queue_mapping(skb) = skb->queue_mapping;

> +
>  	if (unlikely(txq >= dev->real_num_tx_queues)) {
>  		do {
>  			txq -= dev->real_num_tx_queues;



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4)
  2011-06-03 19:48           ` Eric Dumazet
@ 2011-06-03 19:57             ` Neil Horman
  2011-06-03 20:05               ` Eric Dumazet
  0 siblings, 1 reply; 33+ messages in thread
From: Neil Horman @ 2011-06-03 19:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

On Fri, Jun 03, 2011 at 09:48:51PM +0200, Eric Dumazet wrote:
> Le vendredi 03 juin 2011 à 15:24 -0400, Neil Horman a écrit :
> > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > to enable optional steering of output frames to given slaves against the default
> > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > queuing to the physical device or the physical slave (if it is multiqueue) could
> > wind up transmitting on an unintended tx queue
> > 
> > Change Notes:
> > v2) Based on first pass review, updated the patch to restore the origional queue
> > mapping that was found in bond_select_queue, rather than simply resetting to
> > zero.  This preserves the value of queue_mapping when it was set on receive in
> > the forwarding case which is desireable.
> > 
> > v3) Fixed spelling an casting error in skb->cb
> > 
> > v4) fixed to store raw queue_mapping to avoid double decrement
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Jay Vosburgh <fubar@us.ibm.com>
> > CC: Andy Gospodarek <andy@greyhouse.net>
> > CC: "David S. Miller" <davem@davemloft.net>
> > ---
> >  drivers/net/bonding/bond_main.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 17b4dd9..76adf27 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -400,6 +400,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> >  {
> >  	skb->dev = slave_dev;
> >  	skb->priority = 1;
> > +
> > +	skb->queue_mapping = ((u16 *)skb->cb)[0];
> 
> Please dont do that. Use a helper.
> 
Why?  It seems to be reasonably common practice for drivers to access
queue_mapping directly.

Examples can be found in:
ixgbe_xmit_frame
mlx4_en_xmit
qlge_send
igb_xmit_frame_adv
gfar_start_xmit

among others.

Not saying its correct to do so necessecarily, but it seems a helper doesn't buy
us much here, particularly a per-driver helper.  If a helper really should be
used, why not just consistently use skb_get_queue_mapping?
Neil

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4)
  2011-06-03 19:57             ` Neil Horman
@ 2011-06-03 20:05               ` Eric Dumazet
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Dumazet @ 2011-06-03 20:05 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Jay Vosburgh, Andy Gospodarek, David S. Miller

Le vendredi 03 juin 2011 à 15:57 -0400, Neil Horman a écrit :
> On Fri, Jun 03, 2011 at 09:48:51PM +0200, Eric Dumazet wrote:
> > Le vendredi 03 juin 2011 à 15:24 -0400, Neil Horman a écrit :
> > > The bonding driver is multiqueue enabled, in which each queue represents a slave
> > > to enable optional steering of output frames to given slaves against the default
> > > output policy.  However, it needs to reset the skb->queue_mapping prior to
> > > queuing to the physical device or the physical slave (if it is multiqueue) could
> > > wind up transmitting on an unintended tx queue
> > > 
> > > Change Notes:
> > > v2) Based on first pass review, updated the patch to restore the origional queue
> > > mapping that was found in bond_select_queue, rather than simply resetting to
> > > zero.  This preserves the value of queue_mapping when it was set on receive in
> > > the forwarding case which is desireable.
> > > 
> > > v3) Fixed spelling an casting error in skb->cb
> > > 
> > > v4) fixed to store raw queue_mapping to avoid double decrement
> > > 
> > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > CC: Jay Vosburgh <fubar@us.ibm.com>
> > > CC: Andy Gospodarek <andy@greyhouse.net>
> > > CC: "David S. Miller" <davem@davemloft.net>
> > > ---
> > >  drivers/net/bonding/bond_main.c |    8 ++++++++
> > >  1 files changed, 8 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > > index 17b4dd9..76adf27 100644
> > > --- a/drivers/net/bonding/bond_main.c
> > > +++ b/drivers/net/bonding/bond_main.c
> > > @@ -400,6 +400,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> > >  {
> > >  	skb->dev = slave_dev;
> > >  	skb->priority = 1;
> > > +
> > > +	skb->queue_mapping = ((u16 *)skb->cb)[0];
> > 
> > Please dont do that. Use a helper.
> > 
> Why?  It seems to be reasonably common practice for drivers to access
> queue_mapping directly.
> 
> Examples can be found in:
> ixgbe_xmit_frame
> mlx4_en_xmit
> qlge_send
> igb_xmit_frame_adv
> gfar_start_xmit
> 
> among others.
> 
> Not saying its correct to do so necessecarily, but it seems a helper doesn't buy
> us much here, particularly a per-driver helper.  If a helper really should be
> used, why not just consistently use skb_get_queue_mapping?
> Neil

I was speaking of skb->cb access, of course, sorry if you missed my
point ;)

Please take a look at various files using helpers for this.

For example : net/ipv4/igmp.c

#define igmp_skb_size(skb) (*(unsigned int *)((skb)->cb))




^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5)
  2011-06-03 19:24         ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4) Neil Horman
  2011-06-03 19:48           ` Eric Dumazet
@ 2011-06-03 20:35           ` Neil Horman
  2011-06-03 23:31             ` Jay Vosburgh
  1 sibling, 1 reply; 33+ messages in thread
From: Neil Horman @ 2011-06-03 20:35 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Jay Vosburgh, Andy Gospodarek, David S. Miller

The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb->queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

v3) Fixed spelling an casting error in skb->cb

v4) fixed to store raw queue_mapping to avoid double decrement

v5) Eric D requested that ->cb access be wrapped in a macro.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/bonding/bond_main.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 17b4dd9..abf6e19 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -388,6 +388,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
 	return next;
 }
 
+#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
+
 /**
  * bond_dev_queue_xmit - Prepare skb for xmit.
  *
@@ -400,6 +402,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 {
 	skb->dev = slave_dev;
 	skb->priority = 1;
+
+	skb->queue_mapping = bond_queue_mapping(skb); 
+
 	if (unlikely(netpoll_tx_running(slave_dev)))
 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
 	else
@@ -4206,6 +4211,7 @@ static inline int bond_slave_override(struct bonding *bond,
 	return res;
 }
 
+
 static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 {
 	/*
@@ -4216,6 +4222,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 	 */
 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
 
+	/*
+ 	 * Save the original txq to restore before passing to the driver
+ 	 */
+	bond_queue_mapping(skb) = skb->queue_mapping;
+
 	if (unlikely(txq >= dev->real_num_tx_queues)) {
 		do {
 			txq -= dev->real_num_tx_queues;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5)
  2011-06-03 20:35           ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5) Neil Horman
@ 2011-06-03 23:31             ` Jay Vosburgh
  2011-06-05 21:32               ` David Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Jay Vosburgh @ 2011-06-03 23:31 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Andy Gospodarek, David S. Miller

Neil Horman <nhorman@tuxdriver.com> wrote:

>The bonding driver is multiqueue enabled, in which each queue represents a slave
>to enable optional steering of output frames to given slaves against the default
>output policy.  However, it needs to reset the skb->queue_mapping prior to
>queuing to the physical device or the physical slave (if it is multiqueue) could
>wind up transmitting on an unintended tx queue
>
>Change Notes:
>v2) Based on first pass review, updated the patch to restore the origional queue
>mapping that was found in bond_select_queue, rather than simply resetting to
>zero.  This preserves the value of queue_mapping when it was set on receive in
>the forwarding case which is desireable.
>
>v3) Fixed spelling an casting error in skb->cb
>
>v4) fixed to store raw queue_mapping to avoid double decrement
>
>v5) Eric D requested that ->cb access be wrapped in a macro.

	Shouldn't the change log go below the "---" so it doesn't end up
in the git commit log?

	In any event, I looked for ways into bond_dev_queue_xmit without
first passing through bond_select_queue (lest stale cb[] data intrude),
and I don't see any, so I think this is ok.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

	-J

>Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: "David S. Miller" <davem@davemloft.net>
>---
> drivers/net/bonding/bond_main.c |   11 +++++++++++
> 1 files changed, 11 insertions(+), 0 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 17b4dd9..abf6e19 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -388,6 +388,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
> 	return next;
> }
>
>+#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
>+
> /**
>  * bond_dev_queue_xmit - Prepare skb for xmit.
>  *
>@@ -400,6 +402,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> {
> 	skb->dev = slave_dev;
> 	skb->priority = 1;
>+
>+	skb->queue_mapping = bond_queue_mapping(skb); 
>+
> 	if (unlikely(netpoll_tx_running(slave_dev)))
> 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
> 	else
>@@ -4206,6 +4211,7 @@ static inline int bond_slave_override(struct bonding *bond,
> 	return res;
> }
>
>+
> static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> {
> 	/*
>@@ -4216,6 +4222,11 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
> 	 */
> 	u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
>
>+	/*
>+ 	 * Save the original txq to restore before passing to the driver
>+ 	 */
>+	bond_queue_mapping(skb) = skb->queue_mapping;
>+
> 	if (unlikely(txq >= dev->real_num_tx_queues)) {
> 		do {
> 			txq -= dev->real_num_tx_queues;
>-- 
>1.7.3.4
>

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5)
  2011-06-03 23:31             ` Jay Vosburgh
@ 2011-06-05 21:32               ` David Miller
  0 siblings, 0 replies; 33+ messages in thread
From: David Miller @ 2011-06-05 21:32 UTC (permalink / raw)
  To: fubar; +Cc: nhorman, netdev, andy

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Fri, 03 Jun 2011 16:31:06 -0700

> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
>>The bonding driver is multiqueue enabled, in which each queue represents a slave
>>to enable optional steering of output frames to given slaves against the default
>>output policy.  However, it needs to reset the skb->queue_mapping prior to
>>queuing to the physical device or the physical slave (if it is multiqueue) could
>>wind up transmitting on an unintended tx queue
>>
>>Change Notes:
>>v2) Based on first pass review, updated the patch to restore the origional queue
>>mapping that was found in bond_select_queue, rather than simply resetting to
>>zero.  This preserves the value of queue_mapping when it was set on receive in
>>the forwarding case which is desireable.
>>
>>v3) Fixed spelling an casting error in skb->cb
>>
>>v4) fixed to store raw queue_mapping to avoid double decrement
>>
>>v5) Eric D requested that ->cb access be wrapped in a macro.
> 
> 	Shouldn't the change log go below the "---" so it doesn't end up
> in the git commit log?

Yes, usually, however in this case Neil's change notes give some
important information about why the patch is implemented the way it
is, especially the notes about "v2" so I left them in the commit
message.

> 	In any event, I looked for ways into bond_dev_queue_xmit without
> first passing through bond_select_queue (lest stale cb[] data intrude),
> and I don't see any, so I think this is ok.
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2011-06-05 21:33 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-02 18:03 [PATCH] bonding: reset queue mapping prior to transmission to physical device Neil Horman
2011-06-02 18:35 ` Ben Hutchings
2011-06-02 18:56   ` Neil Horman
2011-06-02 19:09     ` Ben Hutchings
2011-06-02 19:46       ` Neil Horman
2011-06-02 19:52         ` Nicolas de Pesloüan
2011-06-02 20:04         ` David Miller
2011-06-02 20:13           ` Nicolas de Pesloüan
2011-06-02 20:46             ` David Miller
2011-06-02 20:51               ` Ben Hutchings
2011-06-02 21:10                 ` David Miller
2011-06-02 20:13         ` Ben Hutchings
2011-06-03  1:16           ` Neil Horman
2011-06-02 19:59   ` David Miller
2011-06-02 20:07 ` David Miller
2011-06-02 20:22   ` Nicolas de Pesloüan
2011-06-03  1:04   ` Neil Horman
2011-06-03 13:26 ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Neil Horman
2011-06-03 14:43   ` Ben Hutchings
2011-06-03 17:32     ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Neil Horman
2011-06-03 17:59       ` Ben Hutchings
2011-06-03 18:36         ` Neil Horman
2011-06-03 19:12           ` Ben Hutchings
2011-06-03 19:23             ` Neil Horman
2011-06-03 19:24         ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v4) Neil Horman
2011-06-03 19:48           ` Eric Dumazet
2011-06-03 19:57             ` Neil Horman
2011-06-03 20:05               ` Eric Dumazet
2011-06-03 20:35           ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v5) Neil Horman
2011-06-03 23:31             ` Jay Vosburgh
2011-06-05 21:32               ` David Miller
2011-06-03 18:06       ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v3) Jay Vosburgh
2011-06-03 14:59   ` [PATCH] bonding: reset queue mapping prior to transmission to physical device (v2) Vitalii Demianets

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.